OCCaM: Olin College Crowdsourcing and Machine Learning blogs
http://occam.olin.edu/blog
Blag Weeks 8-11: I Heart Gauss
http://occam.olin.edu/node/29
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>The team gained a greater appreciation for Johann Carl Friedrich Gauss while implementing Gaussian Process Regression with basis functions. GPR was chosen to replace LWPR because of LWPR’s failure to generalize outside of the state space already explored. The inclusion of basis functions in GPR will allow global semi-parametric modeling of system dynamics, with Gaussian Processes modeling the local residuals. We adopted the code and book from <a href="http://www.gaussianprocess.org/gpml/">Gaussian Processes for Machine Learning</a> by Carl Edward Rasmussen and Chris Williams. </p>
<p>Subhash and Mike have been working on the implementation and evaluation of GPR for system identification.</p>
<p>“Definition 2.1 A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.” It is described entirely by a mean and covariance function (GPML p. 13). Gaussian Process Regression is not a new algorithm, and has been used successfully in robotics dynamics modeling. A Gaussian Process can be thought of as defining a distribution over functions, with inference occurring in the function-space rather than the weight-space of standard parametric regression (GPML p. 7). Optimal predictions may be made using Bayesian inference, with hyperparameters tuned to maximize the log marginal likelihood. The theoretical basis for GPR with basis functions is presented in Section 2.7 of GPML. However, the book does not describe the optimization process when basis functions are used, nor does the package include them. Mike did the math to find the derivative of the log marginal likelihood w.r.t. the parameterization of the basis function covariance matrix, and is within striking distance of completing the integration of basis functions into the GPML package. Subhash has implemented GPR with basis functions separately from the package, and has verified that GPR gives an acceptable model for the single pendulum swing-up with generalization outside of the sampled space, which was the needed improvement over LWPR.</p>
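As a point of reference, the core non-parametric prediction can be sketched in a few lines. This is a minimal, dependency-free illustration of the GP posterior mean on made-up 1-D data; the project's actual implementation uses the GPML MATLAB package and adds explicit basis functions on top of this.

```python
import math

def sq_exp_kernel(x1, x2, ell=1.0, sf=1.0):
    """Squared-exponential covariance, the default kernel in GPML."""
    return sf ** 2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)

def solve(A, b):
    """Tiny Gauss-Jordan solver so the sketch stays dependency-free."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_predict(xs, ys, x_star, noise=1e-4):
    """Posterior mean of a zero-mean GP at x_star (mean prediction only)."""
    n = len(xs)
    K = [[sq_exp_kernel(xs[i], xs[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)            # alpha = (K + sigma_n^2 I)^{-1} y
    k_star = [sq_exp_kernel(x, x_star) for x in xs]
    return sum(a * k for a, k in zip(alpha, k_star))
```

With near-zero noise the posterior mean interpolates the training points, e.g. `gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], 1.0)` comes out very close to 1.0.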
<p>The Control Theory aspect of the project was not on the critical path in this stage, so there are no updates there, besides verification that the model learned by ridge regression was sufficient to complete a single pendulum swing up when given to iLQG.</p>
<p>Deniz continued to refine the MATLAB-Python-ROS-Gazebo communication pipeline and made improvements to the experimental pipeline. With procedural generation of ROS files to create simulations, sweeps of the parameter space for system dynamics can now be run, with clear implications for multi-task and lifelong learning. For instance, a double pendulum’s link masses, link lengths, and pivot friction can all be varied. Simulations with SysID can then be run to find the system dynamics for a wide range of underlying parameters. These models can inform a multi-task or lifelong learning system based on the relations between parameter values and the dynamics models returned. </p>
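The sweep logic can be sketched as follows; the parameter values and the XML fragment are hypothetical stand-ins for the real ROS/Gazebo model files the pipeline generates.

```python
from itertools import product

# Hypothetical parameter grid for a double pendulum; the real pipeline
# writes full ROS/Gazebo model files, this only sketches the sweep.
masses = [0.5, 1.0, 2.0]     # link masses (kg)
lengths = [0.3, 0.6]         # link lengths (m)
frictions = [0.0, 0.05]      # pivot damping

def make_model_xml(mass, length, friction):
    """Emit a minimal URDF-like fragment for one parameter setting."""
    return (f'<link mass="{mass}" length="{length}"/>'
            f'<joint damping="{friction}"/>')

# One simulation config per point in the parameter grid
configs = [make_model_xml(m, l, f)
           for m, l, f in product(masses, lengths, frictions)]
```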
<p>The plan for the upcoming semester is to integrate GPR+basis functions with ELLA. SysID and model learning transfer will be the theoretical focus, with support from ROS and possibly physical experiments. </p>
<p>Overall, this summer research team has gained the foundational skills and knowledge and built the software architecture and experimental pipeline needed to make advances in the Lifelong Learning space with applications to Fault Tolerant Control.</p>
</div></div></div>
Thu, 14 Aug 2014 20:23:12 +0000 | mbocamazo | http://occam.olin.edu/node/29#comments

Week 6: Working Memory
http://occam.olin.edu/node/28
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>This week saw a big change of direction for the project. With the higher capacity working memory model completely debugged, I expected the beginning of this week to be data analysis, before moving on to a bigger, better model. But the results we saw made us reconsider our entire approach to the problem of serial recall phenomena.</p>
<p>The working memory content never surpassed two items, even with the theoretical capacity of four. How did this happen? It was due to the probability of spontaneously forgetting any stimulus. Consider the following situation:</p>
<p>At state s, the working memory content consists of [stimulus 2, stimulus 3]. Stimulus 1 is presented, and the agent decides to store it as the third item of working memory. To unpack the situation, we can take a look at the possible outcomes and their respective probabilities.</p>
<p>So what are the possible outcomes? Well, any one of the 6 stimuli might be presented next, and there are three possible working memory states: the agent might not forget anything in which case the working memory would contain [stimulus 2, stimulus 3, stimulus 1], or either of the original memories might be forgotten, in which case the working memory would consist of either [stimulus 3, stimulus 1] or [stimulus 2, stimulus 1]. That’s a total of 18 possible outcomes. But, let’s say the probability of the same stimulus being presented twice in a row is 0. Now, we have 15 possible outcomes. The model assumes that each of these scenarios has an equal probability. In this case, the possible outcomes all have a probability of 1/15. This gives a ⅔ chance of having two items in working memory in the next state.</p>
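The outcome count in this paragraph can be verified with a few lines of Python:

```python
from itertools import product

# Working memory held [2, 3] and stimulus 1 was just stored; any of the
# other 5 stimuli may be presented next (no immediate repeat of stimulus 1).
next_stimuli = [s for s in range(1, 7) if s != 1]
wm_states = [
    (2, 3, 1),   # nothing forgotten
    (3, 1),      # stimulus 2 forgotten
    (2, 1),      # stimulus 3 forgotten
]
outcomes = list(product(next_stimuli, wm_states))       # 5 * 3 = 15 outcomes
two_item = sum(1 for _, wm in outcomes if len(wm) == 2)
probability = two_item / len(outcomes)                  # 10/15 = 2/3
```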
<p>The fact that only two items are ever stored in working memory made it clear that it was time to reevaluate our initial assumptions about memory loss, especially given that the evidence as to whether working memories decay temporally is mixed (Peterson and Peterson, 1959; Oberauer, 2008). It could be that memory loss in such tasks is primarily due to replacement. Additionally, as the model stands, you are more likely to forget a given stimulus the longer it is stored in working memory, because of the cumulative effect of the forgetting probability at each time step. This is not necessarily reflective of the actual properties of working memory. We needed to rethink forgetting probabilities.</p>
<p>Paul and I arranged a meeting to discuss exactly this. Aside from designing forgetting probabilities to generate the primacy and recency effects, there was no clear way to find logical values. Furthermore, serial recall may not be a fruitful area to apply optimal control theory, because the concept of optimizing performance is not meaningful in this context. Success in this task is independent of which memories are retained: as long as the agent fills its working memory, performance will be the same. It makes more sense, on the other hand, that working memory might be optimized for natural human behaviors like language processing. Phenomena like primacy and recency could be byproducts of an optimal solution to a different task. We were left with two options: moving to a dual store model of working memory, or ditching serial recall tasks and using a more behaviorally relevant task as a model.</p>
</div></div></div>
Tue, 22 Jul 2014 22:38:55 +0000 | gabrielle.e | http://occam.olin.edu/node/28#comments

Blag Weeks 6, 7: A Hill Too Steep to Climb
http://occam.olin.edu/node/27
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Sometimes models don’t generalize well. The team’s path met a gradient too great to climb: LWPR could not generalize the local models (built with receptive fields) to broader regions of the state space for use in the controller. In hindsight, this makes sense: LWPR is designed for efficient local function approximation in high-dimensional spaces, not for extrapolation beyond its receptive fields. It performed worse than regularized linear regression on the forward model identification task, with and without additional basis functions. The new plan is to use Gaussian Process Regression for function approximation locally and a more global approach for regions that have not yet been explored. Subhash is currently working on the implementation of GPR.</p>
<p>Ideally, an InfoMax controller would drive the system to regions that have not yet been explored so that the model improves everywhere. If information gain is quantified as a negative cost, as with KL Divergence, exploratory behavior will emerge, indirectly adding a learning component to the controller. However, the benefits of directed information pursuit must be weighed against the normal behavior of the controller, which should drive the state to the most relevant configurations; those configurations may in turn inform the model in the most useful ways. Mike is reading into different information pursuit procedures, and just finished restructuring the initial SysID and example control code for inclusion in the interlab repository.</p>
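As a sketch of what "information gain as a negative cost" might look like, assuming scalar Gaussian beliefs about a model parameter (the measure and weighting here are illustrative, not the project's final choice):

```python
import math

def gaussian_kl(mu_post, var_post, mu_prior, var_prior):
    """KL(posterior || prior) for 1-D Gaussians: a simple way to score how
    much an observation changed the model's belief about a parameter."""
    return (0.5 * math.log(var_prior / var_post)
            + (var_post + (mu_post - mu_prior) ** 2) / (2.0 * var_prior)
            - 0.5)

def augmented_cost(state_cost, info_gain, weight=1.0):
    """Information gain enters the controller's objective as a negative
    cost, encouraging exploration alongside the ordinary state cost."""
    return state_cost - weight * info_gain
```

The `weight` parameter is where the trade-off discussed above lives: it sets how strongly exploration competes with the controller's normal objective.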
<p>ROS and Gazebo communication with Python and MATLAB has been implemented. Scripts repeatedly capture the state variables of the system and save them to a file. Because we do not expect our SysID and Control algorithms to run in real time, at least at first, we’ll have the simulation repeatedly pause and pass the logged data to SysID in MATLAB, which passes the learned model to the controller, which creates an open-loop control signal based on the model. This control signal then drives the system in simulation for the next period. Hopefully, the learned model and controller are accurate enough to achieve the task (in the first case, pendulum swing-up). Integration between the subsystems is now complete; what is left is improving the individual pieces. Deniz is now doing preliminary work on ROS-ARDrone communication and refactoring the communication code for export to the other labs in the Lifelong Learning project. </p>
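The SysID step of this loop can be illustrated on a toy: a scalar linear "simulation" with unknown coefficients, identified by least squares from logged transitions. Everything here (the system, the fitting) is a stand-in for the real Gazebo/MATLAB pipeline, not its actual interfaces.

```python
import random

class ToySim:
    """Stand-in for Gazebo: a scalar linear system x+ = a*x + b*u."""
    def __init__(self, a=0.9, b=0.5):
        self.a, self.b, self.x = a, b, 1.0
    def step(self, u):
        prev = self.x
        self.x = self.a * prev + self.b * u
        return prev, u, self.x          # logged transition (x, u, x+)

def fit_model(log):
    """Least-squares estimate of (a, b) from (x, u, x+) triples,
    solving the 2x2 normal equations in closed form."""
    sxx = sum(x * x for x, _, _ in log)
    sxu = sum(x * u for x, u, _ in log)
    suu = sum(u * u for _, u, _ in log)
    sxy = sum(x * y for x, _, y in log)
    suy = sum(u * y for _, u, y in log)
    det = sxx * suu - sxu * sxu
    return ((sxy * suu - suy * sxu) / det,
            (suy * sxx - sxy * sxu) / det)

rng = random.Random(0)
sim, log = ToySim(), []
for _ in range(50):                      # excite the system, log transitions
    log.append(sim.step(rng.uniform(-1, 1)))
a_hat, b_hat = fit_model(log)            # model handed to the controller
```

With noiseless data the estimates recover the true `a = 0.9`, `b = 0.5` exactly (up to floating point).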
<p>With greater structure in the SysID code, and the integration complete, we should be able to turn around an analysis on the efficacy of GPR much faster than before.</p>
</div></div></div>
Tue, 22 Jul 2014 18:37:12 +0000 | mbocamazo | http://occam.olin.edu/node/27#comments

Working Memory: Week 5
http://occam.olin.edu/node/26
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>I got a head start on this week by finishing up a draft of the code for a model with a higher capacity working memory as well as the ability to forget. </p>
<p>Paul and I met for a brainstorming session on Monday, and determined that our primary goal is to create models which replicate human behaviors on serial recall tasks. In a serial recall task, the subject is read a list of words and asked to recall them in the correct order. The test reveals many interesting phenomena, including the primacy and recency effects, wherein subjects tend to remember the beginning and end of a list, but not the middle. </p>
<p>Additionally of interest is the Hebb effect, which appears during a task wherein the subject is read aloud lists of numbers and asked to recall each list after it is read. If every third list is the same, the subject’s performance on this list will gradually improve, even if he/she is unaware that this list is being repeated. </p>
<p>We decided on the initial goal of replicating primacy and recency effects in serial recall tasks. </p>
<p>Before I could get to work on debugging my recently drafted code and adjusting it to model serial recall tasks, I had to attend to something more pressing: a presentation! I prepared a presentation for the summer research talks and practiced at our weekly lab meeting.</p>
<p>Later in the week, I worked on debugging the transition probabilities for forgetting. Next week, I plan to use my model with a higher capacity working memory and probabilities of forgetting to generate graphs of how recently the stimulus was presented versus the probability of getting the answer correct.</p>
<p>Cheers!</p>
</div></div></div>
Tue, 08 Jul 2014 21:20:57 +0000 | gabrielle.e | http://occam.olin.edu/node/26#comments

Blag Weeks 4, 5: The Implementation Battle Progresses
http://occam.olin.edu/node/25
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>The Lifelong Learning team is closing in on the SysID-Control-Simulation problem.</p>
<p>Research into the literature for control theory and system identification included Locally Weighted Projection Regression, Sparse Online Gaussian Processes, variants of Partial Least Squares such as Direct Kernel PLS, and iLQG again. For our planned experiment with fault tolerant control, it is extremely important that the learning be incremental and converge quickly to a model usable by a controller, so LWPR and SOGP were of interest. However, these more complicated algorithms ought to have been reserved until a basic test was implemented, such as regression on the state variables + control signal with the kernel trick and trigonometric features (completed at the beginning of Week 6). Swing-up of a single-actuated double pendulum through iLQG was completed at the end of Week 5.</p>
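That baseline can be sketched as ridge regression on a trigonometric feature expansion of the state and control; the feature set and regularization strength below are illustrative, not the exact ones used.

```python
import math

def pendulum_features(theta, omega, u):
    """State + control + trigonometric expansion for the forward model."""
    return [theta, omega, u, math.sin(theta), math.cos(theta), 1.0]

def ridge_fit(X, y, lam=1e-8):
    """Ridge regression via the normal equations (X^T X + lam*I) w = X^T y,
    solved with Gauss-Jordan elimination to stay dependency-free."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(d):
        piv = max(range(c, d), key=lambda rr: abs(M[rr][c]))
        M[c], M[piv] = M[piv], M[c]
        for rr in range(d):
            if rr != c:
                f = M[rr][c] / M[c][c]
                M[rr] = [v - f * w for v, w in zip(M[rr], M[c])]
    return [M[i][d] / M[i][i] for i in range(d)]
```

Given transitions from a simulated pendulum, the learned weights then predict the next state from the current features.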
<p>Going into Week 6, the integration of the three subtasks needs to be completed as soon as possible. Hopefully, through collaboration with other labs, any issues with ROS and Gazebo can be resolved. </p>
<p>The planned experiment is expected to be highly informative, and demonstrate the failure points of our approach. We should then be able to plan or at least sketch out 2-3 subprojects for the remainder of the summer to attempt the integration of Lifelong Learning and Fault-Tolerant Control.</p>
</div></div></div>
Tue, 08 Jul 2014 13:24:51 +0000 | mbocamazo | http://occam.olin.edu/node/25#comments

Working Memory: Week 3
http://occam.olin.edu/node/24
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>This past week I worked on my implementation of Q-learning, a method of reinforcement learning which converges quickly and does not depend on a model of the environment. I plan to use Q-learning to model working memory, where the agent learns whether to replace or retain the contents of working memory based on past experience. </p>
<p>On Tuesday, I presented on the concept of Q-learning during lab meeting and discussed ways in which Q-learning might help with modeling of working memory. One possible drawback of this application is the fact that Q-learning has no initial assumptions about a system, whereas it seems reasonable that working memory might employ certain assumptions about the environment. Since we have chosen to model the learning process, this might cause some discrepancy between the behavior of our model and the behavior of working memory. </p>
<p>Before I begin working on the working memory model, I want to ensure that my implementation is working properly and develop my intuition for Q-learning. I finished implementing a more generalized Q-learning algorithm and applying it to a random walk Markov decision process to confirm the algorithm's accuracy and build my intuition. For an example of a random walk process, take a look at figure 6.5 in this <a href="http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node62.html">link</a>. In my program, there are 5 states, of which either end state (states 1 and 5) is terminal. The rightmost state (state 5) has a reward of 1; all other states have a reward of 0. The agent begins each episode at state 3, and moves either left (action 1) or right (action 2) until reaching a terminal state. With each action taken, the Q-value for the state, action pair is re-evaluated according to the following formula:</p>
<p>Q(s,a) = Q(s,a) + α[r + γ Q(s',a'*) - Q(s,a)]</p>
<p>where a'* represents the best possible action at the next state, s', α represents a learning rate, γ represents a discount factor on the future reward, r represents the immediate reward upon reaching the next state, and Q(s, a) is the estimated value of a state action pair. Several episodes are performed and the algorithm repeatedly updates Q(s,a) values.</p>
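A minimal tabular implementation of this update on the 5-state random walk (the learning rate, episode count, and seed are arbitrary choices):

```python
import random

def q_learning_random_walk(episodes=5000, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on the 5-state random walk described above.

    States 1 and 5 are terminal; only reaching state 5 pays reward 1.
    Actions: 1 = left, 2 = right. The behavior policy is uniform random,
    which is fine because Q-learning is off-policy."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(1, 6) for a in (1, 2)}
    for _ in range(episodes):
        s = 3
        while s not in (1, 5):
            a = rng.choice((1, 2))
            s2 = s - 1 if a == 1 else s + 1
            r = 1.0 if s2 == 5 else 0.0
            best_next = 0.0 if s2 in (1, 5) else max(Q[(s2, 1)], Q[(s2, 2)])
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

With these settings the learned values converge toward Q(state 4, action 2) = 1 and Q(state 3, action 2) = gamma = 0.9, consistent with the analysis below.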
<p>Given this formula and a high gamma value, we can expect that taking action 1 (left) from state 2 would guarantee zero future reward, and hence Q(state 2, action 1) would equal zero. Similarly, we can expect that Q(state 4, action 2) would equal 1, because taking action 2 in state 4 guarantees a reward of 1. </p>
<p>After much debugging, my algorithm outputs results that look reasonable:</p>
<p>('s_1', 'a_1') 0<br />
('s_3', 'a_1') 0.727317242186<br />
('s_1', 'a_2') 0<br />
('s_4', 'a_2') 1.0<br />
('s_2', 'a_1') 0.0<br />
('s_4', 'a_1') 0.809658069053<br />
('s_3', 'a_2') 0.9<br />
('s_5', 'a_1') 0<br />
('s_5', 'a_2') 0<br />
('s_2', 'a_2') 0.809523722886</p>
<p>Once the Q-learning algorithm was confirmed successful, I began to work on the first iteration of the working memory (WM) model. </p>
<p>This model is based on the experiments of Collins and Frank (2012), in which a subject is presented with a series of stimuli (pictures) and asked to respond by pressing one of three keys for each picture. Upon their response, a tone indicates whether they are correct or incorrect. Stimuli are repeated so that the subject can learn the proper responses. </p>
<p>In our first model, we consider a system with 2 stimuli and a working memory capacity of 1. Alpha represents the probability of the presented stimulus switching between trials. A state is composed of the stimulus presented and the stimulus in memory. At each time step, the agent selects between 2 actions: to maintain the current contents of working memory, or to replace the contents with the recently presented stimulus. The agent will learn an optimal policy depending on the value of alpha.</p>
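A sketch of that state space and transition model (the value of alpha here is arbitrary):

```python
from itertools import product

ALPHA = 0.3                      # probability the presented stimulus switches
STIMULI = (0, 1)
STATES = list(product(STIMULI, STIMULI))   # (presented, in-memory) pairs

def transitions(state, action):
    """Next-state distribution. Action 0 maintains working memory, action 1
    replaces it with the presented stimulus; the next presented stimulus
    then switches with probability ALPHA."""
    presented, memory = state
    new_memory = presented if action == 1 else memory
    return {(1 - presented, new_memory): ALPHA,
            (presented, new_memory): 1.0 - ALPHA}
```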
<p>On Friday I met with Paul and team Lifelong Learning-- Mike, Deniz, and Subhash. Subhash presented on System Identification, a field which examines techniques for building models from interaction with an environment. System Identification is particularly useful when dealing with time variant systems, that is, systems whose output is explicitly dependent on time. I speculate that System Identification might be an interesting way to approach the question of working memory's versatility. Perhaps implementation of system identification techniques might explain how working memory is able to perform diverse tasks like reading comprehension, mental math, and list memorization.</p>
<p>Next week I hope to finish this first iteration, and create a second iteration with a higher memory capacity and an ability to forget encoded into the transition probabilities. I plan to analyze the output of my models and produce graphs in an IPython notebook. Until then!</p>
</div></div></div>
Tue, 24 Jun 2014 20:16:31 +0000 | gabrielle.e | http://occam.olin.edu/node/24#comments

Blag Week 3: Attacking the Problem
http://occam.olin.edu/node/23
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded">Week 3 saw the first real combat with the software and control problems, and broader computer lab remodeling work. Ten shiny new monitors arrived, and much fun was had setting them up.
<img src="http://occam.olin.edu/sites/default/files/BLAG3PIC1.jpg" height="360" width="640" align="middle">
On the control theory front, Yuval Tassa’s implementation of <a href="http://www.cs.washington.edu/people/postdocs/tassa/code/">iLQG</a> was chosen over several others for the sake of readability and modularity within the code. Implementation of the entire iLQG algorithm was judged to be a poor use of time, given the quality of existing code. The algorithm is sufficiently robust to accommodate a variety of non-linear control problems. Pendulum swing-up, <a href="http://en.wikipedia.org/wiki/Inverted_pendulum#Inverted_pendulum_on_a_cart">the cart-pole problem</a>, and biomechanical control have been used as proofs-of-concept within simulation, while we hope to apply it to robotic control. Even in these relatively simple problems, Mike has found interesting challenges in defining appropriate cost functions, and seemingly minor details of the system dynamics have proven important.
In terms of simulation, Deniz made progress in thoroughly understanding the interaction between the ROS, Gazebo, and Rviz systems. Working through the tutorials on each package’s wiki pages built a more fundamental base of knowledge. Along the way, a few problems arose, including a lack of keyboard teleoperation for the quadcopter stack and broken packages. Originally the plan was to use the <a href="http://wiki.ros.org/hector_quadrotor">hector quadcopter stack</a> and its included teleop package to fly the craft. However, there was a distinct lack of keyboard controls for the ROS hydro distro, which led to an in-depth tutorial on how to create custom launch files to control the quadcopter. While attempting the demo simulations in Gazebo, a broken/unmet dependency was found, which hindered efforts to test out the progress made during the week. This was eventually fixed, and progress continued on writing a document to walk the other members of the research team through setting up the simulator.
Subhash gave a presentation on <a href="http://www.is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/2011/Cognitive-Science-2011-ModelLearning.pdf">System Identification</a> (SysID) during a lab meeting. In our conversation, we delved into topics such as the process behind modeling linear and nonlinear systems, using blackbox and greybox methods, various state-action prediction models (forward, inverse, mixed, and multi-step), and applying learning architectures to the developed models. With some more insight into SysID for autonomous robotic systems, Subhash believes ELLA can be extended to learn dynamical models in changing environments.
Gabrielle, another researcher in Paul’s lab who is working on models of computational memory, joined us for our discussion of control theory. We decided to look at locally weighted projection regression as a next step in SysID. These threads should come together in our medium-term goal of integrating ROS, SysID, and non-linear control into a functioning robotic control simulation. We can now drive hard against this well-defined goal and limit the literature review to only what is necessary.
</div></div></div>
Mon, 23 Jun 2014 19:24:40 +0000 | dcelik | http://occam.olin.edu/node/23#comments

Blag Week 2: Define the Machine
http://occam.olin.edu/node/20
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>“I was born not knowing and have had only a little time to change that here and there.” - Richard Feynman</p>
<p>Having been inspired by Paul Ruvolo’s work with “<a href="//escholarship.org/uc/item/5v87s6t1#page-64">Diego San</a>”, the three feckless “researchers” set down the path of control theory and trajectory optimization, with an eventual goal of extending the lifelong learning framework to <a href="//en.wikipedia.org/wiki/Fault_tolerance">fault-tolerant control</a>.</p>
<p>Research collaborators at <a href="http://www.eecs.wsu.edu/~taylorm/">Washington State University</a> and the <a href="http://www.seas.upenn.edu/~eeaton/">University of Pennsylvania</a> have expressed interest in working with robotic systems, Turtlebots in particular. Shared learning among agents, homogeneous or heterogeneous, within dynamic environments would put ELLA to the test in a way that could build the experimental foundation for the validity of the lifelong learning approach.</p>
<p>It may be useful to sketch a hypothetical experiment: several Turtlebots have search and navigation tasks, and the controller has parameters to learn so that the robot adapts its search patterns to the particularities of its environment. There may be dense indoor environments (possibly rooms with a lot of furniture), sparse indoor environments, indoor environments with many subpaths (hallways), open outdoor or indoor environments, and so on. The controller could encode explored space as a reward itself, separate from finding the actual objective. The parameters of the controller could encode the decision process for exploring a path. To incorporate something like ELLA or PG-ELLA, there would be a finite number of parameters corresponding to the dimensions of the shared knowledge basis. A task could be an entire search mission, or a single decision between branching paths. The new state, possibly up until the next fork, would quantify the information gained from searching the space. This could then be a labeled data point, if there were some relative metric for gained information. With many of these labeled points (which become tasks), the shared knowledge basis is updated, and thus trained. Different environments might have different realized values after making a decision, but could still have an underlying structure as previously discussed, making this a good fit for ELLA’s lifelong learning problem formulation. Obviously, this is only a sketch, but it can give the reader an idea of how an experiment might be designed. </p>
<p>On the control theory side, the team walked through an implementation of a <a href="http://www.mathworks.com/help/control/ref/lqr.html">Linear-Quadratic Regulator</a>, and discussed some theory for optimality of the solution. Other topics included the <a href="http://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf">Kalman filter</a>, the <a href="http://www.cs.ubc.ca/~mitchell/Class/CS532M.2007W2/Talks/hjbSham.pdf">HJB</a> and <a href="http://www.sosmath.com/diffeq/first/riccati/riccati.html">Riccati equations</a>, <a href="http://en.wikipedia.org/wiki/Pontryagin's_minimum_principle">Pontryagin’s max/min principle</a>, the <a href="http://en.wikipedia.org/wiki/Hamiltonian_(control_theory)">control Hamiltonian</a>, and many others. The team also talked about <a href="http://arxiv.org/pdf/1311.2838.pdf">PAC bounds</a> and encoding hyperbias through the choice of hypothesis spaces and families last week, in an effort to gain a greater ML foundation. </p>
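For the scalar system x+ = a·x + b·u with stage cost q·x² + r·u², the LQR solution can be computed by iterating the discrete Riccati recursion to a fixed point. This is a toy sketch, not the team's implementation:

```python
def scalar_lqr(a, b, q, r, iters=200):
    """Infinite-horizon discrete-time LQR for x+ = a*x + b*u with stage
    cost q*x^2 + r*u^2, via fixed-point iteration of the Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)   # optimal feedback: u = -k * x
    return p, k
```

For a = b = q = r = 1 the fixed point satisfies p² = 1 + p, so p is the golden ratio (1 + √5)/2 and the gain is k = 1/p.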
<p>Fault-tolerant control work could include taking partial system failures as learning tasks, and attempting to learn in real time how to refine a controller to compensate for a system fault, such as the failure or loss of a quadcopter’s rotor during flight, leading to new system physics and, by extension, new control laws. After deciding that this was a sufficiently amazing application, we set this as one of the main potential goals. To cover more area and speed up the approach to the problem now that we have defined the machine, Mike is reading more about <a href="http://homes.cs.washington.edu/~todorov/courses/amath579/Todorov_ACC05.pdf">iterative Linear Quadratic Gaussian control</a> and <a href="http://www.machinelearning.org/archive/icml2009/papers/271.pdf">trajectory optimization with approximate inference</a>, Subhash is reading about <a href="http://www.control.isy.liu.se/research/reports/2007/2809.pdf">system identification theory</a>, and Deniz is setting up the necessary hardware and <a href="http://www.ros.org/">ROS</a> simulation software with <a href="http://wiki.ros.org/gazebo">Gazebo</a>.</p>
<p>Next steps are to implement more control algorithms (especially with stochastic systems), do control simulations, and design simple system identification tasks.</p>
</div></div></div>
Fri, 13 Jun 2014 22:01:28 +0000 | dcelik | http://occam.olin.edu/node/20#comments

Eye-Helper Day 1 (06/02/2014)
http://occam.olin.edu/node/19
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Greetings from team eye-helper! </p>
<p>Before we tell you about what we did on our first day of summer research, here's a brief introduction to our project...</p>
<p>Our current goal is to help the blind/visually impaired shop without needing help from in-store personnel. We have been informed that this is a problem area in the blind/visually impaired community, and current solutions to the problem include barcode scanners and emailing a grocery list to the store beforehand. Part of our hope is that this technology could allow blind users to browse and shop at the grocery store at any time, without relying on an employee or needing to plan ahead. However, we have heard multiple opinions from the people we have spoken to, and it remains unclear which features of this technology would be most appreciated. For example, should this technology focus more on navigation and obstacle avoidance, should it revolve around identifying specific grocery items on the shelf in front of the user, or something else? We also don't know enough about the current technologies (and their pros/cons, as perceived by people who use them on a daily basis) to design good interfaces for this technology. We hope to talk to members of the blind/visually impaired community to both learn more about their daily lives and gain insight on how to make our technology as impactful as possible. </p>
<p>By the way, we're open source! Our code can be found at... </p>
<p>Today we started working on campus - we'll be continuing the work that Emily and Cypress started last semester (which was mostly focused on learning device communications with android/node platforms. yay sockets!). </p>
<p>In regards to prototypes, we have an Android/Google Glass application that can capture images and stream them to a webapp (eye-helper.com). We can also communicate with the smartphone by typing into the chatbox on the webapp. This is done through sockets and Google's TextToSpeech API. Screencaps of this can be seen below. At the moment this just has the device communications set up; object tracking and crowdsourcing have yet to be implemented. </p>
<p>One of our major tasks this week is to learn more about our lovely users! Our project is aimed at the blind/visually impaired, but this needs to be narrowed down a bit. For example, should we design for the tech-savvy blind community? Are there different subsets of our user group that we need to consider? We have a bunch of open-ended questions that we hope to discuss with our users through phone calls and in-person visits before making design decisions about our interfaces. We had our first phone call with a user today - we’ll talk about it in a few paragraphs. </p>
<p>As with any team project, we had to get ourselves organized. We've chosen to create a rolling to-do list and assign people to tasks as we go, so everyone can be heavily involved with each "subsystem" of the project. (To clarify, the subsystems were the aspects mentioned in the description above - in short, the user research, crowdsourcing interface, and computer vision aspects.) </p>
<p>talk about our phone call with other paul... (use pseudonym?)</p>
<p>After having our first user conversation of the summer, we created a people portrait to visualize our notes (it includes quotes, general info, their first impressions/thoughts about eye-helper so far, and our observations/comments about the experience). We will probably be creating similar posters for all of the other users we meet in the near future. This conversation was quite an exciting update - we realized that we definitely need more context on the different subgroups in the blind/visually impaired community and that everyone has their unique traits and lifestyle habits that may or may not make some of the current features/concepts of eye-helper a moot point. </p>
<p>Looks like that’s it for today - stay tuned for daily updates from the research team!</p>
<p>--Emily</p>
</div></div></div>Mon, 09 Jun 2014 21:24:29 +0000 lvanderlyn19 at http://occam.olin.edu http://occam.olin.edu/node/19#comments
Blahg Week 1: Enter the Machine....(Learning)
http://occam.olin.edu/node/18
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>“If learning is the answer, what is the question?” - Yoav Shoham, Kevin Leyton-Brown, <a href="http://www.masfoundations.org/mas.pdf">Multiagent Systems</a>. </p>
<p><a href="http://occam.olin.edu/node/2">Paul Ruvolo</a> has brought aboard three sophomoric “researchers” in Michael Bocamazo, Subhash Gubba, and Deniz Celik to help him with the “<a href="http://jmlr.org/proceedings/papers/v28/ruvolo13.pdf">Efficient Lifelong Learning Algorithm</a>”, ELLA. Lifelong learning seeks to extend the multi-task learning framework to the online setting, in which new data is received sequentially and learning is transferred both forward to new tasks and backward to refine models of previous tasks. </p>
<p>This week our group has read over many relevant papers, discussed the algorithm behind ELLA in depth with Paul Ruvolo, and found many interesting datasets to test ELLA’s merit. For example, we found the <a href="http://archive.ics.uci.edu/ml/">UCI Machine Learning Repository</a> very useful for finding datasets with many features and tasks. We spent the majority of our time reading papers that cited ELLA and papers within the lifelong learning space in general.</p>
<p>A major research goal is to find compelling problems that will push the boundaries of the current algorithm and further knowledge in the lifelong learning field. As the week comes to a close, we are left with the question of which problems ELLA is best suited to solve, or better, uniquely suited to solve. ELLA uses a ‘shared knowledge basis’ to encode weights learned on different tasks, in an attempt to find underlying ‘basis tasks’ which can be linearly combined to evaluate new tasks. These basis tasks are vectors (of length d, the number of dimensions or attributes of each element) forming the columns of the shared knowledge matrix. To build confidence in the robustness of ELLA, we need datasets with high dimensionality, potential for high k (the number of underlying basis vectors in the shared knowledge matrix), and many tasks. </p>
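<p>The linear-combination structure above can be sketched in a few lines of NumPy. This is only a toy illustration of the representation, not ELLA’s actual optimization procedure; the sizes and the sparse codes here are made up for demonstration.</p>

```python
import numpy as np

# Toy illustration: each task's weight vector is a sparse linear
# combination of the columns ("basis tasks") of a shared basis L.
d, k, n_tasks = 8, 3, 5   # feature dimension, basis size, number of tasks
rng = np.random.default_rng(0)

L = rng.standard_normal((d, k))   # shared knowledge basis, d x k
S = np.zeros((k, n_tasks))        # task-specific codes, sparse in ELLA
S[rng.integers(0, k, n_tasks), np.arange(n_tasks)] = 1.0  # one basis task each, for simplicity

W = L @ S   # column t is task t's weight vector: w_t = L @ s_t
assert W.shape == (d, n_tasks)

# Linear prediction for an input x on task t uses only that column:
x = rng.standard_normal(d)
t = 2
y_hat = x @ W[:, t]
```

<p>The appeal of this structure is that knowledge transfers in both directions: a new task only needs to learn a small code s_t over the existing basis, and refining L improves the models of all previous tasks at once.</p>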
<p>Many of the datasets we found already have features extracted, in which case applying ELLA is a much easier task than identifying and extracting our own features. We plan on having at least one dataset that we analyze ourselves along with multiple datasets that have been pre-extracted. As of now, we are focusing on computer vision for our own feature extraction and may branch out into domains such as education research, multiagent systems, audio processing, and natural language processing. Experimentation with <a href="http://www.turtlebot.com/">Turtlebots</a> and <a href="http://en.wikipedia.org/wiki/Quadcopter">Quadcopters</a> is also a possibility, in which case we would work on control applications of ELLA, introduced in <a href="http://www.seas.upenn.edu/~eeaton/papers/BouAmmar2014Online.pdf">PG-ELLA</a>. Reinforcement learning is the basis for these applications.</p>
<p>During a conference call with Paul Ruvolo’s collaborators, we discussed coordination with other researchers and students working on ELLA, so these blog posts may also discuss work being done within the whole project team.</p>
</div></div></div>Mon, 09 Jun 2014 21:22:06 +0000 dcelik18 at http://occam.olin.edu http://occam.olin.edu/node/18#comments