I was born and raised in Quito, Ecuador, and moved to Montreal after high school to study at McGill. I stayed in Montreal for the next 10 years: I finished my bachelor's, worked at a flight simulator company, and then eventually obtained my master's and PhD at McGill, focusing on Reinforcement Learning under the supervision of Doina Precup and Prakash Panangaden. After my PhD I did a 10-month postdoc in Paris before moving to Pittsburgh to join Google. I have worked at Google for close to 9 years, and am currently a Staff Research Software Developer in Google Brain in Montreal, focusing on fundamental Reinforcement Learning research and Machine Learning and Creativity, and I am a regular advocate for increasing LatinX representation in the research community. Aside from my interest in coding/AI/math, I am an active musician, I love running (6 marathons so far, including Boston!), and I enjoy discussing politics and activism.
The code for this episode is available here. Loops are such an essential part of programming that I knew I’d have to make an episode on them at some point. A natural musical analogue is musical repeats, so the whole episode came fairly naturally! I thought it’d be fun to have some beats to accompany the piano, so I used SuperCollider for that. That proved to be the most challenging part of the episode, as getting the timing right was really hard.
Pablo Samuel Castro*, Tyler Kastner*, Prakash Panangaden, and Mark Rowland. This blogpost is a summary of our paper; the code is available here. We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents. The following figure gives a nice summary of the empirical gains our new loss provides, yielding an improvement on all of the Dopamine agents.
The NeurIPS 2021 review period is about to begin, and there will likely be lots of complaining about the quality of reviews when they come out (I'm often guilty of this type of complaint). I decided to write a post describing how I approach paper-reviewing, in the hope that it can be useful for others (especially those who are new to reviewing) in writing high-quality reviews. I'm mostly an RL researcher, so many of the tips below come from my experience reading RL papers.