Episode 5: Repeats & Loops
The code for this episode is available here. Loops are such an essential part of programming that I knew I’d have to make an episode on them at some point. The natural musical analogue is the repeat, so the whole episode came together fairly easily! I thought it’d be fun to have some beats accompanying the piano, so I used SuperCollider for that. That proved to be the most challenging part of the episode, as getting the timing right was really hard.
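To make the analogy concrete, here is a minimal sketch (my own toy example, not the episode’s actual code, which is linked above) of how a repeated section in a score is just a loop; the `play` function is a hypothetical stand-in for actually sounding the notes:

```python
# A repeated section in a score is just a for-loop over the same phrase.
phrase = ["C4", "E4", "G4", "E4"]  # a simple arpeggiated phrase

def play(note):
    # Hypothetical stand-in for sending the note to a synth (e.g., via MIDI or OSC).
    print(f"playing {note}")

# Play the phrase twice, like a section between repeat signs.
for _ in range(2):
    for note in phrase:
        play(note)
```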
Tips for Reviewing Research Papers
The NeurIPS 2021 review period is about to begin, and there will likely be lots of complaining about the quality of reviews when they come out (I’m often guilty of this type of complaint). I decided to write a post describing how I approach paper-reviewing, in the hope that it can be useful to others (especially those who are new to reviewing) in writing high-quality reviews. I’m mostly an RL researcher, so many of the tips below come from my experience reading RL papers.
Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
We argue for the value of small- to mid-scale environments in deep RL for increasing scientific insight and helping to make our community more inclusive.
Johan S. Obando-Ceron and Pablo Samuel Castro
This is a summary of our paper, which was accepted at the Thirty-eighth International Conference on Machine Learning (ICML'21). (An initial version was presented at the deep reinforcement learning workshop at NeurIPS 2020.) The code is available here. You can see the Deep RL talk here.
Episode 4: Live Coding & Jazz
The code for this episode is available here. I had a different idea for the fourth episode, but then I saw John McLaughlin’s tweet about International Jazz Day and decided to do something for that instead. Obviously I’d cover jazz in the musical section, but it wasn’t yet clear which aspect of jazz I’d focus on. I spoke to a few people, and it seemed like a good idea to talk about improvisation and how jazz musicians do it; in particular, I’m hoping this helps people who don’t “get” jazz understand what we’re doing when we play it, and that we’re not just playing random notes!
Music during COVID-19
It’s been just over a year since COVID-19 forced us all to stay home. It’s been a terribly difficult year for so many around the globe, and I don’t mean this post to minimize the plight of others, but rather simply to highlight one of the silver linings of my past year. One of the things I miss most from pre-COVID days is playing live. I had to cancel a show with my jazz trio when things started getting locked down.
Episode 3: Leitmotifs & Variables
The code for this episode is available here. I had it in my head that the third episode would cover variables in the Computer Science section. Originally I thought the musical topic would be chords, but that didn’t quite fit with variables. Then I thought about key signatures, since these are kind of like variables in the sense that you can shift a song to different pitches just by changing the key signature; but again, I wasn’t very content with the connection.
Episode 2: Bits & Semitones
The code for this episode is available here. The idea of doing something with bits seemed kind of natural to me for a second episode. After covering what “computation” is, why not cover what computers actually “see” when they run computations? Given that bits make up everything inside a computer’s software, I wanted a musical topic that is present in every type of music (at least in Western music).
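As a toy illustration of the analogy (my own sketch, not the episode’s code): just as every value in software ultimately reduces to bits, every pitch in Western music reduces to a count of semitones from a reference note, and that count is itself just bits:

```python
A4_MIDI = 69  # MIDI convention: A4 corresponds to note number 69

def note_number(semitones_above_a4):
    """MIDI note number for a pitch given as semitones above A4."""
    return A4_MIDI + semitones_above_a4

c5 = note_number(3)            # C5 is 3 semitones above A4 -> 72
print(c5, format(c5, "08b"))   # 72 is, underneath, just the bits 01001000
```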
Metrics and continuity in reinforcement learning
In this work we investigate the notion of “state similarity” in Markov decision processes. This concept is central to generalization in RL with function approximation. Our paper was published at AAAI'21.
Charline Le Lan, Marc G. Bellemare, and Pablo Samuel Castro
The text below was adapted from Charline’s twitter thread.
In RL, we often deal with systems with large state spaces. We can’t exactly represent the value of each of these states and need some type of generalization.
Episode 1: Musical Notes & Computation
The code for this episode is available here. I originally thought this channel would be a kind of educational channel, where people could learn about both music and computer science in a fun and informal way. I tweeted asking for suggestions for what to cover first on the CS side, and Kory Mathewson’s response was my favourite. On the music side, it was more of a train-of-thought process: when I thought about the first thing you might learn in music theory, musical notes themselves came to mind.
A musical ode to musical code. Subscribe to the YouTube channel! Each episode will explore a topic in Computer Science and a topic in Music, and combine them in creative ways. You can find the code I use for each episode here!
The story
I decided to start this show because, thanks to COVID-19, I was no longer performing live with my jazz trio, but I was aching for some type of performative output.
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
This paper was accepted as a spotlight at ICLR'21. We propose a new metric and contrastive loss, accompanied by theoretical and empirical results.
Policy Similarity Metric
We introduce the policy similarity metric (PSM), which is based on bisimulation metrics. In contrast to bisimulation metrics (which are built on reward differences), PSMs are built on differences in optimal policies. If we use this metric for policy transfer (as Doina Precup & I explored previously), we can upper-bound the difference between the optimal and the transferred policies.
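Roughly, and in my own notation rather than the paper’s (so treat this as a sketch; see the paper for the precise definitions), the contrast is that the PSM recursion swaps the bisimulation metric’s reward-difference term for a difference between optimal policies. Here $W_1(d)$ denotes the Wasserstein-1 distance under the metric $d$, and $\mathrm{DIST}$ is a distance between the policies’ action distributions:

```latex
% Sketch in my own notation; see the paper for the exact definitions.
\begin{align*}
  d_{\mathrm{bisim}}(x, y) &= \max_a \Big[\, \lvert R(x,a) - R(y,a) \rvert
      + \gamma\, W_1\!\big(d_{\mathrm{bisim}}\big)\big(P(\cdot \mid x, a),\, P(\cdot \mid y, a)\big) \Big] \\
  d_{\mathrm{PSM}}(x, y) &= \mathrm{DIST}\big(\pi^*(x),\, \pi^*(y)\big)
      + \gamma\, W_1\!\big(d_{\mathrm{PSM}}\big)\big(P^{\pi^*}(\cdot \mid x),\, P^{\pi^*}(\cdot \mid y)\big)
\end{align*}
```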