psc's website
    Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

    This paper was accepted as a spotlight at ICLR'21. We propose a new metric and an accompanying contrastive loss, both of which come equipped with theoretical and empirical results. Policy Similarity Metric: We introduce the policy similarity metric (PSM), which is based on bisimulation metrics. In contrast to bisimulation metrics, which are built on reward differences, PSMs are built on differences in optimal policies. If we use this metric for policy transfer (as Doina Precup & I explored previously), we can upper-bound the difference between the optimal and the transferred policy.
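    As a rough sketch of the construction (my paraphrase; see the paper for the precise definitions), the PSM keeps the recursive form of a bisimulation metric but swaps the reward-difference term for a distance between the optimal policy's action distributions:

        d_{\sim}(x, y) = \max_a \big[ |R(x, a) - R(y, a)| + \gamma \, W_1(d_{\sim})(P(\cdot \mid x, a), P(\cdot \mid y, a)) \big]    (bisimulation)

        d_{\pi^*}(x, y) = \mathrm{DIST}(\pi^*(x), \pi^*(y)) + \gamma \, W_1(d_{\pi^*})(P^{\pi^*}(\cdot \mid x), P^{\pi^*}(\cdot \mid y))    (PSM)

    Here DIST is a probability pseudometric between action distributions and W_1(d) is the 1-Wasserstein distance under the metric d; roughly speaking, the contrastive loss then learns an embedding whose geometry approximates d_{\pi^*}.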

    January 14, 2021
    2020 RL highlights

    As part of TWiML's AI Rewind series, I was asked to provide a list of reinforcement learning papers that were highlights for me in 2020. It's been a difficult year for pretty much everyone, but it's heartening to see that, despite all the difficulties, interesting research still came out. Given the size and breadth of reinforcement learning research, as well as the fact that I was asked to do this at the end of NeurIPS and right before my vacation, I decided to apply a few rules in making the selection.

    December 16, 2020
    Autonomous navigation of stratospheric balloons using reinforcement learning

    In this work we, quite literally, take reinforcement learning to new heights! Specifically, we use deep reinforcement learning to help control the navigation of stratospheric balloons, whose purpose is to deliver internet to areas with low connectivity. This project is an ongoing collaboration with Loon. It’s been incredibly rewarding to see reinforcement learning deployed successfully in a real setting. It’s also been terrific to work alongside such fantastic co-authors: Marc G.

    December 2, 2020
    Agence: a dynamic film exploring multi-agent systems and human agency

    Agence is a dynamic and interactive film authored by three parties: 1) the director, who establishes the narrative structure and environment, 2) intelligent agents, driven either by reinforcement learning or by scripted AI (hierarchical state machines), and 3) the viewer, who can interact with the system to affect the simulation. We trained RL agents in a multi-agent fashion to control some (or all, based on user choice) of the agents in the film. You can download the game at the Agence website.

    December 1, 2020
    GANterpretations

    GANterpretations is an idea I published in this paper, which was accepted to the 4th Workshop on Machine Learning for Creativity and Design at NeurIPS 2020. The code is available here. At a high level, it uses the spectrogram of a piece of audio (from a video, for example) to “draw” a path in the latent space of a BigGAN. A video in the full post walks through the process.
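    As a minimal sketch of the idea (the file name, step sizes, and smoothing are my own assumptions, not the released code), the audio's frame-by-frame energy can drive the step size of a smoothed random walk through the latent space:

        import numpy as np
        import librosa  # assumed audio library; any STFT implementation works

        # per-frame energy from the spectrogram, normalized to [0, 1]
        audio, sr = librosa.load("clip.wav", sr=16000)   # hypothetical input file
        spec = np.abs(librosa.stft(audio, n_fft=512, hop_length=256))
        energy = spec.sum(axis=0)
        energy /= energy.max() + 1e-8

        rng = np.random.default_rng(0)
        z = rng.standard_normal(128)          # BigGAN's latent dimension is 128
        direction = rng.standard_normal(128)
        path = [z.copy()]
        for e in energy:
            # smooth the walk's direction, and let louder frames take bigger steps
            direction = 0.9 * direction + 0.1 * rng.standard_normal(128)
            z = z + (0.1 * e) * direction / np.linalg.norm(direction)
            path.append(z.copy())
        # each z in `path` is then decoded by BigGAN (with a fixed class vector) into a frame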

    November 8, 2020
    Introduction to reinforcement learning

    This post is based on this colab. You can also watch a video where I go through the basics here. There is also a video (in Spanish) where I present the material here. Introduction: Reinforcement learning methods are used for sequential decision making in uncertain environments. Reinforcement learning is typically framed as an agent (the learner) interacting with an environment that provides the agent with reinforcement (positive or negative) based on the agent's decisions.
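    To make the agent-environment loop concrete, here is a minimal sketch (a toy two-state environment of my own invention, not the colab's code) of an agent learning from positive and negative reinforcement via Q-learning:

        import random

        def env_step(state, action):
            # toy dynamics: action 1 in state 1 is rewarded, everything else is not
            reward = 1.0 if (state == 1 and action == 1) else 0.0
            return random.randint(0, 1), reward

        Q = [[0.0, 0.0], [0.0, 0.0]]           # value estimates for 2 states x 2 actions
        alpha, gamma, epsilon = 0.1, 0.9, 0.1  # step size, discount, exploration rate
        state = 0
        for _ in range(10_000):
            # epsilon-greedy: mostly exploit current estimates, sometimes explore
            if random.random() < epsilon:
                action = random.randint(0, 1)
            else:
                action = 0 if Q[state][0] >= Q[state][1] else 1
            next_state, reward = env_step(state, action)
            # the reinforcement (reward) nudges the agent's value estimates
            Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
            state = next_state
        print(Q)  # Q[1][1] ends up highest: the rewarded state-action pair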

    October 14, 2020
    Rigging the Lottery: Making All Tickets Winners

    Rigging the Lottery: Making All Tickets Winners is a paper published at ICML 2020 with Utku Evci, Trevor Gale, Jacob Menick, and Erich Elsen, where we introduce an algorithm for training sparse neural networks that uses a fixed parameter count and computational cost throughout training, without sacrificing accuracy relative to existing dense-to-sparse training methods. You can read more about it in the paper and in our blog post.
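    A rough numpy sketch of the core connectivity update (simplified from the paper; the function name and swap fraction are mine): periodically drop the smallest-magnitude active weights and grow the same number of inactive connections where the dense gradient is largest, so the parameter count stays fixed:

        import numpy as np

        def drop_and_grow(weights, mask, dense_grad, fraction=0.3):
            """One sparse-connectivity update; `mask` is 1 where a weight is active."""
            k = int(fraction * mask.sum())
            # drop: deactivate the k active weights with the smallest magnitude
            active = np.flatnonzero(mask)
            drop = active[np.argsort(np.abs(weights.flat[active]))[:k]]
            mask.flat[drop] = 0
            # grow: activate the k inactive connections with the largest gradient magnitude
            inactive = np.flatnonzero(mask == 0)
            grow = inactive[np.argsort(-np.abs(dense_grad.flat[inactive]))[:k]]
            mask.flat[grow] = 1
            weights.flat[grow] = 0.0  # newly grown connections start at zero
            return weights * mask, mask

    The paper additionally anneals the update fraction over the course of training, which this sketch omits.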

    September 16, 2020
    Artificial General Relativity

    We (well, I) introduce a New Field In Science which we (I mean I) call Artificial General Relativity. We (here I really mean “we”) have all heard of General Relativity and how it revolutionized our understanding of the world around us. Einstein’s work, although pivotal, failed in one crucial aspect: although it allowed us to describe gravity and spacetime, it did not allow us to control them. In this paper I (switching to “I” to avoid sounding pretentious with “we”) introduce Artificial General Relativity (AGR) which, when achieved, will allow us to control gravity and spacetime.

    April 1, 2020
    GridWorld Playground

    GridWorld playground! I made a website where you can:
    • Draw your own GridWorlds
    • Play around with hyperparameters while the agent is training
    • Transfer values between agents (a small sketch of this idea follows below)
    • “Teleport” the agent to help it during learning
    Hope you find it useful and fun!
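    As a hypothetical illustration of the value-transfer feature (tabular values and all names are my own, not the site's actual code), transferring amounts to seeding a new agent's value table from a trained one, so learning starts from informed estimates instead of zeros:

        import numpy as np

        n_states, n_actions = 25, 4  # assumed 5x5 grid with 4 moves
        rng = np.random.default_rng(0)
        trained_Q = rng.random((n_states, n_actions))  # stand-in for a trained agent's values
        new_Q = np.zeros((n_states, n_actions))        # untrained agent

        # full copy, or blend to keep some of the new agent's own estimates
        blend = 1.0  # 1.0 = full copy, 0.0 = no transfer
        new_Q = blend * trained_Q + (1.0 - blend) * new_Q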

    March 16, 2020
    Tips for Interviewing at Google

    Disclaimer: This post reflects my personal views and not those of my employer. People often ask me: How do I get a job at Google? An essential requirement is passing the interviews; unsurprisingly, this is another common question: How do I pass the Google interviews? While there is no hard-and-fast rule for passing the Google interviews, I do have some tips and guidelines that have helped others (including myself) in the past.

    February 24, 2020
    Tips for preparing your resume

    Disclaimer: This post reflects my personal views and not those of my employer. In my previous post providing tips for interviewing at Google, I included the sentence “If you don’t know anyone at Google, you’ve already applied and haven’t heard back in a while, feel free to send me a note with your CV and I’ll see if there’s something I can do.” I received a number of requests from people who had applied but never heard back.

    February 24, 2020
    Scalable methods for computing state similarity in deterministic MDPs

    This post describes my paper Scalable methods for computing state similarity in deterministic MDPs, published at AAAI 2020. The code is available here. Motivation: We consider distance metrics between states in an MDP. Take the following MDP, where the goal is to reach the green cells. Physical distance between states? Physical distance often fails to capture the similarity properties we'd like. State abstractions: Now imagine we add an exact copy of these states to the MDP (think of it as an additional “floor”).
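    The key computational point (my sketch; see the paper for the exact algorithms and guarantees) is that in a deterministic MDP each action leads to a single next state, so the Wasserstein term in the standard bisimulation-metric update collapses to the metric evaluated at the two next states, avoiding the optimal-transport subproblem that makes the general stochastic update expensive:

        import numpy as np

        def bisim_metric(R, nxt, gamma=0.9, iters=100):
            """Fixed-point iteration for a bisimulation metric in a deterministic MDP.
            R[s, a] is the reward; nxt[s, a] is the (unique) next state."""
            S, _ = R.shape
            d = np.zeros((S, S))
            for _ in range(iters):
                # |R(x,a) - R(y,a)| + gamma * d(x', y'), maximized over actions a
                reward_diff = np.abs(R[:, None, :] - R[None, :, :])
                next_diff = d[nxt[:, None, :], nxt[None, :, :]]
                d = np.max(reward_diff + gamma * next_diff, axis=2)
            return d

    Each sweep costs O(S²A) table lookups, with no per-state-pair linear program to solve.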

    November 22, 2019