Dopamine 2.0: providing more flexibility in reinforcement learning research

2019-02-07 | admin | GoogleDevFeeds

Source: Dopamine 2.0: providing more flexibility in reinforcement learning research from Open Source

Reinforcement learning (RL) has become one of the most popular fields of machine learning, and has seen a number of great advances over the last few years. As a result, there is a growing need from both researchers and educators to have access to a clear and reliable framework for RL research and education.

Last August, we announced Dopamine, our framework for flexible reinforcement learning. For the initial version we decided to focus on a specific type of RL research: value-based agents evaluated on the Atari 2600 framework supported by the Arcade Learning Environment. We were thrilled to see how well it was received by the community: it was featured in a live coding session, included in a recently announced benchmark for RL, named the top “Cool new open source project of 2018” by the Octoverse, and has earned over 7K GitHub stars on our repository.

One of the most common requests we have received is support for more environments. This confirms what we have seen internally, where simpler environments, such as those supported by OpenAI’s Gym, are incredibly useful when testing out new algorithms. We are happy to announce Dopamine 2.0, which includes support for discrete-domain Gym environments (e.g. discrete states and actions). The core of the framework remains unchanged; we have simply generalized the interface with the environment. For backwards compatibility, users will still be able to download version 1.0.
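
To make the idea of a small discrete-domain environment concrete, here is a minimal sketch. It is not Dopamine code: `ChainEnv` is a hypothetical five-state environment that exposes Gym-style `reset` and `step` methods, and plain tabular Q-learning stands in for a value-based agent. All names and hyperparameters are assumptions chosen for illustration.

```python
import random


class ChainEnv:
    """Hypothetical 5-state chain with a Gym-style reset/step interface."""

    num_states = 5
    num_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent stays inside the chain.
        self.state = min(max(self.state + move, 0), self.num_states - 1)
        done = self.state == self.num_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def train_q_learning(env, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * env.num_actions for _ in range(env.num_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.num_actions)
            else:
                # Break ties at random so early episodes still explore.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done, _ = env.step(action)
            # Terminal transitions do not bootstrap from the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


env = ChainEnv()
q_table = train_q_learning(env)
# Greedy action in each non-terminal state.
greedy_policy = [
    max(range(env.num_actions), key=lambda a: q_table[s][a])
    for s in range(env.num_states - 1)
]
print(greedy_policy)
```

Because the agent only relies on `reset` and `step`, any other environment offering the same two methods could be swapped in without touching the training loop, which is the kind of decoupling a generalized environment interface allows.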

We include default configurations for two classic control environments, CartPole and Acrobot; on these environments one can train a Dopamine agent in minutes. Compared with the training time for a standard Atari 2600 game (around 5 days on a standard GPU), these environments allow researchers to iterate much faster on research ideas before testing them out on larger Atari games. We also include a Colaboratory notebook that illustrates how to train an agent on CartPole and Acrobot. Finally, our GymPreprocessing class serves as an example of how to use Dopamine with other custom environments.
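
As a rough illustration of what adapting a custom environment can involve, the sketch below wraps a simulator whose API does not match Gym's in a thin preprocessing class. This is not the code of Dopamine's GymPreprocessing class: `RawThermostat`, `ThermostatPreprocessing`, and every method and constant on them are hypothetical names invented for this example.

```python
class RawThermostat:
    """Hypothetical custom simulator with a non-Gym API."""

    def __init__(self):
        self.temperature = 15

    def restart(self):
        self.temperature = 15

    def apply(self, heat_on):
        self.temperature += 1 if heat_on else -1
        # Keep the temperature within the simulator's supported range.
        self.temperature = max(10, min(25, self.temperature))

    def read(self):
        return self.temperature


class ThermostatPreprocessing:
    """Hypothetical adapter exposing Gym-style reset/step methods."""

    TARGET = 20  # assumed goal temperature for this example

    def __init__(self, environment):
        self.environment = environment
        self.game_over = False

    def reset(self):
        self.environment.restart()
        self.game_over = False
        return self._observation()

    def step(self, action):
        # Action 1 turns the heater on; any other action turns it off.
        self.environment.apply(heat_on=(action == 1))
        observation = self._observation()
        self.game_over = self.environment.read() == self.TARGET
        reward = 1.0 if self.game_over else -0.1
        return observation, reward, self.game_over, {}

    def _observation(self):
        # Shift the temperature into a discrete state index in 0..15.
        return self.environment.read() - 10


env = ThermostatPreprocessing(RawThermostat())
first_observation = env.reset()
for _ in range(5):
    observation, reward, done, info = env.step(1)
print(first_observation, observation, reward, done)
```

The wrapper is the only place that knows about the simulator's own method names, so an agent written against `reset` and `step` needs no changes to run on it.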

We are excited by the new opportunities enabled by Dopamine 2.0, and look forward to seeing what the research community creates with it!

By Pablo Samuel Castro and Marc G. Bellemare, Dopamine Team

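Except as otherwise noted, the content of this article is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the Apache 2.0 License. For details, see our Terms of Service.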

