Deep Reinforcement Learning: A Comprehensive Collection of Resources

Date: 2019-06-22 02:01:04

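Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice. Article link: /gsww404/article/details/103074046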

This work is a project initiated by the Deep Reinforcement Learning Laboratory (DeepRL-Lab).

This article is mirrored in the GitHub repository:

/NeuronDance/DeepRL/tree/master/A-Guide-Resource-For-DeepRL (click to open on GitHub)

Stars, forks, and contributions are welcome.

Contents

1. Books

2. Courses

3. Survey-and-Frontier

4. Environment-and-Framework

5. Baselines-and-Benchmarks

6. Algorithms

7. Applications

8. Advanced-Topics

9. Related-Courses

10. Multi-Agents

11. Paper-Resources

1. Books

Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto, Chinese-Edition, Code

Algorithms for Reinforcement Learning by Csaba Szepesvari (updated)

Deep Reinforcement Learning Hands-On by Maxim Lapan, Code

Reinforcement Learning: State-of-the-Art by Marco Wiering, Martijn van Otterlo

Deep Reinforcement Learning in Action by Alexander Zai and Brandon Brown (in progress)

Grokking Deep Reinforcement Learning by Miguel Morales (in progress)

Multi-Agent Machine Learning: A Reinforcement Approach (Baidu Cloud link) by Howard M. Schwartz

Reinforcement Learning at Alibaba: Technical Evolution and Business Innovation (强化学习在阿里的技术演进与业务创新) by Alibaba Group

Hands-On Reinforcement Learning with Python (Baidu Cloud link)

Reinforcement Learning and Optimal Control by Dimitri P. Bertsekas

2. Courses

UCL Course on RL (★★★) by David Silver, Video-en, Video-zh

OpenAI’s Spinning Up in Deep RL by OpenAI

Udacity-Deep Reinforcement Learning, -10-31

Stanford CS-234: Reinforcement Learning, Videos

DeepMind Advanced Deep Learning & Reinforcement Learning, Videos

GeorgiaTech CS-8803 Deep Reinforcement Learning (?)

UC Berkeley CS294-112 Deep Reinforcement Learning (Fall), Video-zh

Deep RL Bootcamp by Berkeley CA

Thomas Simonini’s Deep Reinforcement Learning Course

CS-6101 Deep Reinforcement Learning, NUS SoC, /, Semester II

Course on Reinforcement Learning by Alessandro Lazaric

Learn Deep Reinforcement Learning in 60 days

3. Survey-and-Frontier

Deep Reinforcement Learning by Yuxi Li

Algorithms for Reinforcement Learning by Morgan & Claypool

Modern Deep Reinforcement Learning Algorithms by Sergey Ivanov (54 pages)

Deep Reinforcement Learning: An Overview

A Brief Survey of Deep Reinforcement Learning

Deep Reinforcement Learning Doesn’t Work Yet (★) by Alex Irpan, Chinese version

Deep Reinforcement Learning that Matters (★) by Peter Henderson, Riashat Islam

A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

An Introduction to Deep Reinforcement Learning

Challenges of Real-World Reinforcement Learning

Topics in Reinforcement Learning

Reinforcement Learning: A Survey, 1996.

A Tutorial Survey of Reinforcement Learning, Sadhana, 1994.

Reinforcement Learning in Robotics, A Survey

A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation

Universal Reinforcement Learning Algorithms: Survey and Experiments

Bayesian Reinforcement Learning: A Survey

Benchmarking Reinforcement Learning Algorithms on Real-World Robots

4. Environment-and-Framework

OpenAI Gym (GitHub) (docs)

rllab (GitHub) (readthedocs)

Ray (Doc)

Dopamine: /google/dopamine (uses some tensorflow)

trfl: /deepmind/trfl (uses tensorflow)

ChainerRL (GitHub) (API: Python)

Surreal GitHub (API: Python) (support: Stanford Vision and Learning Lab). Paper

PyMARL GitHub (support: http://whirl.cs.ox.ac.uk/)

TF-Agents: /tensorflow/agents (uses tensorflow)

TensorForce (GitHub) (uses tensorflow)

RL-Glue (Google Code Archive) (API: C/C++, Java, Matlab, Python, Lisp) (support: Alberta)

MAgent /geek-ai/MAgent (uses tensorflow)

RLlib http://ray.readthedocs.io/en/latest/rllib.html (API: Python)

http://burlap.cs.brown.edu/ (API: Java)

rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch

robotics-rl-srl - S-RL Toolbox: Reinforcement Learning (RL) and State Representation Learning (SRL) for Robotics

pysc2: StarCraft II Learning Environment

Arcade-Learning-Environment

OpenAI universe - A software platform for measuring and training an AI’s general intelligence across the world’s supply of games, websites and other applications

DeepMind Lab - A customisable 3D platform for agent-based AI research

Project Malmo - A platform for Artificial Intelligence experimentation and research built on top of Minecraft by Microsoft

Retro Learning Environment - An AI platform for reinforcement learning based on video game emulators. Currently supports SNES and Sega Genesis. Compatible with OpenAI gym.

torch-twrl - A package that enables reinforcement learning in Torch by Twitter

UETorch - A Torch plugin for Unreal Engine 4 by Facebook

TorchCraft - Connecting Torch to StarCraft

rllab - A framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym

TensorForce - Practical deep reinforcement learning on TensorFlow with Gitter support and OpenAI Gym/Universe/DeepMind Lab integration.

OpenAI lab - An experimentation system for Reinforcement Learning using OpenAI Gym, Tensorflow, and Keras.

keras-rl - State-of-the-art deep reinforcement learning algorithms in Keras designed for compatibility with OpenAI.

BURLAP - Brown-UMBC Reinforcement Learning and Planning, a library written in Java

MAgent - A Platform for Many-agent Reinforcement Learning.

Ray RLlib - Ray RLlib is a reinforcement learning library that aims to provide both performance and composability.

SLM Lab - A research framework for Deep Reinforcement Learning using Unity, OpenAI Gym, PyTorch, Tensorflow.

Unity ML Agents - Create reinforcement learning environments using the Unity Editor

Intel Coach - Coach is a Python reinforcement learning research framework containing implementations of many state-of-the-art algorithms.

ELF - An End-To-End, Lightweight and Flexible Platform for Game Research

Unity ML-Agents Toolkit

rlkit

/envs/#classic_control

/erlerobot/gym-gazebo

/robotology/gym-ignition

/dartsim/gym-dart

/Roboy/gym-roboy

/openai/retro

/openai/gym-soccer

/duckietown/gym-duckietown

/Unity-Technologies/ml-agents (Unity, multiagent)

/koulanurag/ma-gym (multiagent)

/ucuapps/modelicagym

/mwydmuch/ViZDoom

/benelot/pybullet-gym

/Healthcare-Robotics/assistive-gym

/Microsoft/malmo

/nadavbh12/Retro-Learning-Environment

/twitter/torch-twrl

/arex18/rocket-lander

/ppaquette/gym-doom

/thedimlebowski/Trading-Gym

/Phylliade/awesome-openai-gym-environments

/deepmind/pysc2 (by DeepMind) (Blizzard StarCraft II Learning Environment (SC2LE) component)

5. Baselines-and-Benchmarks

/openai/baselines [stable-baselines]

rl-baselines-zoo

ROBEL (google-research/robel)

RLBench (stepjam/RLBench)

https://martin-/sota/#reinforcment-learning

/rlworkgroup/garage

Atari Environments Scores

6. Algorithms

1. DQN series

Playing Atari with Deep Reinforcement Learning [arxiv] [code]

Deep Reinforcement Learning with Double Q-learning [arxiv] [code]

Dueling Network Architectures for Deep Reinforcement Learning [arxiv] [code]

Prioritized Experience Replay [arxiv] [code]

Noisy Networks for Exploration [arxiv] [code]

A Distributional Perspective on Reinforcement Learning [arxiv] [code]

Rainbow: Combining Improvements in Deep Reinforcement Learning [arxiv] [code]

2. Others

Algorithm Coding

Deep-Reinforcement-Learning-Algorithms-with-PyTorch

7. Applications

7.1 Basic

Reinforcement Learning Applications

IntelliLight: A Reinforcement Learning Approach for Intelligent Traffic Light Control by Hua Wei, Guanjie Zheng

Deep Reinforcement Learning by Yuxi Li

Deep Reinforcement Learning in Robotics

7.2 Robotics

Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (Kohl, ICRA) [Paper]

Robot Motor Skill Coordination with EM-based Reinforcement Learning (Kormushev, IROS) [Paper] [Video]

Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (Hester, ICRA) [Paper] [Video]

Autonomous Skill Acquisition on a Mobile Manipulator (Konidaris, AAAI) [Paper] [Video]

PILCO: A Model-Based and Data-Efficient Approach to Policy Search (Deisenroth, ICML) [Paper]

Incremental Semantically Grounded Learning from Demonstration (Niekum, RSS) [Paper]

Efficient Reinforcement Learning for Robots using Informative Simulated Priors (Cutler, ICRA) [Paper] [Video]

Robots that can adapt like animals (Cully, Nature) [Paper] [Video] [Code]

Black-Box Data-efficient Policy Search for Robotics (Chatzilygeroudis, IROS) [Paper] [Video] [Code]

8. Advanced-Topics

8.1. Model-free RL

Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop. paper

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller

Human-level control through deep reinforcement learning, Nature. paper

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis

Deep Reinforcement Learning with Double Q-learning, AAAI 16. paper

Hado van Hasselt, Arthur Guez, David Silver

Dueling Network Architectures for Deep Reinforcement Learning, ICML 16. paper

Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas

Deep Recurrent Q-Learning for Partially Observable MDPs, AAAI 15. paper

Matthew Hausknecht, Peter Stone

Prioritized Experience Replay, ICLR. paper

Tom Schaul, John Quan, Ioannis Antonoglou, David Silver

Asynchronous Methods for Deep Reinforcement Learning, ICML. paper

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu

A Distributional Perspective on Reinforcement Learning, ICML. paper

Marc G. Bellemare, Will Dabney, Rémi Munos

Noisy Networks for Exploration, ICLR. paper

Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg

Rainbow: Combining Improvements in Deep Reinforcement Learning, AAAI. paper

Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver

8.2. Model-based RL

Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, NIPS. paper

Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, ICML. paper

Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine

Value Prediction Network, NIPS. paper

Junhyuk Oh, Satinder Singh, Honglak Lee

Imagination-Augmented Agents for Deep Reinforcement Learning, NIPS. paper

Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra

Continuous Deep Q-Learning with Model-based Acceleration, ICML. paper

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, CoRL. paper

Gabriel Kalweit, Joschka Boedecker

Model-Ensemble Trust-Region Policy Optimization, ICLR. paper

Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, NIPS. paper

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

Dyna, an integrated architecture for learning, planning, and reacting, ACM, 1991. paper

Sutton, Richard S

Learning Continuous Control Policies by Stochastic Value Gradients, NIPS. paper

Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa, Tom Erez

Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks, ICLR. paper

Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft

8.3 Function Approximation methods (Least-Squares Temporal Difference, Least-Squares Policy Iteration)

Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, 1996. [Paper]

Model-Free Least Squares Policy Iteration, NIPS, 2001. [Paper] [Code]

8.4 Policy Search/Policy Gradient

Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999. [Paper]

Natural Actor-Critic, ECML. [Paper]

Policy Search for Motor Primitives in Robotics, NIPS. [Paper]

Relative Entropy Policy Search, AAAI. [Paper]

Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML. [Paper]

Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA. [Paper]

PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML. [Paper]

Learning Dynamic Arm Motions for Postural Recovery, Humanoids. [Paper]

Black-Box Data-efficient Policy Search for Robotics, IROS. [Paper]

8.5 Hierarchical RL

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, Artificial Intelligence, 1999. [Paper]

Building Portable Options: Skill Transfer in Reinforcement Learning, IJCAI. [Paper]

8.6 Inverse RL

updating…

8.7 Meta RL

updating…

8.8. Rewards

Deep Reinforcement Learning Models: Tips & Tricks for Writing Reward Functions

Meta Reward Learning

8.9. Policy Gradient

Policy Gradient

8.10. Distributed Reinforcement Learning

Asynchronous Methods for Deep Reinforcement Learning, ICML. paper

GA3C: GPU-based A3C for Deep Reinforcement Learning by Iuri Frosio, Stephen Tyree, NIPS

Distributed Prioritized Experience Replay by Dan Horgan, John Quan, David Budden, ICLR

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures by Lasse Espeholt, Hubert Soyer, Remi Munos, ICML

Distributed Distributional Deterministic Policy Gradients by Gabriel Barth-Maron, Matthew W. Hoffman, ICLR

Emergence of Locomotion Behaviours in Rich Environments by Nicolas Heess, Dhruva TB, Srinivasan Sriram

GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning by Jacky Liang, Viktor Makoviychuk

Recurrent Experience Replay in Distributed Reinforcement Learning by Steven Kapturowski, Georg Ostrovski, ICLR

9. Related-Courses

9.1. Game Theory

Game Theory Course, Yale University

Game Theory - The Full Course, Stanford University

Algorithmic Game Theory (CS364A, Fall), Stanford University

9.2. Other

10. Multi-Agents

10.1 Tutorial and Books

Deep Multi-Agent Reinforcement Learning by Jakob N Foerster. PhD Thesis.

Multi-Agent Machine Learning: A Reinforcement Approach by H. M. Schwartz.

Multiagent Reinforcement Learning by Daan Bloembergen, Daniel Hennes, Michael Kaisers, Peter Vrancx. ECML.

Multiagent systems: Algorithmic, game-theoretic, and logical foundations by Shoham Y, Leyton-Brown K. Cambridge University Press.

10.2 Review Papers

A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems by Silva, Felipe Leno da; Costa, Anna Helena Reali. JAIR.

Autonomously Reusing Knowledge in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Taylor, Matthew E.; Costa, Anna Helena Reali. IJCAI.

Deep Reinforcement Learning Variants of Multi-Agent Learning Algorithms by Castaneda A O.

Evolutionary Dynamics of Multi-Agent Learning: A Survey by Bloembergen, Daan, et al. JAIR.

Game theory and multi-agent reinforcement learning by Nowé A, Vrancx P, De Hauwere Y M. Reinforcement Learning. Springer Berlin Heidelberg.

Multi-agent reinforcement learning: An overview by Buşoniu L, Babuška R, De Schutter B. Innovations in multi-agent systems and applications-1. Springer Berlin Heidelberg.

A comprehensive survey of multi-agent reinforcement learning by Busoniu L, Babuska R, De Schutter B. IEEE Transactions on Systems Man and Cybernetics Part C Applications and Reviews.

If multi-agent learning is the answer, what is the question? by Shoham Y, Powers R, Grenager T. Artificial Intelligence.

From single-agent to multi-agent reinforcement learning: Foundational concepts and methods by Neto G. Learning theory course.

Evolutionary game theory and multi-agent reinforcement learning by Tuyls K, Nowé A. The Knowledge Engineering Review.

An Overview of Cooperative and Competitive Multiagent Learning by Pieter Jan ’t Hoen, Karl Tuyls, Liviu Panait, Sean Luke, J. A. La Poutré. AAMAS’s workshop LAMAS.

Cooperative multi-agent learning: the state of the art by Liviu Panait and Sean Luke.

10.3 Framework papers

Mean Field Multi-Agent Reinforcement Learning by Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, and Jun Wang. ICML.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments by Lowe R, Wu Y, Tamar A, et al. arXiv.

Deep Decentralized Multi-task Multi-Agent RL under Partial Observability by Omidshafiei S, Pazis J, Amato C, et al. arXiv.

Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games by Peng P, Yuan Q, Wen Y, et al. arXiv.

Robust Adversarial Reinforcement Learning by Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta. arXiv.

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning by Foerster J, Nardelli N, Farquhar G, et al. arXiv.

Multiagent reinforcement learning with sparse interactions by negotiation and knowledge transfer by Zhou L, Yang P, Chen C, et al. IEEE Transactions on Cybernetics.

Decentralised multi-agent reinforcement learning for dynamic and uncertain environments by Marinescu A, Dusparic I, Taylor A, et al. arXiv.

CLEANing the reward: counterfactual actions to remove exploratory action noise in multiagent learning by HolmesParker C, Taylor M E, Agogino A, et al. AAMAS.

Bayesian reinforcement learning for multiagent systems with state uncertainty by Amato C, Oliehoek F A. MSDM Workshop.

Multiagent learning: Basics, challenges, and prospects by Tuyls, Karl, and Gerhard Weiss. AI Magazine.

Classes of multiagent q-learning dynamics with epsilon-greedy exploration by Wunder M, Littman M L, Babes M. ICML.

Conditional random fields for multi-agent reinforcement learning by Zhang X, Aberdeen D, Vishwanathan S V N. ICML.

Multi-agent reinforcement learning using strategies and voting by Partalas, Ioannis, Ioannis Feneris, and Ioannis Vlahavas. ICTAI.

A reinforcement learning scheme for a partially-observable multi-agent game by Ishii S, Fujita H, Mitsutake M, et al. Machine Learning.

Asymmetric multiagent reinforcement learning by Könönen V. Web Intelligence and Agent Systems.

Adaptive policy gradient in multiagent learning by Banerjee B, Peng J. AAMAS.

Reinforcement learning to play an optimal Nash equilibrium in team Markov games by Wang X, Sandholm T. NIPS, 2002.

Multiagent learning using a variable learning rate by Michael Bowling and Manuela Veloso, 2002.

Value-function reinforcement learning in Markov game by Littman M L. Cognitive Systems Research, 2001.

Hierarchical multi-agent reinforcement learning by Makar, Rajbala, Sridhar Mahadevan, and Mohammad Ghavamzadeh. The fifth international conference on Autonomous agents, 2001.

An analysis of stochastic game theory for multiagent reinforcement learning by Michael Bowling and Manuela Veloso, 2000.

10.4 Joint action learning

AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents by Conitzer V, Sandholm T. Machine Learning.

Extending Q-Learning to General Adaptive Multi-Agent Systems by Tesauro, Gerald. NIPS.

Multiagent reinforcement learning: theoretical framework and an algorithm by Hu, Junling, and Michael P. Wellman. ICML, 1998.

The dynamics of reinforcement learning in cooperative multiagent systems by Claus C, Boutilier C. AAAI, 1998.

Markov games as a framework for multi-agent reinforcement learning by Littman, Michael L. ICML, 1994.

10.5 Cooperation and competition

Emergent complexity through multi-agent competition by Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch.

Learning with opponent learning awareness by Jakob Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch.

Multi-agent Reinforcement Learning in Sequential Social Dilemmas by Leibo J Z, Zambaldi V, Lanctot M, et al. arXiv. [Post]

Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds by Roi Ceren, Prashant Doshi, and Bikramjit Banerjee, pp. 530-538, AAMAS.

Opponent Modeling in Deep Reinforcement Learning by He H, Boyd-Graber J, Kwok K, et al. ICML.

Multiagent cooperation and competition with deep reinforcement learning by Tampuu A, Matiisen T, Kodelja D, et al. arXiv.

Emotional multiagent reinforcement learning in social dilemmas by Yu C, Zhang M, Ren F. International Conference on Principles and Practice of Multi-Agent Systems.

Multi-agent reinforcement learning in common interest and fixed sum stochastic games: An experimental study by Bab, Avraham, and Ronen I. Brafman. Journal of Machine Learning Research.

Combining policy search with planning in multi-agent cooperation by Ma J, Cameron S. Robot Soccer World Cup.

Collaborative multiagent reinforcement learning by payoff propagation by Kok J R, Vlassis N. JMLR.

Learning to cooperate in multi-agent social dilemmas by de Cote E M, Lazaric A, Restelli M. AAMAS.

Learning to compete, compromise, and cooperate in repeated general-sum games by Crandall J W, Goodrich M A. ICML.

Sparse cooperative Q-learning by Kok J R, Vlassis N. ICML.

10.6 Coordination

Coordinated Multi-Agent Imitation Learning by Le H M, Yue Y, Carr P. arXiv.

Reinforcement social learning of coordination in networked cooperative multiagent systems by Hao J, Huang D, Cai Y, et al. AAAI Workshop.

Coordinating multi-agent reinforcement learning with limited communication by Zhang, Chongjie, and Victor Lesser. AAMAS.

Coordination guided reinforcement learning by Lau Q P, Lee M L, Hsu W. AAMAS.

Coordination in multiagent reinforcement learning: a Bayesian approach by Chalkiadakis G, Boutilier C. AAMAS.

Coordinated reinforcement learning by Guestrin C, Lagoudakis M, Parr R. ICML, 2002.

Reinforcement learning of coordination in cooperative multi-agent systems by Kapetanakis S, Kudenko D. AAAI/IAAI, 2002.

10.7 Security

Markov Security Games: Learning in Spatial Security Problems by Klima R, Tuyls K, Oliehoek F. The Learning, Inference and Control of Multi-Agent Systems at NIPS.

Cooperative Capture by Multi-Agent using Reinforcement Learning, Application for Security Patrol Systems by Yasuyuki S, Hirofumi O, Tadashi M, et al. Control Conference (ASCC).

Improving learning and adaptation in security games by exploiting information asymmetry by He X, Dai H, Ning P. INFOCOM.

10.8 Self-Play

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning by Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel. NIPS.

Deep reinforcement learning from self-play in imperfect-information games by Heinrich, Johannes, and David Silver. arXiv.

Fictitious Self-Play in Extensive-Form Games by Heinrich, Johannes, Marc Lanctot, and David Silver. ICML.

10.9 Learning To Communicate

Emergent Communication through Negotiation by Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z Leibo, Karl Tuyls, Stephen Clark.

Emergence of Linguistic Communication From Referential Games with Symbolic and Pixel Input by Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark.

Emergence of Language with Multi-Agent Games: Learning to Communicate with Sequences of Symbols by Serhii Havrylov, Ivan Titov. ICLR Workshop.

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning by Abhishek Das, Satwik Kottur, et al. arXiv.

Emergence of Grounded Compositional Language in Multi-Agent Populations by Igor Mordatch, Pieter Abbeel. arXiv. [Post]

Cooperation and communication in multiagent deep reinforcement learning by Hausknecht M J.

Multi-agent cooperation and the emergence of (natural) language by Lazaridou A, Peysakhovich A, Baroni M. arXiv.

Learning to communicate to solve riddles with deep distributed recurrent q-networks by Foerster J N, Assael Y M, de Freitas N, et al. arXiv.

Learning to communicate with deep multi-agent reinforcement learning by Foerster J, Assael Y M, de Freitas N, et al. NIPS.

Learning multiagent communication with backpropagation by Sukhbaatar S, Fergus R. NIPS.

Efficient distributed reinforcement learning through agreement by Varshavskaya P, Kaelbling L P, Rus D. Distributed Autonomous Robotic Systems.

10.10 Transfer Learning

Simultaneously Learning and Advising in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Glatt, Ruben; and Costa, Anna Helena Reali. AAMAS.

Accelerating Multiagent Reinforcement Learning through Transfer Learning by Silva, Felipe Leno da; and Costa, Anna Helena Reali. AAAI.

Accelerating multi-agent reinforcement learning with dynamic co-learning by Garant D, da Silva B C, Lesser V, et al. Technical report.

Transfer learning in multi-agent systems through parallel transfer by Taylor, Adam, et al. ICML.

Transfer learning in multi-agent reinforcement learning domains by Boutsioukis, Georgios, Ioannis Partalas, and Ioannis Vlahavas. European Workshop on Reinforcement Learning.

Transfer Learning for Multi-agent Coordination by Vrancx, Peter, Yann-Michaël De Hauwere, and Ann Nowé. ICAART.

10.11 Imitation and Inverse Reinforcement Learning

Multi-Agent Adversarial Inverse Reinforcement Learning by Lantao Yu, Jiaming Song, Stefano Ermon. ICML.

Multi-Agent Generative Adversarial Imitation Learning by Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon. NeurIPS.

Cooperative inverse reinforcement learning by Hadfield-Menell D, Russell S J, Abbeel P, et al. NIPS.

Comparison of Multi-agent and Single-agent Inverse Learning on a Simulated Soccer Example by Lin X, Beling P A, Cogill R. arXiv.

Multi-agent inverse reinforcement learning for zero-sum games by Lin X, Beling P A, Cogill R. arXiv.

Multi-robot inverse reinforcement learning under occlusion with interactions by Bogert K, Doshi P. AAMAS.

Multi-agent inverse reinforcement learning by Natarajan S, Kunapuli G, Judah K, et al. ICMLA.

10.12 Meta Learning

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments by Al-Shedivat, M.

10.13 Application

MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence by Zheng L et al. NIPS & AAAI Demo. (Github Page)

Collaborative Deep Reinforcement Learning for Joint Object Search by Kong X, Xin B, Wang Y, et al. arXiv.

Multi-Agent Stochastic Simulation of Occupants for Building Simulation by Chapman J, Siebers P, Darren R. Building Simulation.

Extending No-MASS: Multi-Agent Stochastic Simulation for Demand Response of residential appliances by Sancho-Tomás A, Chapman J, Sumner M, Darren R. Building Simulation.

Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving by Shalev-Shwartz S, Shammah S, Shashua A. arXiv.

Applying multi-agent reinforcement learning to watershed management by Mason, Karl, et al. Proceedings of the Adaptive and Learning Agents workshop at AAMAS.

Crowd Simulation Via Multi-Agent Reinforcement Learning by Torrey L. AAAI.

Traffic light control by multiagent reinforcement learning systems by Bakker, Bram, et al. Interactive Collaborative Information Systems.

Multiagent reinforcement learning for urban traffic control using coordination graphs by Kuyer, Lior, et al. Joint European Conference on Machine Learning and Knowledge Discovery in Databases.

A multi-agent Q-learning framework for optimizing stock trading systems by Lee J W, Jangmin O. DEXA, 2002.

Multi-agent reinforcement learning for traffic light control by Wiering, Marco. ICML, 2000.

11. Paper-Resources

-07

Benchmarking Model-Based Reinforcement Learning

Learning World Graphs to Accelerate Hierarchical Reinforcement Learning

Perspective Taking in Deep Reinforcement Learning Agents

On the Weaknesses of Reinforcement Learning for Neural Machine Translation

Dynamic Face Video Segmentation via Reinforcement Learning

Striving for Simplicity in Off-policy Deep Reinforcement Learning

Intrinsic Motivation Driven Intuitive Physics Learning using Deep Reinforcement Learning with Intrinsic Reward Normalization

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning

Attentive Multi-Task Deep Reinforcement Learning

Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning

Google Research Football: A Novel Reinforcement Learning Environment

Deep Reinforcement Learning in Financial Markets

Dynamic Input for Deep Reinforcement Learning in Autonomous Driving

Characterizing Attacks on Deep Reinforcement Learning

Deep Reinforcement Learning for Clinical Decision Support: A Brief Survey

VRLS: A Unified Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications

Deep Reinforcement Learning for Autonomous Internet of Things: Model, Applications and Challenges

Arena: a toolkit for Multi-Agent Reinforcement Learning

GPU-Accelerated Atari Emulation for Reinforcement Learning

Photonic architecture for reinforcement learning

Jun

Towards Empathic Deep Q-Learning

Ranking Policy Gradient

Hyp-RL: Hyperparameter Optimization by Reinforcement Learning

Modern Deep Reinforcement Learning Algorithms

A Framework for Automatic Question Generation from Text using Deep Reinforcement Learning

Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks

Is multiagent deep reinforcement learning the answer or the question? A brief survey

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Cooperative Lane Changing via Deep Reinforcement Learning

A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning

Explaining Reinforcement Learning to Mere Mortals: An Empirical Study

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Autonomous Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking

A Survey of Reinforcement Learning Informed by Natural Language

Load Balancing for Ultra-Dense Networks: A Deep Reinforcement Learning Based Approach

Deep Reinforcement Learning Architecture for Continuous Power Allocation in High Throughput Satellites

Harnessing Reinforcement Learning for Neural Motion Planning

April-May

Reinforcement Learning with Probabilistic Guarantees for Autonomous Driving

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

On the Generalization Gap in Reparameterizable Reinforcement Learning

Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations

Inverse Reinforcement Learning in Contextual MDPs

Teaching on a Budget in Multi-Agent Deep Reinforcement Learning

Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning

Generation of Policy-Level Explanations for Reinforcement Learning

A Control-Model-Based Approach for Reinforcement Learning

Interactive Teaching Algorithms for Inverse Reinforcement Learning

Snooping Attacks on Deep Reinforcement Learning

March

IRLAS: Inverse Reinforcement Learning for Architecture Search

Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning

M3RL: Mind-aware Multi-agent Management Reinforcement Learning

Concurrent Meta Reinforcement Learning

Horizon: Facebook’s Open Source Applied Reinforcement Learning Platform

Using Natural Language for Reward Shaping in Reinforcement Learning

Model-Based Reinforcement Learning for Atari

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning

Hacking Google reCAPTCHA v3 using Reinforcement Learning

Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2

Deep Reinforcement Learning with Feedback-based Exploration

Deep Reinforcement Learning for Autonomous Driving

Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction

Explaining Reinforcement Learning to Mere Mortals: An Empirical Study

Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

Autoregressive Policies for Continuous Control Deep Reinforcement Learning

Feb

Distributional reinforcement learning with linear function approximation
Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance
Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
Reinforcement Learning for Optimal Load Distribution Sequencing in Resource-Sharing System
Learning to Schedule Communication in Multi-agent Reinforcement Learning
On Reinforcement Learning for Full-length Game of StarCraft
Implicit Policy for Reinforcement Learning
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Visual Rationalizations in Deep Reinforcement Learning for Atari Games
Statistics and Samples in Distributional Reinforcement Learning
A Comparative Analysis of Expected and Distributional Reinforcement Learning
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following
Investigating Generalisation in Continuous Deep Reinforcement Learning
Model-Free Adaptive Optimal Control of Episodic Fixed-Horizon Manufacturing Processes using Reinforcement Learning
Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning
Towards the Next Generation Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking
Parenting: Safe Reinforcement Learning from Human Input
Reinforcement Learning Without Backpropagation or a Clock
Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning
A new Potential-Based Reward Shaping for Reinforcement Learning Agent
How to Combine Tree-Search Methods in Reinforcement Learning
Unsupervised Basis Function Adaptation for Reinforcement Learning
Communication Topologies Between Learning Agents in Deep Reinforcement Learning
Logically-Constrained Reinforcement Learning
Hyperbolic Embeddings for Learning Options in Hierarchical Reinforcement Learning
ProLoNets: Neural-encoding Human Experts’ Domain Knowledge to Warm Start Reinforcement Learning
A Framework for Automated Cellular Network Tuning with Reinforcement Learning
Deep Reinforcement Learning for Search, Recommendation, and Online Advertising: A Survey
The Value Function Polytope in Reinforcement Learning
Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations
Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic
Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning
Verifiably Safe Off-Model Reinforcement Learning
Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning
Optimal Tap Setting of Voltage Regulation Transformers Using Batch Reinforcement Learning
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
Exploration versus exploitation in reinforcement learning: a stochastic control approach
ACTRCE: Augmenting Experience via Teacher’s Advice For Multi-Goal Reinforcement Learning
End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning
WiseMove: A Framework for Safe Deep Reinforcement Learning for Autonomous Driving
Emergence of Hierarchy via Reinforcement Learning Using a Multiple Timescale Stochastic RNN

Jan

Federated Reinforcement Learning
Verifiable Reinforcement Learning via Policy Extraction
QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks
Complementary reinforcement learning towards explainable agents
The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Reinforcement Learning of Markov Decision Processes with Peak Constraints
Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning
Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
A Short Survey on Probabilistic Reinforcement Learning
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
Recurrent Control Nets for Deep Reinforcement Learning
Amplifying the Imitation Effect for Reinforcement Learning of UCAV’s Mission Execution
Multi-agent Reinforcement Learning Embedded Game for the Optimization of Building Energy Control and Power System Planning
Representation Learning on Graphs: A Reinforcement Learning Application
Evolutionarily-Curated Curriculum Learning for Deep Reinforcement Learning Agents
Exploring applications of deep reinforcement learning for real-world autonomous driving systems
AlphaSeq: Sequence Discovery with Deep Reinforcement Learning
Exploration versus exploitation in reinforcement learning: a stochastic control approach
Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks
Energy-Efficient Thermal Comfort Control in Smart Buildings via Deep Reinforcement Learning
Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
Improving Coordination in Multi-Agent Deep Reinforcement Learning through Memory-driven Communication
Low Level Control of a Quadrotor with Deep Model-Based Reinforcement learning
Accelerated Methods for Deep Reinforcement Learning
Motion Perception in Reinforcement Learning with Dynamic Objects
A New Tensioning Method using Deep Reinforcement Learning for Surgical Pattern Cutting
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications
Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning
Uncertainty-Based Out-of-Distribution Detection in Deep Reinforcement Learning
Risk-Aware Active Inverse Reinforcement Learning
A dual mode adaptive basal-bolus advisor based on reinforcement learning
What Should I Do Now? Marrying Reinforcement Learning and Symbolic Planning
Deep Reinforcement Learning for Imbalanced Classification
Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization
Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
Optimal Decision-Making in Mixed-Agent Partially Observable Stochastic Environments via Reinforcement Learning
Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals
A Critical Investigation of Deep Reinforcement Learning for Navigation
Accelerating Goal-Directed Reinforcement Learning by Model Characterization
Machine Teaching in Hierarchical Genetic Reinforcement Learning: Curriculum Design of Reward Functions for Swarm Shepherding
Reinforcement Learning Using Quantum Boltzmann Machines
Communication-Efficient Distributed Reinforcement Learning
DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation
Human-Like Autonomous Car-Following Model with Deep Reinforcement Learning
Adversarial Text Generation Without Reinforcement Learning
End-to-End Video Captioning with Multitask Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning. arxiv
A Deep Reinforcement Learning Chatbot (Short Version). arxiv
AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search. arxiv ⭐️
A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress. arxiv
Composable Deep Reinforcement Learning for Robotic Manipulation. arxiv
Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication. arxiv
Deep Reinforcement Fuzzing. arxiv
Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis. arxiv
Deep Reinforcement Learning For Sequence to Sequence Models. arxiv code
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods. arxiv
Deep Reinforcement Learning in Portfolio Management. arxiv code
Deep Reinforcement Learning using Capsules in Advanced Game Environments. arxiv
Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft. arxiv
Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes. arxiv code
Diversity is All You Need: Learning Skills without a Reward Function. arxiv
Faster Deep Q-learning using Neural Episodic Control. arxiv
Feedback-Based Tree Search for Reinforcement Learning. arxiv
Feudal Reinforcement Learning for Dialogue Management in Large Domains. arxiv
Forward-Backward Reinforcement Learning. arxiv
Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies. arxiv
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. arxiv
Kickstarting Deep Reinforcement Learning. arxiv
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. arxiv
Meta Reinforcement Learning with Latent Variable Gaussian Processes. arxiv
Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches. arxiv
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations. arxiv
Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents. arxiv
Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning. arxiv
Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. arxiv
Reinforcement Learning from Imperfect Demonstrations. arxiv
Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. arxiv
RUDDER: Return Decomposition for Delayed Rewards. arxiv code
Semi-parametric Topological Memory for Navigation. arxiv tensorflow
Shared Autonomy via Deep Reinforcement Learning. arxiv
Setting up a Reinforcement Learning Task with a Real-World Robot. arxiv
Simple random search provides a competitive approach to reinforcement learning. arxiv code
Unsupervised Meta-Learning for Reinforcement Learning. arxiv
Using reinforcement learning to learn how to play text-based games. arxiv

A Deep Reinforcement Learning Chatbot. arxiv
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem. arxiv code
A Deep Reinforced Model for Abstractive Summarization. arxiv
A Distributional Perspective on Reinforcement Learning. arxiv
A Laplacian Framework for Option Discovery in Reinforcement Learning. arxiv ⭐️
Boosting the Actor with Dual Critic. arxiv
Bridging the Gap Between Value and Policy Based Reinforcement Learning. arxiv
Car Racing using Reinforcement Learning. pdf
Cold-Start Reinforcement Learning with Softmax Policy Gradients. arxiv
Curiosity-driven Exploration by Self-supervised Prediction. arxiv tensorflow
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning. arxiv code
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning. arxiv code
Deep Reinforcement Learning: An Overview. arxiv
Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward. arxiv code
Deep reinforcement learning from human preferences. arxiv
Deep Reinforcement Learning that Matters. arxiv code
Device Placement Optimization with Reinforcement Learning. arxiv
Distributional Reinforcement Learning with Quantile Regression. arxiv
End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning. arxiv
Evolution Strategies as a Scalable Alternative to Reinforcement Learning. arxiv
Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning. arxiv
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations. arxiv
Learning how to Active Learn: A Deep Reinforcement Learning Approach. arxiv tensorflow
Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning. arxiv tensorflow
MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence. arxiv code ⭐️
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arxiv
Micro-Objective Learning: Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals. arxiv
Neural Architecture Search with Reinforcement Learning. arxiv tensorflow
Neural Map: Structured Memory for Deep Reinforcement Learning. arxiv
Observational Learning by Reinforcement Learning. arxiv
Overcoming Exploration in Reinforcement Learning with Demonstrations. arxiv
Practical Network Blocks Design with Q-Learning. arxiv
Rainbow: Combining Improvements in Deep Reinforcement Learning. arxiv
Reinforcement Learning for Architecture Search by Network Transformation. arxiv code
Reinforcement Learning via Recurrent Convolutional Neural Networks. arxiv code
Reinforcement Learning with a Corrupted Reward Channel. arxiv ⭐️
Reinforcement Learning with Deep Energy-Based Policies. arxiv code
Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads. arxiv
Robust Deep Reinforcement Learning with Adversarial Attacks. arxiv
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. arxiv
Shallow Updates for Deep Reinforcement Learning. arxiv code
Stochastic Neural Networks for Hierarchical Reinforcement Learning. pdf code
Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing. arxiv code
Task-Oriented Query Reformulation with Reinforcement Learning. arxiv code
Teaching a Machine to Read Maps with Deep Reinforcement Learning. arxiv code
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning. arxiv code
Value Prediction Network. arxiv
Variational Deep Q Network. arxiv
Virtual-to-real Deep Reinforcement Learning: Continuous Control of Mobile Robots for Mapless Navigation. arxiv
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning. arxiv

Asynchronous Methods for Deep Reinforcement Learning. [arxiv] ⭐️
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR. [arxiv]
A New Softmax Operator for Reinforcement Learning. [url]
Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML. [arxiv]
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR. [arxiv]
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR. [arxiv]
Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv. [url]
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML. [arxiv]
Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML. [arxiv]
Continuous control with deep reinforcement learning. [arxiv] ⭐️
Deep Successor Reinforcement Learning. [arxiv]
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop. [arxiv]
Deep Exploration via Bootstrapped DQN. [arxiv] ⭐️
Deep Reinforcement Learning for Dialogue Generation. [arxiv] [tensorflow]
Deep Reinforcement Learning in Parameterized Action Space. [arxiv] ⭐️
Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments. [url]
Designing Neural Network Architectures using Reinforcement Learning. [arxiv] [code]
Dialogue manager domain adaptation using Gaussian process reinforcement learning. [arxiv]
End-to-End Reinforcement Learning of Dialogue Agents for Information Access. [arxiv]
Generating Text with Deep Reinforcement Learning. [arxiv]
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv. [arxiv]
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv. [arxiv]
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv. [arxiv]
Hierarchical Object Detection with Deep Reinforcement Learning. [arxiv]
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR. [arxiv]
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI. [arxiv]
Interactive Spoken Content Retrieval by Deep Reinforcement Learning. [arxiv]
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv. [url]
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv. [url]
Learning to compose words into sentences with reinforcement learning. [url]
Loss is its own Reward: Self-Supervision for Reinforcement Learning. [arxiv]
Model-Free Episodic Control. [arxiv]
Mastering the game of Go with deep neural networks and tree search. [nature] ⭐️
MazeBase: A Sandbox for Learning from Games. [arxiv]
Neural Architecture Search with Reinforcement Learning. [pdf]
Neural Combinatorial Optimization with Reinforcement Learning. [arxiv]
Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning. [url]
Online Sequence-to-Sequence Active Learning for Open-Domain Dialogue Generation, arXiv. [arxiv]
Policy Distillation, A. A. Rusu et al., ICLR. [arxiv]
Prioritized Experience Replay. [arxiv] ⭐️
Reinforcement Learning Using Quantum Boltzmann Machines. [arxiv]
Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al. [arxiv]
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving. [arxiv]
Sample-efficient Deep Reinforcement Learning for Dialog Control. [url]
Self-Correcting Models for Model-Based Reinforcement Learning. [url]
Unifying Count-Based Exploration and Intrinsic Motivation. [arxiv]
Value Iteration Networks. [arxiv]

ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources. arxiv
Action-Conditional Video Prediction using Deep Networks in Atari Games. arxiv ⭐️
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning. arxiv ⭐️
[DDPG] Continuous control with deep reinforcement learning. arxiv ⭐️
[NAF] Continuous Deep Q-Learning with Model-based Acceleration. arxiv ⭐️
Dueling Network Architectures for Deep Reinforcement Learning. arxiv ⭐️
Deep Reinforcement Learning with an Action Space Defined by Natural Language. arxiv
Deep Reinforcement Learning with Double Q-learning. arxiv ⭐️
Deep Recurrent Q-Learning for Partially Observable MDPs. arxiv ⭐️
DeepMPC: Learning Deep Latent Features for Model Predictive Control. pdf
Deterministic Policy Gradient Algorithms. pdf ⭐️
End-to-End Training of Deep Visuomotor Policies. arxiv ⭐️
Giraffe: Using Deep Reinforcement Learning to Play Chess. arxiv
Generating Text with Deep Reinforcement Learning. arxiv
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies. arxiv
Human-level control through deep reinforcement learning. nature ⭐️
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models. arxiv ⭐️
Learning Simple Algorithms from Examples. arxiv
Language Understanding for Text-based Games Using Deep Reinforcement Learning. pdf ⭐️
Learning Continuous Control Policies by Stochastic Value Gradients. pdf ⭐️
Multiagent Cooperation and Competition with Deep Reinforcement Learning. arxiv
Maximum Entropy Deep Inverse Reinforcement Learning. arxiv
Massively Parallel Methods for Deep Reinforcement Learning. pdf ⭐️
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models. arxiv
Playing Atari with Deep Reinforcement Learning. arxiv
Recurrent Reinforcement Learning: A Hybrid Approach. arxiv
Strategic Dialogue Management via Deep Reinforcement Learning. arxiv
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control. arxiv
Trust Region Policy Optimization. pdf ⭐️
Universal Value Function Approximators. pdf
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning. arxiv

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning. [url]

Evolving large-scale neural networks for vision-based reinforcement learning. [idsia] ⭐️
Playing Atari with Deep Reinforcement Learning. [toronto] ⭐️

More About

These documents are updated in sync with my personal blog and knowledge column:

CSDN-Blog: A Guide Resource for Deep Reinforcement Learning
ZhiHu-Blog: A Guide Resource for Deep Reinforcement Learning
WeChat (add account “NeuronDance”, with the remark “Name-University/Company”)

Cite

This comprehensive summary of deep reinforcement learning materials draws on the resources listed below, and we would like to express our heartfelt thanks to their authors.

[1]./brianspiering/awesome-deep-rl

[2]./jgvictores/awesome-deep-reinforcement-learning

[3]./PaddlePaddle/PARL/blob/develop/papers/archive.md#distributed-training

[4]./LantaoYu/MARL-Papers

[5]./gopala-kr/DRL-Agents

[6]./junhyukoh/deep-reinforcement-learning-papers

[7]./ai/metrics#Source-Code

[8].https://agi.university/the-landscape-of-deep-reinforcement-learning

[9]./tigerneil/awesome-deep-rl

[10]./0830-berkeley_deep_rl_bootcamp/

[11]./awesome-rl/
