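This post lists recently released papers on deep reinforcement learning (DRL). The papers are organized into manually defined categories and sorted chronologically, with the newest papers listed first. I hope the list is useful, and you are welcome to submit papers you have read.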


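• Papers on value functions

• Papers on policies

• Papers on discrete control

• Papers on continuous control

• Papers on text processing

• Papers on computer vision

• Papers on robotics

• Papers on games

• Papers on Monte Carlo tree search

• Papers on inverse reinforcement learning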

• Papers on improving exploration

• Papers on multi-task and transfer learning

• Papers on multi-agent systems

• Papers on hierarchical learning


• Model-Free Episodic Control, C. Blundell et al., arXiv.
• Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv.
• Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv.
• Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv.
• Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML.
• Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop.
• Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.
• Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML.
• Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv.
• Value Iteration Networks, A. Tamar et al., arXiv.
• Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI.
• How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop.
• Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv.
• Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop.
• Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv.
• Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv.
• Prioritized Experience Replay, T. Schaul et al., ICLR.
• Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv.
• Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR.
• Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv.
• Generating Text with Deep Reinforcement Learning, H. Guo, arXiv.
• Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv.
• Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv.
• Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR.
• Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP.
• Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS.
• Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv.
• Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv.
• Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop.
• Human-level control through deep reinforcement learning, V. Mnih et al., Nature.
• Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop.


• Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv.
• Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML.
• Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv.
• Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop.
• MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv.
• ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv.
• Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR.
• Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS.
• High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR.
• End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv.
• Deterministic Policy Gradient Algorithms, D. Silver et al., ICML.
• Trust Region Policy Optimization, J. Schulman et al., ICML.


• Model-Free Episodic Control, C. Blundell et al., arXiv.
• Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv.
• Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv.
• Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv.
• Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML.
• Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop.
• Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.
• Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv.
• Value Iteration Networks, A. Tamar et al., arXiv.
• Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI.
• How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop.
• Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv.
• Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop.
• Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv.
• Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv.
• Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR.
• Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR.
• Policy Distillation, A. A. Rusu et al., ICLR.
• Prioritized Experience Replay, T. Schaul et al., ICLR.
• Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv.
• Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR.
• Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv.
• Generating Text with Deep Reinforcement Learning, H. Guo, arXiv.
• ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv.
• Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv.
• Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv.
• Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv.
• Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP.
• Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv.
• Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS.
• Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv.
• Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv.
• Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv.
• Universal Value Function Approximators, T. Schaul et al., ICML.
• Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop.
• Human-level control through deep reinforcement learning, V. Mnih et al., Nature.
• Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS.
• Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop.


• Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv.
• Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML.
• Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv.
• Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML.
• Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop.
• Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv.
• Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR.
• Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS.
• Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv.
• High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR.
• End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv.
• DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS.
• Deterministic Policy Gradient Algorithms, D. Silver et al., ICML.
• Trust Region Policy Optimization, J. Schulman et al., ICML.


• Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop.
• MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv.
• Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv.
• Generating Text with Deep Reinforcement Learning, H. Guo, arXiv.
• Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP.
• Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv.


• Model-Free Episodic Control, C. Blundell et al., arXiv.
• Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv.
• Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv.
• Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML.
• Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop.
• Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.
• Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv.
• Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv.
• Value Iteration Networks, A. Tamar et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI.
• Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop.
• How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop.
• Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv.
• Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv.
• Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR.
• Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR.
• Policy Distillation, A. A. Rusu et al., ICLR.
• Prioritized Experience Replay, T. Schaul et al., ICLR.
• Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR.
• Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv.
• Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv.
• Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv.
• Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR.
• Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv.
• Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS.
• Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS.
• Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv.
• Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv.
• High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR.
• End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv.
• Universal Value Function Approximators, T. Schaul et al., ICML.
• Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop.
• Trust Region Policy Optimization, J. Schulman et al., ICML.
• Human-level control through deep reinforcement learning, V. Mnih et al., Nature.
• Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS.
• Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop.


• Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv.
• Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML.
• Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv.
• Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML.
• Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop.
• Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv.
• Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS.
• Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv.
• High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR.
• End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv.
• DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS.
• Trust Region Policy Optimization, J. Schulman et al., ICML.


• Model-Free Episodic Control, C. Blundell et al., arXiv.
• Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv.
• Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv.
• Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv.
• Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML.
• Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop.
• Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.
• Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv.
• Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv.
• Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv.
• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI.
• How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop.
• Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv.
• MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv.
• Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv.
• Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR.
• Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR.
• Policy Distillation, A. A. Rusu et al., ICLR.
• Prioritized Experience Replay, T. Schaul et al., ICLR.
• Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv.
• Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR.
• Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv.
• Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv.
• Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR.
• Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP.
• Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv.
• Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS.
• Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv.
• Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv.
• Universal Value Function Approximators, T. Schaul et al., ICML.
• Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop.
• Trust Region Policy Optimization, J. Schulman et al., ICML.
• Human-level control through deep reinforcement learning, V. Mnih et al., Nature.
• Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS.
• Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop.


• Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature.
• Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR.
• Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS.


• Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv.
• Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv.


• Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR.
• Policy Distillation, A. A. Rusu et al., ICLR.
• ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv.
• Universal Value Function Approximators, T. Schaul et al., ICML.


• Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv.
• Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.
• Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv.
• Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS.
• Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv.


• Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv.
• Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv.


• Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv.
• Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv.
• Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv.


