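Below is a list of papers worth reading in single-agent deep reinforcement learning, compiled and expanded from OpenAI Spinning Up, together with my reading summaries and reflections. The list is not exhaustive, but I recommend it to anyone interested in reinforcement learning.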
- Model-Free RL
    - Deep Q-Learning
        - DQN: Playing Atari with Deep Reinforcement Learning
        - DRQN: Deep Recurrent Q-Learning for Partially Observable MDPs
        - Dueling DQN: Dueling Network Architectures for Deep Reinforcement Learning
        - Double DQN: Deep Reinforcement Learning with Double Q-learning
        - PER: Prioritized Experience Replay
        - Rainbow DQN: Rainbow: Combining Improvements in Deep Reinforcement Learning
    - Policy Gradients
        - A3C: Asynchronous Methods for Deep Reinforcement Learning
        - TRPO: Trust Region Policy Optimization
        - GAE: High-Dimensional Continuous Control Using Generalized Advantage Estimation
        - PPO-Clip, PPO-Penalty: Proximal Policy Optimization Algorithms
        - PPO-Penalty: Emergence of Locomotion Behaviours in Rich Environments
        - ACKTR: Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
        - ACER: Sample Efficient Actor-Critic with Experience Replay
        - SAC: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
        - ERO: Experience Replay Optimization
    - Deterministic Policy Gradients
    - Distributional RL
    - Policy Gradients with Action-Dependent Baselines
    - Path-Consistency Learning
    - Other Directions for Combining Policy-Learning and Q-Learning
    - Evolutionary Algorithms
    - Deep Q-Learning
- Exploration
    - Count-Based Exploration
    - Curiosity-Based Exploration
    - Information Gain
- Transfer and Multitask RL
- Hierarchical RL
- Memory
- Model-Based RL
- Meta-RL
- Scaling RL
- Offline RL
    - Off-Policy Deep Reinforcement Learning Without Exploration
    - Benchmarking Batch Deep Reinforcement Learning Algorithms
    - Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
    - An Optimistic Perspective on Offline Reinforcement Learning
    - IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data
    - Behavior Regularized Offline Reinforcement Learning
- Representation Learning in RL
    - Contrastive Learning
    - Others
- Generalization in RL
- RL in the Real World
- Safety
- Imitation Learning and Inverse Reinforcement Learning
- Reproducibility, Analysis, and Critique
- Classic Papers in RL Theory or Review