Reinforcement Learning
Syllabus
Reinforcement Learning. Reinforcement Learning Algorithms. Implementation of autonomous agents using reinforcement learning.
Objectives
By the end of the course, students will be able to:
- Build a reinforcement learning-based system for sequential decision-making.
- Understand how to formalize a task as a reinforcement learning problem, how to implement a solution, and how to evaluate it.
- Understand the types of reinforcement learning algorithms: value-based, policy gradient, and actor-critic.
- Understand the relationship between reinforcement learning and supervised and unsupervised learning.
Course Content
- Introduction to Reinforcement Learning.
- Implementation of autonomous agents using reinforcement learning.
- Taxonomy of reinforcement learning algorithms.
- Q-Learning Algorithm.
- Sarsa Algorithm.
- Deep Reinforcement Learning.
- Deep Q-Learning algorithms.
- REINFORCE: a policy gradient algorithm.
- Actor-Critic algorithms.
- Implementation of autonomous agents using projects such as Farama's Gymnasium and Kaggle's reinforcement learning library.
- Examples of solutions using reinforcement learning.
Required Bibliography
- SUTTON, R.; BARTO, A. Reinforcement Learning: An Introduction. Second Edition. The MIT Press, 2018.
- GÉRON, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. 2nd ed. O'Reilly, 2019.
- Van Hasselt, H., Guez, A. and Silver, D., 2016, March. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A. and Klimov, O., 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
- Brockman, G. et al., 2016. OpenAI Gym. arXiv preprint arXiv:1606.01540.
Supplementary Bibliography
- Laura Graesser and Wah Loon Keng. 2019. Foundations of Deep Reinforcement Learning: Theory and Practice in Python (1st ed.). Addison-Wesley Professional.
- RUSSELL, S.; NORVIG, P. Artificial Intelligence: A Modern Approach. 3rd ed. Prentice Hall, 2009.
- SILVER, D.; SINGH, S.; PRECUP, D.; SUTTON, R. Reward is enough. Artificial Intelligence, vol. 299, 2021.
- MuZero: Mastering Go, chess, shogi and Atari without rules. DeepMind blog post, December 2020.
- SILVER, D.; HUBERT, T.; SCHRITTWIESER, J.; ANTONOGLOU, I.; LAI, M.; GUEZ, A. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).
- Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D. and Riedmiller, M., 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
- Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. 2018. Deep reinforcement learning that matters. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence (AAAI'18/IAAI'18/EAAI'18). AAAI Press, Article 392, 3207–3214.
- Dohare, S., Hernandez-Garcia, J.F., Lan, Q. et al. Loss of plasticity in deep continual learning. Nature 632, 768–774 (2024). https://doi.org/10.1038/s41586-024-07711-7