References and tools
This page lists the main references and tools used in this course.
Books
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd ed.). The MIT Press.
- Albrecht, S. V., Christianos, F., & Schäfer, L. (2024). Multi-Agent Reinforcement Learning: Foundations and Modern Approaches. The MIT Press.
- Mitchell, T. (1997). Reinforcement Learning. In Machine Learning. McGraw-Hill.
- Russell, S., & Norvig, P. (2013). Inteligência Artificial (3rd ed.). Campus Elsevier.
Articles
- Watkins, C.J.C.H., Dayan, P. Q-learning. Machine Learning 8, 279–292 (1992). https://doi.org/10.1007/BF00992698
- Williams, R.J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229–256 (1992). https://doi.org/10.1007/BF00992696
- Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602. 2013 Dec 19.
- Mnih, V., Kavukcuoglu, K., Silver, D. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
- Schulman J, Levine S, Abbeel P, Jordan M, Moritz P. Trust region policy optimization. In International Conference on Machine Learning 2015 Jun 1 (pp. 1889-1897). PMLR.
- van Hasselt, H., Guez, A. and Silver, D. 2016. Deep Reinforcement Learning with Double Q-Learning. Proceedings of the AAAI Conference on Artificial Intelligence. 30, 1 (Mar. 2016). https://doi.org/10.1609/aaai.v30i1.10295
- Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347. 2017 Jul 20.
- Silver D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140-1144 (2018). https://doi.org/10.1126/science.aar6404
- Silver D., Singh S., Precup D., Sutton R. Reward is enough. Artificial Intelligence, Volume 299, 2021. https://doi.org/10.1016/j.artint.2021.103535
- M. Mehdi Afsar, Trafford Crump, and Behrouz Far. 2022. Reinforcement Learning based Recommender Systems: A Survey. ACM Comput. Surv. 55, 7, Article 145 (July 2023), 38 pages. https://doi.org/10.1145/3543846
- Shuo Sun, Rundong Wang, and Bo An. 2023. Reinforcement Learning for Quantitative Trading. ACM Trans. Intell. Syst. Technol. 14, 3, Article 44 (June 2023), 29 pages. https://doi.org/10.1145/3582560
- Andrej Karpathy. Deep Reinforcement Learning: Pong from Pixels. Available at http://karpathy.github.io/2016/05/31/rl/. Last accessed on 30 April 2023.
- Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research, 2021. http://jmlr.org/papers/v22/20-1364.html
Implementations and tutorials
- Deep Reinforcement Learning from OpenAI. Available at https://spinningup.openai.com/en/latest/index.html. Last accessed on 30 April 2023.
- Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W. OpenAI Gym. arXiv preprint arXiv:1606.01540. 2016 Jun 5.
- The 37 Implementation Details of Proximal Policy Optimization. Available at https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/. Last accessed in May 2023.
- Understanding Proximal Policy Optimization (Schulman et al., 2017). Available at https://blog.tylertaewook.com/post/proximal-policy-optimization. Last accessed in May 2023.
- Simonini, T. Proximal Policy Optimization (PPO). Unit 8 of the Deep Reinforcement Learning Class with Hugging Face. Available at https://huggingface.co/blog/deep-rl-ppo. Last accessed in May 2023.
- Deep Reinforcement Learning Class with Hugging Face. Available at https://huggingface.co/learn/deep-rl-course/unit0/introduction. Last accessed in February 2024.
Tools
- The Farama Foundation: this group is responsible for maintaining the Gymnasium and PettingZoo projects.
- Kaggle Environments Project.
- How to use the Gymnasium API: Gymnasium is a Python library for single-agent reinforcement learning.
- SuperSuit: wrappers for RL environments.
- CleanRL: a site with several implementations of RL algorithms.
- Worldgen: Emergent tool use from multi-agent interaction.
- Highway envs.
- Tianshou is a reinforcement learning platform based on pure PyTorch.
- Unity Machine Learning Agents.
- Reinforcement learning for Recommendation Systems.
- FlatLand.
- Drone Swarm Search Environment.
- MARLlib environments.
- Multi-Robot Warehouse Environments.
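
Since several of the tools above follow the Gymnasium interface, the sketch below illustrates the agent-environment loop that this interface implies. It is only a minimal illustration: `CorridorEnv` and `run_episode` are hypothetical names invented here, and the toy environment merely mimics the `reset`/`step` signatures so the example runs without installing any library.

```python
import random


class CorridorEnv:
    """Hypothetical 1-D corridor that mimics the Gymnasium reset/step signatures."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # Gymnasium-style reset returns (observation, info).
        # The seed is accepted for interface compatibility; this toy env is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # Action 1 moves right, any other action moves left; position is clipped.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        terminated = self.position == self.length - 1  # goal reached
        truncated = (not terminated) and self.steps >= self.max_steps  # time limit
        reward = 1.0 if terminated else 0.0
        # Gymnasium-style step returns (observation, reward, terminated, truncated, info).
        return self.position, reward, terminated, truncated, {}


def run_episode(env, policy, seed=None):
    """Run one episode and return (total_reward, number_of_steps)."""
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    rng = random.Random(0)
    # Random policy: pick left (0) or right (1) uniformly at each step.
    print(run_episode(CorridorEnv(), lambda obs: rng.choice([0, 1]), seed=0))
```

With Gymnasium installed, the same `run_episode` loop should in principle work with a real environment, for example one created by `gymnasium.make("CartPole-v1")`, because it relies only on the `reset` and `step` return values.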