FAQ

Version

1.0

Date

12-31-2021

Contributors

Steven Li, Xiao-Yang Liu

Description

This document contains the most frequently asked questions about the ElegantRL library, based on questions posted in the Slack channels and GitHub issues.

Outline

Section 1 Where to start?

  • Get started with ElegantRL-helloworld, a lightweight and stable subset of ElegantRL.

    • Read the introductory post on ElegantRL-helloworld.

    • Read the post to learn how an algorithm is implemented.

    • Read the posts (Part I, Part II) that walk through a demo of ElegantRL-helloworld on a stock trading task.

  • Read the post and the paper that describe our cloud solution, ElegantRL-Podracer.

  • Run the Colab-based notebooks on simple Gym environments (a minimal interaction sketch follows this list).

  • Install the library following the instructions in the official GitHub repo.

  • Run the demos, from MuJoCo to Isaac Gym, provided in the library folder.

  • Join the AI4Finance Slack.
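
The sketch below is a generic illustration of the standard Gym interaction loop that the Colab notebooks and demos build on; it is not ElegantRL's training API (the exact entry points are shown in the repo's demos and README). It assumes the classic gym package with the CartPole-v1 environment installed.

    import gym  # classic gym API; newer gymnasium releases change the reset()/step() signatures

    env = gym.make("CartPole-v1")
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = env.action_space.sample()  # a trained agent would choose the action here
        state, reward, done, info = env.step(action)  # classic 4-tuple step API
        total_reward += reward
    env.close()
    print(f"episode return: {total_reward}")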

Section 2 What to do when you experience problems?

  • If any questions arise, please follow this sequence of steps:

    • Check whether it has already been answered in this FAQ.

    • Check whether it has already been posted in the GitHub repo issues.

    • If you cannot find your question, please report it as a new issue or ask it on the AI4Finance Slack (our members will get back to you as soon as possible).

Section 4 References for diving deep into Deep Reinforcement Learning (DRL)

Subsection 4.1 Open-source software and materials

Subsection 4.2 DRL algorithms

  • David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354–359, 2017.

  • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A. Riedmiller. Playing Atari with deep reinforcement learning. ArXiv, abs/1312.5602, 2013.

  • Hado van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double Q-learning. ArXiv, abs/1509.06461, 2016.

  • Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. In ICLR, 2016.

  • John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. ArXiv, abs/1707.06347, 2017.

  • Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, and David Silver. Rainbow: Combining improvements in deep reinforcement learning. In AAAI, 2018.

  • Scott Fujimoto, Herke van Hoof, and David Meger. Addressing function approximation error in actor-critic methods. In International Conference on Machine Learning, pages 1587–1596. PMLR, 2018.

  • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In ICML, 2018.

  • Xinyue Chen, Che Wang, Zijian Zhou, and Keith W. Ross. Randomized ensembled double q-learning: Learning fast without a model. In International Conference on Learning Representations, 2021.

Subsection 4.3 Other resources

  • Richard S. Sutton and Andrew G. Barto. Reinforcement learning: An introduction. IEEE Transactions on Neural Networks, 16:285–286, 2005.

  • Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charlie Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, and David Silver. Massively parallel methods for deep reinforcement learning. ArXiv, abs/1507.04296, 2015.

  • Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, et al. Ray: A distributed framework for emerging ai applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 561–577, 2018.

  • Lasse Espeholt, Raphaël Marinier, Piotr Stanczyk, Ke Wang, and Marcin Michalski. SEED RL: Scalable and efficient deep-RL with accelerated central inference. In International Conference on Learning Representations, 2020.

  • Agrim Gupta, Silvio Savarese, Surya Ganguli, and Fei-Fei Li. Embodied intelligence via learning and evolution. Nature Communications, 2021.

  • Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, and Hado van Hasselt. Podracer architectures for scalable reinforcement learning. arXiv preprint arXiv:2104.06272, 2021.

  • Zechu Li, Xiao-Yang Liu, Jiahao Zheng, Zhaoran Wang, Anwar Walid, and Jian Guo. FinRL-podracer: High performance and scalable deep reinforcement learning for quantitative finance. ACM International Conference on AI in Finance (ICAIF), 2021.

  • Nikita Rudin, David Hoeller, Philipp Reist, and Marco Hutter. Learning to walk in minutes using massively parallel deep reinforcement learning. In Conference on Robot Learning, 2021.

  • Brijen Thananjeyan, Kirthevasan Kandasamy, Ion Stoica, Michael I. Jordan, Ken Goldberg, and Joseph Gonzalez. Resource allocation in multi-armed bandit exploration: Overcoming nonlinear scaling with adaptive parallelism. In ICML, 2021.

Section 5 Common issues/bugs

  • When running Isaac Gym, you may see the error ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory.

    Run the following command in bash to add the lib directory of the Isaac Gym conda environment to the dynamic linker search path (a small helper sketch at the end of this entry prints the path for the active environment):

    export LD_LIBRARY_PATH=<path to the conda environment>/lib

    For example, if the Isaac Gym conda environment is named rlgpu:

    export LD_LIBRARY_PATH=/xfs/home/podracer_steven/anaconda3/envs/rlgpu/lib
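
    If you are unsure where the conda environment lives, the short Python sketch below prints the lib directory to use in the export above. It is only an illustration and assumes a conda-based install, where conda activate sets the CONDA_PREFIX environment variable.

    import os

    # CONDA_PREFIX is set by `conda activate <env>`; its lib/ subdirectory holds libpython.
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        print(os.path.join(conda_prefix, "lib"))
    else:
        print("No active conda environment detected; activate the Isaac Gym environment first.")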