
As a quickstart, we use the Pendulum task from demo.py to show how to train a DRL agent in ElegantRL.

Step 1: Import packages

from elegantrl_helloworld.demo import *

gym.logger.set_level(40)  # Suppress gym warnings (40 = ERROR level)

Step 2: Specify Agent and Environment

env = PendulumEnv('Pendulum-v0', target_return=-500)
args = Arguments(AgentSAC, env)

Step 3: Specify Hyper-parameters

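args.reward_scale = 2 ** -1  # Scale rewards by 0.5. RewardRange: -1800 < -200 < -50 < 0
args.gamma = 0.97  # Discount factor for future rewards
args.target_step = args.max_step * 2  # Steps to collect before each network update
args.eval_times = 2 ** 3  # Number of episodes averaged per evaluation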
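To get a feel for these values, the sketch below works through what gamma = 0.97 and reward_scale = 2 ** -1 imply numerically. It is plain Python and does not call ElegantRL; the helper names effective_horizon and discounted_return are hypothetical and exist only for this illustration.

```python
def effective_horizon(gamma: float) -> float:
    """Rough number of future steps that carry weight: 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)


def discounted_return(rewards, gamma: float) -> float:
    """Sum of gamma**t * r_t, accumulated from the last step backwards."""
    ret = 0.0
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret


gamma = 0.97
reward_scale = 2 ** -1

# With gamma = 0.97 the agent effectively looks about 33 steps ahead.
horizon = effective_horizon(gamma)

# The values from the RewardRange comment after scaling by reward_scale.
scaled_range = [r * reward_scale for r in (-1800, -200, -50, 0)]

# Tiny check of the discounting recursion with an easy gamma.
example = discounted_return([1.0, 1.0, 1.0], 0.5)
```

A smaller gamma shortens the horizon, which can suit short-horizon tasks such as Pendulum, while reward_scale only changes the magnitude of the rewards the agent learns from, not their ordering.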

Step 4: Train and Evaluate the Agent

train_and_evaluate(args)

Try it yourself in this Colab!


  • By default, it will train a stable-SAC agent in the Pendulum-v0 environment for 400 seconds.

  • It will select CPUs or GPUs automatically. Don’t worry, we never use .cuda().

  • It will save the log and model parameters file in './{Environment}_{Agent}_{GPU_ID}'.

  • It will print the total reward while training. (Maybe we should use TensorBoardX?)

  • The code is heavily commented. We believe these comments can answer some of your questions.
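The save location mentioned above follows the pattern './{Environment}_{Agent}_{GPU_ID}'. As a rough illustration of how such a directory name is assembled, here is a minimal sketch. The function build_cwd is a hypothetical helper, not ElegantRL's own code, and the example values passed to it are assumptions; the exact strings ElegantRL substitutes for the environment and agent names may differ.

```python
def build_cwd(env_name: str, agent_name: str, gpu_id: int) -> str:
    """Assemble a working-directory path from the three pattern fields."""
    return f"./{env_name}_{agent_name}_{gpu_id}"


# Example values are assumptions chosen to match this quickstart.
cwd = build_cwd("Pendulum-v0", "AgentSAC", 0)
print(cwd)
```

Keeping the environment, agent, and GPU ID in the directory name means runs with different settings write their logs and model files to separate folders.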