# Lunar Lander

This is an example of an Actor-Critic learning agent, part of the ASIM RL Tutorial.
It uses gym for the environment and torch as the basis for the actor-critic (A-C) network.

### Action Space
1) do nothing
2) fire left engine
3) fire bottom engine
4) fire right engine

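The four actions listed above can be given names in code, which keeps agent logic readable. The following is a minimal sketch, assuming the environment indexes its discrete actions from 0 in the order listed (as gym's `Discrete(4)` action space does). The names `Action`, `N_ACTIONS`, and `describe` are hypothetical and not part of the tutorial code.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the four discrete actions (assumed zero-based)."""
    NOOP = 0        # 1) do nothing
    FIRE_LEFT = 1   # 2) fire left engine
    FIRE_MAIN = 2   # 3) fire bottom engine
    FIRE_RIGHT = 3  # 4) fire right engine


# Size of the policy output layer an actor head would need.
N_ACTIONS = len(Action)


def describe(action_index: int) -> str:
    """Return a readable label for an action index; raises ValueError if out of range."""
    return Action(action_index).name.lower().replace("_", " ")
```

Because `IntEnum` members are plain integers, a value such as `Action.FIRE_MAIN` can be passed wherever an integer action index is expected.
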
### Observation Space
1) x position
2) y position
3) linear velocity in x
4) linear velocity in y
5) angle
6) angular velocity
7) ground contact, leg 1
8) ground contact, leg 2

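The eight observation components above arrive as a flat vector, so it can help to unpack them into named fields. This is a minimal sketch, assuming the vector follows the order listed and that the two leg-contact entries are encoded as 0.0 or 1.0. `LanderState` and `unpack_observation` are hypothetical helper names.

```python
from typing import NamedTuple, Sequence


class LanderState(NamedTuple):
    """Named view of the 8-element observation (field names are illustrative)."""
    x: float
    y: float
    vx: float
    vy: float
    angle: float
    angular_velocity: float
    leg1_contact: bool
    leg2_contact: bool


def unpack_observation(obs: Sequence[float]) -> LanderState:
    """Convert a flat observation into a LanderState, checking its length."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 observation values, got {len(obs)}")
    x, y, vx, vy, angle, ang_vel, leg1, leg2 = (float(v) for v in obs)
    # Assumption: contact flags are 0.0 / 1.0, so 0.5 is a safe cut-off.
    return LanderState(x, y, vx, vy, angle, ang_vel, leg1 > 0.5, leg2 > 0.5)
```

For example, `unpack_observation(obs).leg1_contact` reads more clearly than `obs[6] > 0.5` when logging or debugging an episode.
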
### Rewards
1) landing at the landing pad: +100 to +140 points
2) crash: -100 points
3) landing: +100 points
4) ground contact: +10 points per leg
5) firing the main engine: -0.3 points per frame

The environment counts as solved at 200 points.

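The 200-point threshold above can be tracked during training with a small helper. This is a sketch under one assumption: that "solved" is judged on the average return over a window of recent episodes. The window size of 100 is an assumed convention, not something the text specifies, and `SolvedTracker` is a hypothetical name.

```python
from collections import deque

SOLVED_THRESHOLD = 200.0  # from the text: solved at 200 points


class SolvedTracker:
    """Tracks recent episode returns and reports whether their mean reaches the threshold."""

    def __init__(self, window: int = 100):
        # Assumption: average over the last `window` episodes.
        self.returns = deque(maxlen=window)

    def add(self, episode_return: float) -> bool:
        """Record one episode's total reward and return the current solved status."""
        self.returns.append(float(episode_return))
        return self.is_solved()

    def is_solved(self) -> bool:
        """True only once the window is full and its mean is at or above the threshold."""
        if len(self.returns) < self.returns.maxlen:
            return False
        return sum(self.returns) / len(self.returns) >= SOLVED_THRESHOLD
```

Calling `tracker.add(total_reward)` at the end of each episode gives a simple stopping condition for a training loop.
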
### Starting State
The lander starts at the top center of the viewport with a random initial force applied to its center of mass.