# Lunar Lander

This is an example of an Actor-Critic learning agent, part of the ASIM RL Tutorial.

It uses `gym` for the environment and `torch` as the basis for the Actor-Critic network.

### Action Space
 1) do nothing
 2) fire left engine
 3) fire bottom engine
 4) fire right engine

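
As a minimal sketch, the four actions can be given readable names. The `LanderAction` enum and the `describe` helper are hypothetical names introduced here for illustration; they are not part of `gym` or of the tutorial code. The sketch assumes the environment's 0-based discrete indices (0 to 3), whereas the list above counts from 1.

```python
from enum import IntEnum


class LanderAction(IntEnum):
    """Discrete actions, assuming 0-based indices (the README list is 1-based)."""
    DO_NOTHING = 0
    FIRE_LEFT_ENGINE = 1
    FIRE_BOTTOM_ENGINE = 2
    FIRE_RIGHT_ENGINE = 3


def describe(action_index: int) -> str:
    """Turn an action index (e.g. one sampled from a policy) into a readable label."""
    return LanderAction(action_index).name.lower().replace("_", " ")


print(describe(2))  # fire bottom engine
```

Because `IntEnum` members behave like plain integers, a value such as `LanderAction.FIRE_BOTTOM_ENGINE` could be passed wherever an integer action index is expected.
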
### Observation Space
 1) x
 2) y
 3) linear velocity in x
 4) linear velocity in y
 5) angle
 6) angular velocity
 7) ground contact leg 1
 8) ground contact leg 2

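
The eight observation values arrive as a flat vector, so naming the fields can make agent code easier to read. The sketch below assumes the vector follows the order listed above and that the two leg-contact entries are encoded as 0.0 or 1.0. `LanderObservation` and `unpack` are hypothetical names used only for this illustration.

```python
from typing import NamedTuple, Sequence


class LanderObservation(NamedTuple):
    """Named view of the 8-element observation vector (order as in the README list)."""
    x: float
    y: float
    vx: float
    vy: float
    angle: float
    angular_velocity: float
    leg1_contact: bool
    leg2_contact: bool


def unpack(obs: Sequence[float]) -> LanderObservation:
    """Convert a flat observation into named fields; contact flags become booleans."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    *continuous, leg1, leg2 = obs
    return LanderObservation(*(float(v) for v in continuous), bool(leg1), bool(leg2))


# Made-up observation values, for illustration only.
state = unpack([0.0, 1.4, 0.1, -0.5, 0.02, 0.0, 0.0, 1.0])
print(state.vy, state.leg2_contact)  # -0.5 True
```
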
### Rewards
 1) reaching the landing pad: between +100 and +140 points
 2) crash: -100 points
 3) landing (coming to rest): +100 points
 4) leg ground contact: +10 points per leg
 5) firing the bottom engine: -0.3 points per frame (side engines: -0.03 points per frame)

The environment counts as solved at 200 points.

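
To make the 200-point threshold concrete, the sketch below sums per-step rewards into an episode score and compares it with the threshold. It assumes the threshold applies to a single episode's total; averaging the score over many episodes is a common alternative. `SOLVED_SCORE`, `episode_score`, and `is_solved` are hypothetical names, and the reward sequence is made up from the values in the list above.

```python
from typing import Iterable

SOLVED_SCORE = 200.0  # threshold stated in the README


def episode_score(step_rewards: Iterable[float]) -> float:
    """Total reward collected over one episode."""
    return float(sum(step_rewards))


def is_solved(step_rewards: Iterable[float]) -> bool:
    """True if the episode total reaches the solved threshold (single-episode assumption)."""
    return episode_score(step_rewards) >= SOLVED_SCORE


# Illustrative episode: reach the pad (+120), both legs touch down (+10 each),
# come to rest (+100), after 50 frames of bottom-engine firing (-0.3 each).
good_landing = [120.0, 10.0, 10.0, 100.0] + [-0.3] * 50
crash = [-100.0] + [-0.3] * 10

print(is_solved(good_landing), is_solved(crash))  # True False
```
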
### Starting State
The lander starts at the top center of the viewport with a random initial force applied to its center of mass.