# Reinforcement learning techniques
Implemented value iteration, policy iteration, and Q-learning on the GridWorld environment.
- Getting Started
- Information about main code files
- Quick Start GridWorld
- Policy Iteration
- Value Iteration
- Q-Learning
- Resources
## Getting Started
- Clone the repo:
  $ git clone https://github.com/kumar-sanjeeev/reinforcement-learning.git
- Install the dependencies listed in the requirements.txt file:
  $ pip install -r requirements.txt
## Information about main code files
- agent.py, RandomAgent.py: general class used as a template for agents, plus a completed RandomAgent
- ValueIterationAgent.py: value iteration agent
- PolicyIterationAgent.py: policy iteration agent
- QLearningAgent.py: Q-learning agent
- output_xxx_selftest.txt: output of specific runs to check your implementation
- solution_xxx.py: results obtained
- mdp.py: abstract class for general MDPs
- environment.py: abstract class for general reinforcement learning environments
- gridworld.py: gridworld main code and test harness
- gridworldclass.py: implementation of gridworld internals
- utils.py: some utility code
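The actual interface is defined in agent.py; purely as orientation, a template of this kind often looks like the sketch below. The class and method names here are illustrative assumptions, not the repository's actual signatures.

```python
import random


class Agent:
    """Illustrative agent template (hypothetical; see agent.py for the real interface)."""

    def get_action(self, state):
        """Return the action to take in the given state."""
        raise NotImplementedError

    def observe_transition(self, state, action, next_state, reward):
        """Optionally learn from a single (s, a, s', r) transition."""
        pass


class RandomAgent(Agent):
    """Completed example agent: picks uniformly among the legal actions."""

    def __init__(self, legal_actions_fn):
        # legal_actions_fn(state) -> list of actions available in that state
        self.legal_actions_fn = legal_actions_fn

    def get_action(self, state):
        return random.choice(self.legal_actions_fn(state))
```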
## Quick Start GridWorld
To get started, run the gridworld in interactive mode:
- Move the agent with the arrow keys:
  python3 gridworld.py -m
- Control the different aspects of the simulation. A full list of options is available by running the command below (an example combining several options follows the list):
  python3 gridworld.py -h
Options:
-h, --help show this help message and exit
-d DISCOUNT, --discount=DISCOUNT
Discount on future (default 0.9)
-r R, --livingReward=R
Reward for living for a time step (default 0.0)
-n P, --noise=P How often action results in unintended direction
(default 0.2)
-e E, --epsilon=E Chance of taking a random action in q-learning
(default 0.3)
-l P, --learningRate=P
TD learning rate (default 0.5)
-i K, --iterations=K Number of rounds of policy evaluation or value
iteration (default 10)
-k K, --episodes=K    Number of episodes of the MDP to run (default 0)
-g G, --grid=G Grid to use (case sensitive; options are BookGrid,
BridgeGrid, CliffGrid, MazeGrid, CustomGrid, default
BookGrid)
-w X, --windowSize=X Request a window width of X pixels *per grid cell*
(default 150)
-a A, --agent=A Agent type (options are 'random', 'value' ,
'policyiter' and 'q', default random)
-t, --text Use text-only ASCII display
-p, --pause Pause GUI after each time step when running the MDP
-q, --quiet Skip display of any learning episodes
-s S, --speed=S Speed of animation, S > 1.0 is faster, 0.0 < S < 1.0
is slower (default 1.0)
-m, --manual Manually control agent (for lecture)
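Several options can be combined in a single run. For example, the following invocation (an illustrative combination of the flags listed above) runs the value iteration agent on BridgeGrid with a discount of 0.7 and 100 iterations:

    python3 gridworld.py -a value -g BridgeGrid -d 0.7 -i 100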
## Policy Iteration
Run the policy iteration agent with the default parameters:

    python3 gridworld.py -a policyiter

State values (figure)
Q-values (figure)
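For reference, policy iteration alternates (approximate) policy evaluation with greedy policy improvement until the policy stops changing. The sketch below is a minimal, self-contained illustration on a generic finite MDP; the dictionary-based transitions and the reward callable are assumptions made for the example, not the interface of PolicyIterationAgent.py or mdp.py.

```python
def policy_iteration(states, actions, transitions, reward, gamma=0.9, eval_sweeps=50):
    """Policy iteration on a finite MDP.

    transitions[(s, a)] -> list of (next_state, probability) pairs
    reward(s, a, s2)    -> immediate reward for the transition
    """
    policy = {s: actions[0] for s in states}
    values = {s: 0.0 for s in states}

    while True:
        # Policy evaluation: repeatedly apply the Bellman expectation backup.
        for _ in range(eval_sweeps):
            for s in states:
                a = policy[s]
                values[s] = sum(p * (reward(s, a, s2) + gamma * values[s2])
                                for s2, p in transitions[(s, a)])

        # Policy improvement: act greedily with respect to the current values.
        stable = True
        for s in states:
            q = {a: sum(p * (reward(s, a, s2) + gamma * values[s2])
                        for s2, p in transitions[(s, a)])
                 for a in actions}
            best = max(q, key=q.get)
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            return policy, values
```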
## Value Iteration
Run the value iteration agent on the MazeGrid environment:

    python3 gridworld.py -a value -g MazeGrid

State values (figure)
Q-values (figure)
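Value iteration instead applies the Bellman optimality backup to every state for a fixed number of rounds (the -i option above, default 10). A minimal sketch, using the same assumed MDP representation as in the policy iteration example:

```python
def value_iteration(states, actions, transitions, reward, gamma=0.9, iterations=10):
    """Value iteration: repeat the Bellman optimality backup for a fixed number of rounds."""
    values = {s: 0.0 for s in states}
    for _ in range(iterations):
        # Each round rebuilds the value table from the previous one.
        values = {
            s: max(sum(p * (reward(s, a, s2) + gamma * values[s2])
                       for s2, p in transitions[(s, a)])
                   for a in actions)
            for s in states
        }
    return values
```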
## Q-Learning
Run the Q-learning agent on the MazeGrid environment for 100 episodes, skipping the episode display:

    python3 gridworld.py -a q -g MazeGrid -k 100 -q

State values (figure)
Q-values (figure)
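Unlike the two planners above, Q-learning learns from sampled transitions rather than from the MDP model, exploring with probability epsilon (-e, default 0.3) and updating with the TD learning rate (-l, default 0.5). Below is a minimal tabular sketch; the dictionary Q-table and function names are illustrative, not QLearningAgent.py's actual code.

```python
import random
from collections import defaultdict

# Q-table: maps (state, action) pairs to value estimates, 0.0 by default.
Q = defaultdict(float)


def epsilon_greedy(state, actions, epsilon=0.3):
    """With probability epsilon pick a random action, otherwise act greedily w.r.t. Q."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def q_update(state, action, reward, next_state, next_actions, alpha=0.5, gamma=0.9):
    """Tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```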
## Resources
This problem was given as a homework assignment in the Reinforcement Learning and Learning-Based Control course at my university.