An educational interface for building intuition about reinforcement learning fundamentals. The backend is built on environments from the Gymnasium library.
Current Implementation:
- Environments
  - Gymnasium FrozenLake-v1 (4x4) with is_slippery=True
  - Gymnasium FrozenLake-v1 (4x4) with is_slippery=False
- Algorithms
  - Q-learning (custom implementation)
Already have git and Docker installed? Get started in 3 commands:
Prerequisites: Docker Desktop running or Docker Engine
- Clone the repository:
  git clone https://github.com/aihpi/workshop-rl1-introduction.git
- Navigate inside:
  cd workshop-rl1-introduction
- Is Docker running? Then start the app (detached mode):
  docker compose up -d
- Open your browser at http://localhost:3030
First-time setup takes ~1-2 minutes (downloads pre-built images).
Note: Running in detached mode (-d) keeps your terminal free. To view logs if needed for debugging, open a separate terminal and run docker compose logs -f
New to programming or Docker? Follow the installation guides:
- Windows 10/11: docs/INSTALLATION_WINDOWS.md (~10-15 minutes)
- macOS 10.15+: docs/INSTALLATION_MACOS.md (~10-15 minutes)
- Ubuntu/Debian: docs/INSTALLATION_LINUX.md (~15-20 minutes)
Once installed, here are some helpful commands:
docker compose up -d # Start the application (detached mode)
docker compose down # Stop the application
docker compose logs -f # View live logs (for debugging, in separate terminal)
docker compose logs backend # View only backend logs
docker compose logs frontend # View only frontend logs
docker compose ps # Check container status
docker compose restart          # Restart services
- Open the application in your browser at http://localhost:3030
- Adjust parameters using the sliders:
  - Number of Episodes: Training duration
  - Exploration Rate (ε): Probability of random exploration
  - Learning Rate (α): How fast the agent learns
  - Discount Factor (γ): Importance of future rewards
- Start training: Click "Start Training" and watch real-time visualizations:
  - Environment viewer: Shows the agent's last position in a training episode
  - Reward chart: Tracks training progress with statistics
  - Q-table heatmap: Visualizes learned action values (4×4 grid)
- Play policy: After training completes, click "Play Policy" to watch the trained agent execute its learned behavior step-by-step.
Backend (8 tests, 41% coverage):
# Locally
cd backend && uv run pytest
# In Docker
docker compose exec backend pytest

Frontend (12 tests):
cd frontend && npm test

workshop-rl1-introduction/
├── backend/ # Python Flask backend
│ ├── algorithms/ # RL algorithm implementations
│ │ ├── base_algorithm.py # Abstract base class
│ │ └── q_learning.py # Q-Learning implementation
│ ├── environments/ # Gymnasium environment handling
│ ├── training/ # Session management
│ ├── tests/ # Backend test suite
│ └── app.py # Flask API server
├── frontend/ # React frontend
│ ├── src/
│ │ ├── components/ # React components
│ │ │ ├── ParameterPanel.jsx
│ │ │ ├── EnvironmentViewer.jsx
│ │ │ ├── RewardChart.jsx
│ │ │ ├── LearningVisualization.jsx
│ │ │ └── ControlButtons.jsx
│ │ ├── App.js # Main application
│ │ └── api.js # Backend communication
│ └── src/components/__tests__/ # Frontend test suite
├── docs/
│ ├── DEVELOPMENT.md # Local development setup (without Docker)
│ ├── INSTALLATION_LINUX.md # Linux installation guide
│ ├── INSTALLATION_MACOS.md # macOS installation guide
│ ├── INSTALLATION_WINDOWS.md # Windows installation guide
│ └── screenshots/ # Documentation screenshots
└── docker-compose.yml # Multi-container orchestration
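As orientation for how the pieces talk to each other: app.py exposes the training functionality over HTTP, and the frontend's api.js calls it. A hypothetical, minimal Flask endpoint in that style (the real routes, names, and payloads in app.py will differ):

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical endpoint for illustration only;
# the actual routes live in backend/app.py
@app.route("/api/status")
def status():
    # A real handler would report the live training session's state
    return jsonify({"training": False, "episode": 0})
```

During development the frontend and backend run as separate containers, so api.js addresses the backend by its exposed host port rather than importing it directly.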
MIT License - Free to use for educational purposes
