A Python-based reinforcement learning project that trains a Mars rover to navigate a grid environment, collect samples, and avoid obstacles using Deep Q-Learning (DQN).
- Grid-based Mars environment simulation
- Deep Q-Network (DQN) implementation for rover control
- Support for both CPU and GPU training
- Automated results logging and visualization
- Real-time training progress tracking
- Console-based environment rendering option
```
mars_rover/
├── src/
│   ├── rover.py              # Core DQN agent and environment implementation
│   └── results_handler.py    # Training results and metrics management
├── main.py                   # Main training script with CLI interface
├── requirements.txt          # Project dependencies
└── README.md                 # Project documentation
```
- Python 3.8+
- TensorFlow 2.x
- NumPy
- Matplotlib
- GPU support (optional, for faster training)
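For reference, a minimal `requirements.txt` matching the list above might look like this; the version pins are an assumption, and the repository's own file takes precedence:

```text
tensorflow>=2.0    # DQN model and training
numpy              # grid state and experience arrays
matplotlib         # training visualization plots
```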
- Clone the repository:
  ```bash
  git clone https://github.com/prateekshukla1108/mars-rover.git
  cd mars-rover
  ```
- Create and activate a virtual environment (recommended):
  ```bash
  python -m venv venv
  source venv/bin/activate   # On Windows: venv\Scripts\activate
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
Run the training script with default parameters:
```bash
python main.py
```
The training script supports several command-line arguments:
- `--episodes`: Number of training episodes (default: 100)
- `--grid-size`: Size of the environment grid (default: 10)
- `--max-steps`: Maximum steps per episode (default: 500)
- `--render`: Enable console-based visualization
- `--results-dir`: Directory to save results (default: 'results')
Example with custom parameters:
```bash
python main.py --episodes 200 --grid-size 15 --render --results-dir custom_results
```
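Internally, flags like these are typically wired up with `argparse`. The following is a minimal sketch assuming the documented names and defaults; it is not a verbatim copy of `main.py`, and the `parse_args` helper shown here is illustrative:

```python
import argparse

def parse_args():
    # CLI flags and defaults taken from the options documented above.
    parser = argparse.ArgumentParser(
        description="Train a DQN rover on the Mars grid environment")
    parser.add_argument("--episodes", type=int, default=100,
                        help="Number of training episodes")
    parser.add_argument("--grid-size", type=int, default=10,
                        help="Size of the environment grid")
    parser.add_argument("--max-steps", type=int, default=500,
                        help="Maximum steps per episode")
    parser.add_argument("--render", action="store_true",
                        help="Enable console-based visualization")
    parser.add_argument("--results-dir", type=str, default="results",
                        help="Directory to save results")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args.episodes, args.grid_size, args.max_steps, args.render, args.results_dir)
```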
The Mars environment is represented as a grid where:
- Empty spaces are marked as '·'
- Obstacles (rocks) are marked as '▲'
- Sample collection points are marked as '◆'
- The rover position is marked as '█'
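A console renderer for these symbols can be as small as the sketch below; the integer cell codes are an assumption for illustration, not the project's actual internal encoding in `src/rover.py`:

```python
import numpy as np

# Hypothetical cell codes; the real environment may encode the grid differently.
EMPTY, ROCK, SAMPLE, ROVER = 0, 1, 2, 3
SYMBOLS = {EMPTY: "·", ROCK: "▲", SAMPLE: "◆", ROVER: "█"}

def render(grid: np.ndarray) -> None:
    """Print the grid row by row using the symbols listed above."""
    for row in grid:
        print(" ".join(SYMBOLS[int(cell)] for cell in row))

# Example: a 5x5 grid with the rover at the top-left corner
grid = np.zeros((5, 5), dtype=int)
grid[0, 0] = ROVER
grid[2, 3] = ROCK
grid[4, 4] = SAMPLE
render(grid)
```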
- Collecting a sample: +20 points
- Collecting all samples: +50 bonus points
- Hitting an obstacle: -10 points
- Hitting boundary: -5 points
- Each move: -1 point (encourages efficient paths)
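This scheme maps directly onto a small scoring function. A minimal sketch, assuming the environment reports the outcome of each move through the hypothetical flags below (the real code in `src/rover.py` may structure this differently):

```python
def step_reward(collected_sample: bool, collected_all: bool,
                hit_obstacle: bool, hit_boundary: bool) -> int:
    """Compute the reward for one move, following the point values listed above."""
    reward = -1  # every move costs a point, encouraging efficient paths
    if collected_sample:
        reward += 20
    if collected_all:
        reward += 50   # one-time bonus when the last sample is picked up
    if hit_obstacle:
        reward -= 10
    if hit_boundary:
        reward -= 5
    return reward
```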
The results handler automatically saves:
- Episode-by-episode metrics in CSV format
- Training configuration and device information in JSON
- Training visualization plots
- Final summary statistics
Results are saved in timestamped directories under the specified results directory.
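In rough terms, a handler with this behaviour could be sketched as follows; the class and method names are illustrative, not necessarily those used in `src/results_handler.py`:

```python
import csv
import json
import os
from datetime import datetime

class ResultsHandler:
    """Illustrative sketch: writes per-episode metrics (CSV) and the run
    configuration (JSON) into a timestamped directory under results_dir."""

    def __init__(self, results_dir="results"):
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        self.run_dir = os.path.join(results_dir, stamp)
        os.makedirs(self.run_dir, exist_ok=True)

    def save_config(self, config):
        with open(os.path.join(self.run_dir, "config.json"), "w") as f:
            json.dump(config, f, indent=2)

    def save_metrics(self, episodes):
        # `episodes` is a list of dicts, e.g. {"episode": 1, "reward": 42, "steps": 180}
        if not episodes:
            return
        with open(os.path.join(self.run_dir, "metrics.csv"), "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(episodes[0].keys()))
            writer.writeheader()
            writer.writerows(episodes)
```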
The DQN implementation features:
- Experience replay buffer
- Epsilon-greedy exploration
- Automatic GPU detection and utilization
- Mixed precision training when GPU is available
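The experience replay buffer and epsilon-greedy exploration listed above typically look roughly like the following sketch; the buffer size, epsilon schedule, and attribute names are assumptions rather than the exact values in `src/rover.py`:

```python
import random
from collections import deque
import numpy as np

class ReplayAndExploration:
    """Minimal sketch of an experience replay buffer plus epsilon-greedy action selection."""

    def __init__(self, n_actions, buffer_size=10_000, epsilon=1.0,
                 epsilon_min=0.05, epsilon_decay=0.995):
        self.n_actions = n_actions
        self.memory = deque(maxlen=buffer_size)   # experience replay buffer
        self.epsilon = epsilon                    # exploration rate
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay

    def remember(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def act(self, state, q_network):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        q_values = q_network.predict(state[np.newaxis, ...], verbose=0)
        return int(np.argmax(q_values[0]))

    def sample_batch(self, batch_size=64):
        # Uniform sampling from the replay buffer for a training step.
        return random.sample(self.memory, min(batch_size, len(self.memory)))

    def decay_epsilon(self):
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)
```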
Performance optimizations include (see the GPU configuration sketch below):
- Automatic GPU detection and configuration
- Memory growth management for GPU training
- Optimized batch processing
- Efficient state representation
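With TensorFlow 2.x, GPU detection, memory growth, and mixed precision can be wired up roughly as below; this is a sketch built from standard TensorFlow calls, not necessarily the exact setup used in this project:

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Allocate GPU memory on demand instead of reserving it all at startup.
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    # Mixed precision speeds up training on GPUs that support float16 math.
    tf.keras.mixed_precision.set_global_policy("mixed_float16")
    print(f"Training on {len(gpus)} GPU(s) with mixed precision")
else:
    print("No GPU found; training on CPU")
```

Enabling memory growth keeps TensorFlow from claiming the whole GPU up front, which helps when the GPU is shared with other processes.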
To extend or modify the project:
- Environment modifications can be made in the `MarsEnvironment` class in `src/rover.py`
- Agent modifications can be made in the `DQNAgent` class in `src/rover.py`
- Results handling can be customized in `src/results_handler.py`