An intelligent drone navigation system using Q-Learning to autonomously locate targets in 3D environments
Features • Demo • Installation • Usage • Documentation

This project implements an advanced Q-Learning algorithm to train a virtual drone for autonomous navigation in customizable 3D environments. Originally developed as an innovative solution to a university assignment, it showcases the power of reinforcement learning in robotics applications.
For more details, please read the Project Report.
- Reinforcement Learning: Implements Q-Learning with customizable hyperparameters
- Real-time 3D Visualization: Interactive simulation with matplotlib and tkinter
- Dynamic Retraining: Adapt to new targets without restarting
- Optimized Trajectories: Intelligent path smoothing for efficient navigation
- Performance Monitoring: Track training progress and replay best episodes
- Room Simulation: Customizable 3D room environment for drone navigation
- Target Detection: Intelligent algorithm to locate a target in the simulated room
- Reinforcement Learning: Implements Q-Learning for trajectory optimization
- Visualization: Real-time 3D trajectory plotting for training and performance monitoring
- Dynamic Updates: Allows reconfiguration of the target's location with retraining capabilities
- Replay Mechanism: Replays the best navigation trajectory using generated commands

When you launch ChangingTarget.py, you'll be prompted to configure:
• Room dimensions (depth, width, height)
• Target position (x, y, z coordinates)
• Drone starting position
• Number of training episodes
• Maximum steps per episode

The Q-Learning algorithm trains the drone through multiple episodes:
• The drone explores the environment
• Learns from successful and unsuccessful attempts
• Updates its Q-table based on rewards
• A progress bar shows training advancement
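A minimal sketch of this episode loop is shown below, using illustrative names (run_training, env.reset, env.step, and q_table are assumptions for the example, not necessarily the exact API of FunctionsLib.py):

# Illustrative Q-Learning episode loop (helper names are assumptions, not the project's exact API)
import numpy as np
def run_training(env, q_table, num_episodes, max_steps,
                 alpha=0.05, gamma=0.995, epsilon=0.98,
                 epsilon_decay=0.92, epsilon_min=0.01):
    best_reward, best_actions = float("-inf"), []
    for _ in range(num_episodes):
        state = env.reset()                          # drone returns to its start position
        actions, total_reward = [], 0.0
        for _ in range(max_steps):
            if np.random.rand() < epsilon:           # explore: random action
                action = np.random.randint(len(q_table[state]))
            else:                                    # exploit: best known action
                action = int(np.argmax(q_table[state]))
            next_state, reward, done = env.step(action)
            # Q-table update based on the received reward
            q_table[state][action] += alpha * (
                reward + gamma * np.max(q_table[next_state]) - q_table[state][action]
            )
            actions.append(action)
            total_reward += reward
            state = next_state
            if done:                                 # target reached
                break
        epsilon = max(epsilon * epsilon_decay, epsilon_min)
        if total_reward > best_reward:               # keep the best episode for replay
            best_reward, best_actions = total_reward, actions
    return best_actions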
After training, the simulation automatically displays:
• The most efficient path found
• Smoothed trajectory commands
• Target detection confirmation
Without restarting the program:
• Close the simulation window
• Enter new target coordinates
• The drone starts from its last position
• Retraining adapts to the new target location
- Python 3.8 or higher
- pip package manager
- Virtual environment (recommended)
- Clone the repository
git clone https://github.com/Warukho/Reinforcement-Learning-Navigating-Drone.git
cd Reinforcement-Learning-Navigating-Drone
- Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
# Run the main application
python ChangingTarget.py
Follow the interactive prompts as shown in the demo section above.
Reinforcement_Learning_Navigating_Drone/
│
├── dronecore/ # Core drone mechanics
├── images/ # UI assets
├── README_Data/ # Documentation assets
│
├── ChangingTarget.py # Main application entry point
├── FunctionsLib.py # RL algorithms & utilities
├── dronecmds.py # Drone command interface
├── best_episode_commands.py # Replay functionality
│
├── viewermpl.py # Matplotlib visualizer
├── viewertk.py # Tkinter GUI interface
├── mplext.py # 3D plotting extensions
│
└── requirements.txt # Project dependencies
The drone learns optimal navigation strategies through:
- State Space: Discretized 3D coordinates (x, y, z)
- Action Space: 6 directions × variable distances
- Reward System:
  - +1000 for reaching target
  - Proportional rewards for reducing distance
  - Penalties for inefficient movements
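A minimal sketch of a reward of this shape (the constants and scaling here are illustrative, not necessarily the exact values used in FunctionsLib.py):

# Illustrative reward shaping (constants are examples, not the project's exact values)
import numpy as np
def compute_reward(position, target, previous_distance):
    distance = np.linalg.norm(np.array(target) - np.array(position))
    if distance < 1.0:                               # close enough: target reached
        return 1000, distance
    reward = (previous_distance - distance) * 10     # proportional reward for getting closer
    reward -= 1                                      # small per-step penalty for inefficiency
    return reward, distance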
# Q-table update formula
q_table[state][action] = old_value + α * (reward + γ * max(q_table[next_state]) - old_value)
Dynamic State Discretization
The state space automatically adapts to room dimensions:
state_bins = [
np.linspace(0, room_width, round(5 + (room_width ** 0.45))),
np.linspace(0, room_depth, round(5 + (room_depth ** 0.45))),
np.linspace(0, room_height, round(5 + (room_height ** 0.45)))
]
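A continuous drone position can then be mapped onto these bins, for example with np.digitize (an illustrative helper, not necessarily the one in FunctionsLib.py):

# Illustrative mapping from a continuous position to a discrete state
import numpy as np
def discretize(position, state_bins):
    # np.digitize returns the index of the bin each coordinate falls into
    return tuple(int(np.digitize(coord, bins)) for coord, bins in zip(position, state_bins))
# e.g. state = discretize((4.2, 1.7, 2.0), state_bins) for a 10 x 8 x 3 room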
Trajectory Smoothing Algorithm
Optimizes command sequences by:
- Aggregating movements by direction
- Canceling opposing movements
- Prioritizing larger movements
- Chunking commands to respect maximum distance constraints
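A simplified sketch of this kind of smoothing over signed per-axis moves (the project's actual command format may differ):

# Illustrative smoothing: aggregate per axis, cancel opposites, chunk to a maximum distance
from collections import defaultdict
def smooth_commands(moves, max_distance=100):
    # moves: list of (axis, signed_distance) tuples, e.g. [("x", 30), ("x", -10), ("z", 50)]
    totals = defaultdict(float)
    for axis, dist in moves:
        totals[axis] += dist                          # opposing moves cancel out here
    commands = []
    # larger net movements are emitted first
    for axis, dist in sorted(totals.items(), key=lambda kv: abs(kv[1]), reverse=True):
        remaining, sign = abs(dist), 1 if dist >= 0 else -1
        while remaining > 0:                          # respect the maximum distance per command
            step = min(remaining, max_distance)
            commands.append((axis, sign * step))
            remaining -= step
    return commands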
Adaptive Exploration Strategy
Balances exploration vs exploitation:
# Epsilon-greedy approach with decay
epsilon = max(epsilon * epsilon_decay, epsilon_min)
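With the default values in the table below (initial ε = 0.98, decay 0.92, floor 0.01), the exploration rate reaches its floor after roughly 55 episodes. A quick, illustrative way to inspect the schedule:

# Illustrative: inspect how the exploration rate decays across episodes
epsilon, epsilon_decay, epsilon_min = 0.98, 0.92, 0.01
for episode in range(1, 61):
    epsilon = max(epsilon * epsilon_decay, epsilon_min)
    if episode % 10 == 0:
        print(f"episode {episode:2d}: epsilon = {epsilon:.3f}")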
| Parameter | Default Value | Description |
|---|---|---|
| Learning Rate (α) | 0.05 | Controls how quickly the drone learns |
| Discount Factor (γ) | 0.995 | Importance of future rewards |
| Initial Exploration (ε) | 0.98 | Initial randomness in actions |
| Epsilon Decay | 0.92 | Rate of exploration reduction |
| Minimum Epsilon | 0.01 | Minimum exploration rate |
Modify hyperparameters in FunctionsLib.py:
# Training parameters
alpha = 0.05 # Learning rate
gamma = 0.995 # Discount factor
epsilon = 0.98 # Initial exploration rate
epsilon_decay = 0.92
epsilon_min = 0.01
from FunctionsLib import initialize_settings, training_loop, get_training_results, writing_commands
# Initialize environment
settings = initialize_settings()
# Run training
best_actions, best_trajectory = training_loop(
env_with_viewer,
num_episodes=100,
max_steps=500
)
# Generate replay commands
writing_commands(best_actions, settings["room_x"], settings["room_y"],
settings["room_height"], settings["drone_x"], settings["drone_y"],
settings["target_x"], settings["target_y"], settings["target_z"])
To replay the optimal trajectory after training:
python best_episode_commands.py
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
Axel Bouchaud--Roche
- Reinforcement Learning implementation
- Dynamic environment adaptation
- Q-Learning algorithm optimization
- Email: axelbouchaudroche@gmail.com
- GitHub: AxelBcr
Pierre Chauvet
- Core framework development
- Drone command interface
- 3D visualization system
- Email: pierre.chauvet@uco.fr
- GitHub: pechauvet
Léo Bugyan
- Co-development
- Report writing
- GitHub: zenk02