Reinforcement Learning Navigating Drone


An intelligent drone navigation system using Q-Learning to autonomously locate targets in 3D environments

Features • Demo • Installation • Usage • Documentation

Drone Navigation Demo

Overview

This project implements an advanced Q-Learning algorithm to train a virtual drone for autonomous navigation in customizable 3D environments. Originally developed as an innovative solution to a university assignment, it showcases the power of reinforcement learning in robotics applications.

For more details, please read the Project Report.

Key Highlights

  • Reinforcement Learning: Implements Q-Learning with customizable hyperparameters
  • Real-time 3D Visualization: Interactive simulation with matplotlib and tkinter
  • Dynamic Retraining: Adapt to new targets without restarting
  • Optimized Trajectories: Intelligent path smoothing for efficient navigation
  • Performance Monitoring: Track training progress and replay best episodes

Features

  • Room Simulation: Customizable 3D room environment for drone navigation
  • Target Detection: Intelligent algorithm to locate a target in the simulated room
  • Reinforcement Learning: Implements Q-Learning for trajectory optimization
  • Visualization: Real-time 3D trajectory plotting for training and performance monitoring
  • Dynamic Updates: Allows reconfiguration of the target's location with retraining capabilities
  • Replay Mechanism: Replays the best navigation trajectory using generated commands

Demo

Step-by-Step Simulation Process

Step 1: Initial Configuration



When you launch ChangingTarget.py, you'll be prompted to configure the following (sketched below):
• Room dimensions (depth, width, height)
• Target position (x, y, z coordinates)
• Drone starting position
• Number of training episodes
• Maximum steps per episode
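
A hypothetical sketch of this prompt flow (the actual prompts in ChangingTarget.py may differ in wording and order):

# Hypothetical configuration prompts, for illustration only
def ask_int(label):
    return int(input(f"{label}: "))

room = tuple(ask_int(f"Room {dim}") for dim in ("depth", "width", "height"))
target = tuple(ask_int(f"Target {axis}") for axis in ("x", "y", "z"))
drone_start = tuple(ask_int(f"Drone start {axis}") for axis in ("x", "y", "z"))
num_episodes = ask_int("Number of training episodes")
max_steps = ask_int("Maximum steps per episode")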

Step 2: Training Process



The Q-Learning algorithm trains the drone through multiple episodes, as sketched below:
• The drone explores the environment
• Learns from successful and unsuccessful attempts
• Updates its Q-table based on rewards
• Progress bar shows training advancement
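
A minimal, self-contained sketch of such a loop, for illustration only; the project's actual implementation lives in FunctionsLib.training_loop and adds viewer updates, state discretization, and a progress bar:

# Minimal Q-Learning training loop sketch (illustrative only)
import random
from collections import defaultdict

ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def train(start, target, room=(10, 10, 5), episodes=200, max_steps=100,
          alpha=0.05, gamma=0.995, epsilon=0.98, eps_decay=0.92, eps_min=0.01):
    q = defaultdict(lambda: [0.0] * len(ACTIONS))   # Q-table: state -> action values
    best_reward, best_path = float("-inf"), None
    for _ in range(episodes):
        state, path, total = start, [start], 0.0
        for _ in range(max_steps):
            # Epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            # Apply the move, clamped to the room boundaries
            nxt = tuple(min(max(s + d, 0), m - 1)
                        for s, d, m in zip(state, ACTIONS[action], room))
            dist = sum(abs(n - t) for n, t in zip(nxt, target))
            reward = 1000.0 if dist == 0 else -float(dist)   # bonus at target
            # Q-learning update
            q[state][action] += alpha * (reward + gamma * max(q[nxt]) - q[state][action])
            state, total = nxt, total + reward
            path.append(state)
            if dist == 0:
                break
        epsilon = max(epsilon * eps_decay, eps_min)          # decay exploration
        if total > best_reward:
            best_reward, best_path = total, path             # remember the best episode
    return best_path

print(train(start=(0, 0, 0), target=(7, 4, 3)))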

Step 3: Best Episode Visualization



After training, the simulation automatically displays:
• The most efficient path found
• Smoothed trajectory commands
• Target detection confirmation

Step 4: Dynamic Target Repositioning



Without restarting the program (sketched below):
• Close the simulation window
• Enter new target coordinates
• The drone starts from its last position
• Retraining adapts to the new target location
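
Continuing the sketch above, repositioning could amount to training again from the drone's final position; this is a hypothetical flow, not the project's exact logic in ChangingTarget.py:

# Hypothetical retraining flow built on the train() sketch above
path = train(start=(0, 0, 0), target=(7, 4, 3))
last_position = path[-1]                                  # drone's final position
new_path = train(start=last_position, target=(2, 8, 1))  # adapt to the new target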

Quick Start

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • Virtual environment (recommended)

Installation

  1. Clone the repository
     git clone https://github.com/Warukho/Reinforcement-Learning-Navigating-Drone.git
     cd Reinforcement-Learning-Navigating-Drone
  2. Create a virtual environment (recommended)
     python -m venv venv
     source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies
     pip install -r requirements.txt

Basic Usage

# Run the main application
python ChangingTarget.py

Follow the interactive prompts as shown in the demo section above.

Documentation

Project Structure

Reinforcement_Learning_Navigating_Drone/
│
├── dronecore/                # Core drone mechanics
├── images/                   # UI assets
├── README_Data/              # Documentation assets
│
├── ChangingTarget.py         # Main application entry point
├── FunctionsLib.py           # RL algorithms & utilities
├── dronecmds.py              # Drone command interface
├── best_episode_commands.py  # Replay functionality
│
├── viewermpl.py              # Matplotlib visualizer
├── viewertk.py               # Tkinter GUI interface
├── mplext.py                 # 3D plotting extensions
│
└── requirements.txt          # Project dependencies

Technical Architecture

Q-Learning Implementation

The drone learns optimal navigation strategies through:

  • State Space: Discretized 3D coordinates (x, y, z)
  • Action Space: 6 directions × variable distances
  • Reward System:
    • +1000 for reaching target
    • Proportional rewards for reducing distance
    • Penalties for inefficient movements
# Q-table update rule (Bellman update)
q_table[state][action] = old_value + alpha * (reward + gamma * max(q_table[next_state]) - old_value)
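
A sketch of a reward function matching this scheme; the threshold and scaling constants are illustrative, not the project's exact values:

# Illustrative reward shaping; exact constants in FunctionsLib.py may differ
import math

def compute_reward(position, previous_position, target, reach_radius=1.0):
    dist = math.dist(position, target)
    if dist <= reach_radius:
        return 1000.0                           # large bonus for reaching the target
    progress = math.dist(previous_position, target) - dist
    # Reward proportional to the distance reduced; penalize moving away
    return 10.0 * progress if progress > 0 else -5.0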

Key Components

Dynamic State Discretization

The state space automatically adapts to room dimensions:

import numpy as np

# room_width, room_depth, room_height come from the user's configuration
state_bins = [
    np.linspace(0, room_width, round(5 + (room_width ** 0.45))),
    np.linspace(0, room_depth, round(5 + (room_depth ** 0.45))),
    np.linspace(0, room_height, round(5 + (room_height ** 0.45)))
]
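
Given these bins, a continuous position can be mapped to a hashable Q-table key, for example with np.digitize (a sketch; the project's exact mapping may differ):

def discretize(position, state_bins):
    # Map a continuous (x, y, z) position to a tuple of bin indices
    return tuple(int(np.digitize(p, bins)) for p, bins in zip(position, state_bins))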

Trajectory Smoothing Algorithm

Trajectory comparison: without smoothing vs. with smoothing.
Optimizes command sequences (see the sketch below) by:
  • Aggregating movements by direction
  • Canceling opposing movements
  • Prioritizing larger movements
  • Chunking commands to respect maximum distance constraints
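
A self-contained sketch of such a smoothing pass; the command names and maximum distance are illustrative, and the project's actual implementation in FunctionsLib.py may differ:

# Illustrative smoothing of a command sequence
def smooth(commands, max_distance=500):
    # commands: list of (direction, distance) pairs, with directions
    # 'forward'/'backward', 'left'/'right', 'up'/'down'
    opposites = {"forward": "backward", "left": "right", "up": "down"}
    # 1. Aggregate net displacement per axis (cancels opposing movements)
    net = {}
    for direction, dist in commands:
        for pos, neg in opposites.items():
            if direction == pos:
                net[pos] = net.get(pos, 0) + dist
            elif direction == neg:
                net[pos] = net.get(pos, 0) - dist
    smoothed = []
    # 2. Prioritize larger movements first
    for axis, dist in sorted(net.items(), key=lambda kv: -abs(kv[1])):
        direction = axis if dist > 0 else opposites[axis]
        remaining = abs(dist)
        # 3. Chunk so no single command exceeds max_distance
        while remaining > 0:
            step = min(remaining, max_distance)
            smoothed.append((direction, step))
            remaining -= step
    return smoothed

print(smooth([("forward", 300), ("backward", 100), ("forward", 400), ("up", 50)]))
# -> [('forward', 500), ('forward', 100), ('up', 50)]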


Adaptive Exploration Strategy

Balances exploration vs exploitation:

# Epsilon-greedy approach with decay
epsilon = max(epsilon * epsilon_decay, epsilon_min)
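
The matching action selection might look like this (a sketch; variable names are illustrative):

import random

def choose_action(q_table, state, epsilon, num_actions=6):
    # Explore with probability epsilon, otherwise exploit the Q-table
    if random.random() < epsilon:
        return random.randrange(num_actions)                         # explore
    return max(range(num_actions), key=lambda a: q_table[state][a])  # exploit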

Hyperparameters

Parameter                 Default  Description
Learning Rate (α)         0.05     Controls how quickly the drone learns
Discount Factor (γ)       0.995    Importance of future rewards
Initial Exploration (ε)   0.98     Initial randomness in actions
Epsilon Decay             0.92     Rate of exploration reduction
Minimum Epsilon           0.01     Minimum exploration rate

Advanced Usage

Custom Training Configuration

Modify hyperparameters in FunctionsLib.py:

# Training parameters
alpha = 0.05        # Learning rate
gamma = 0.995       # Discount factor
epsilon = 0.98      # Initial exploration rate
epsilon_decay = 0.92
epsilon_min = 0.01

Programmatic Control

from FunctionsLib import initialize_settings, training_loop, writing_commands

# Initialize environment settings (interactive prompts)
settings = initialize_settings()

# Run training; env_with_viewer is the simulation environment with its
# attached viewer, created during setup
best_actions, best_trajectory = training_loop(
    env_with_viewer,
    num_episodes=100,
    max_steps=500
)

# Generate replay commands from the best episode
writing_commands(best_actions, settings["room_x"], settings["room_y"],
                 settings["room_height"], settings["drone_x"], settings["drone_y"],
                 settings["target_x"], settings["target_y"], settings["target_z"])

Replay Best Episode

To replay the optimal trajectory after training:

python best_episode_commands.py

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

Development Team

Axel Bouchaud--Roche

Pierre Chauvet

Léo Bugyan

  • Co-development
  • Report writing
  • GitHub: zenk02

Project Status: Completed

This project was developed as part of a first-year university assignment and successfully demonstrates advanced reinforcement learning concepts applied to drone navigation.

About

Drone project using Q-Learning: helping a drone find a target. The core framework is built for Tello drones, so this algorithm may be compatible with them.
