"For LLMs, reasoning is always better than no reasoning... Aggregating multiple answers is better than one answer... Retrieval plus reasoning is better than reasoning only." - Denny Zhou, Stanford CS25
This project is a practical implementation of the key techniques for enhancing Large Language Model (LLM) reasoning, as presented in Denny Zhou's talk at Stanford's CS25. It translates theoretical concepts into a working Python application, demonstrating a structured approach to building more reliable and verifiable AI systems.
This repository explores and implements a pipeline of advanced reasoning techniques:
- Chain-of-Thought (CoT) Prompting: Moving beyond simple queries to encourage models to "think step by step," improving their accuracy on multi-step problems (see the prompt sketch after this list).
- Robust Answer Parsing: A test-driven parser to reliably extract final answers from an LLM's verbose, natural-language output (see the parser sketch below).
- Self-Consistency: Improving accuracy by generating multiple diverse reasoning paths and selecting the most frequent answer, i.e. a majority vote (see the majority-vote sketch below).
- Self-Improvement Data Generation: A pipeline that simulates Reinforcement Learning from AI Feedback (RLAIF) by using a verifier to keep only correct, high-quality reasoning paths, which can then be used for fine-tuning (a minimal sketch appears with the demo walkthrough below).
- Dockerization: The entire application is containerized with Docker, ensuring a portable, reproducible, and easy-to-run environment.
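To make the chain-of-thought idea concrete, here is a minimal sketch of a direct query versus a CoT query using the OpenAI Python SDK. The prompt wording and model name are illustrative, not the exact strings used by this project's prompt module.

```python
# Minimal sketch: direct prompting vs. chain-of-thought prompting.
# Assumes OPENAI_API_KEY is set in the environment; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

direct_prompt = f"{question}\nAnswer with a number only."
cot_prompt = f"{question}\nLet's think step by step, then end with 'Answer: <number>'."

for name, prompt in [("direct", direct_prompt), ("chain-of-thought", cot_prompt)]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {name} ---")
    print(response.choices[0].message.content)
```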
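The real parser lives in `cognition_synthesis/parsing/`; the snippet below is only an illustrative regex extractor for the 'Answer: <number>' convention used in the prompt above, not the project's exact implementation.

```python
import re
from typing import Optional


def parse_final_answer(text: str) -> Optional[str]:
    """Extract the last 'Answer: <number>' occurrence from verbose model output."""
    matches = re.findall(r"Answer:\s*\$?(-?\d+(?:\.\d+)?)", text, flags=re.IGNORECASE)
    return matches[-1] if matches else None


assert parse_final_answer("The ball costs 5 cents. Answer: 0.05") == "0.05"
assert parse_final_answer("no final answer given") is None
```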
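Self-consistency then amounts to sampling several reasoning paths at a non-zero temperature, parsing each final answer, and returning the majority vote. A minimal sketch, reusing the illustrative `client` and `parse_final_answer` from the snippets above:

```python
from collections import Counter


def self_consistent_answer(question: str, n_paths: int = 5) -> str:
    """Sample multiple reasoning paths and return the most frequent parsed answer."""
    answers = []
    for _ in range(n_paths):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[{
                "role": "user",
                "content": f"{question}\nLet's think step by step, then end with 'Answer: <number>'.",
            }],
            temperature=0.8,  # non-zero temperature so the reasoning paths differ
        )
        answer = parse_final_answer(response.choices[0].message.content)
        if answer is not None:
            answers.append(answer)
    # Majority vote; assumes at least one path produced a parseable answer.
    return Counter(answers).most_common(1)[0][0]
```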
For a deeper dive into the architecture, including Class and Sequence diagrams, please see the Architectural Design Document.
You can run this project either locally with a Python virtual environment or using Docker (recommended). You will need:
- Python 3.12+
- Docker Desktop (for the containerized approach)
- An OpenAI API Key
Clone the repository:

```bash
git clone https://github.com/zhu-weijie/cognition-synthesis.git
cd cognition-synthesis
```
Create a `.env` file in the project root by copying the example:

```bash
cp .env.example .env
```
Now, edit the `.env` file and add your OpenAI API key:

```
OPENAI_API_KEY="your-api-key-goes-here"
```
Running with Docker is the simplest and most reliable way to run the project.
1. Build the Docker image:

   ```bash
   docker build -t cognition-synthesis .
   ```
2. Run the application:

   The command below runs the full demonstration and uses a volume (`-v`) to save the generated `training_data.jsonl` file to your local directory.

   ```bash
   docker run --rm --env-file .env -v "$(pwd):/app" cognition-synthesis
   ```
To run the project locally instead:

1. Create and activate a virtual environment:

   ```bash
   # Create the environment
   python3 -m venv venv

   # Activate it (on macOS/Linux)
   source venv/bin/activate
   ```
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
3. Run the application:

   ```bash
   python main.py
   ```
Executing `main.py` (either locally or via Docker) will run a full demonstration of all the implemented techniques in sequence:
- Basic & Chain-of-Thought Tasks: Demonstrates the difference between direct queries and CoT prompting.
- Self-Consistency Task: Shows how majority voting over multiple reasoning paths can correct errors and improve reliability.
- Data Generation Pipeline: Simulates a self-improvement loop by:
  - Taking problems with known answers from a `ProblemBank`.
  - Generating 8 diverse reasoning paths for each problem.
  - Using a `Verifier` to check which paths lead to the correct answer.
  - Saving the correct `(problem, reasoning_path)` pairs to `training_data.jsonl`.
The final output is a high-quality, AI-generated dataset ready for fine-tuning.
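A rough sketch of that loop, reusing the illustrative `client` and `parse_final_answer` helpers from the snippets earlier in this README (the real `ProblemBank`, `Verifier`, and pipeline orchestration live in `cognition_synthesis/`):

```python
import json

# Stand-in for the ProblemBank: problems paired with known ground-truth answers.
problems = [{"question": "What is 17 * 24?", "answer": "408"}]

with open("training_data.jsonl", "a", encoding="utf-8") as f:
    for problem in problems:
        for _ in range(8):  # 8 diverse reasoning paths per problem
            response = client.chat.completions.create(
                model="gpt-4o-mini",  # illustrative model name
                messages=[{
                    "role": "user",
                    "content": f"{problem['question']}\nLet's think step by step, "
                               "then end with 'Answer: <number>'.",
                }],
                temperature=0.8,
            )
            path = response.choices[0].message.content
            # Verifier step: keep only paths whose parsed answer matches the known answer.
            if parse_final_answer(path) == problem["answer"]:
                f.write(json.dumps({"problem": problem["question"], "reasoning_path": path}) + "\n")
```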
The project is organized as follows:

```text
cognition-synthesis/
├── .dockerignore          # Excludes files from the Docker image
├── .env                   # Stores your API key (gitignored)
├── .env.example           # An example environment file
├── Dockerfile             # Blueprint for the Docker container
├── main.py                # Main entry point for the application
├── requirements.txt       # Project dependencies
├── cognition_synthesis/   # Main application source code
│   ├── llm/               # LLM client wrapper
│   ├── parsing/           # Answer parsing logic
│   ├── pipelines/         # Data generation pipeline orchestrator
│   ├── prompts/           # Prompt management and formatting
│   ├── reasoning/         # Core reasoning techniques (e.g., SelfConsistency)
│   └── verification/      # Verifier and ProblemBank
├── docs/
│   └── design.md          # Detailed architectural diagrams
└── tests/                 # Unit tests for the project
```