
Commit 0b4b919

feat: add reranking (#20)
1 parent 81de1ff commit 0b4b919

File tree

17 files changed (+471 -148 lines)


.cruft.json

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
{
  "template": "https://github.com/superlinear-ai/poetry-cookiecutter",
-  "commit": "a969f1d182ec39d7d27ccb1116cf60ba736adcfa",
+  "commit": "b7f2fb0f123aae0a01d2ab015db31f52d2d8cc21",
  "checkout": null,
  "context": {
    "cookiecutter": {
@@ -26,4 +26,4 @@
    }
  },
  "directory": null
-}
+}

.devcontainer/devcontainer.json

Lines changed: 4 additions & 2 deletions
@@ -38,7 +38,9 @@
        100
      ],
      "files.autoSave": "onFocusChange",
-      "jupyter.kernels.excludePythonEnvironments": ["/usr/local/bin/python"],
+      "jupyter.kernels.excludePythonEnvironments": [
+        "/usr/local/bin/python"
+      ],
      "mypy-type-checker.importStrategy": "fromEnvironment",
      "mypy-type-checker.preferDaemon": true,
      "notebook.codeActionsOnSave": {
@@ -50,7 +52,7 @@
      "python.terminal.activateEnvironment": false,
      "python.testing.pytestEnabled": true,
      "ruff.importStrategy": "fromEnvironment",
-      "ruff.logLevel": "warn",
+      "ruff.logLevel": "warning",
      "terminal.integrated.defaultProfile.linux": "zsh",
      "terminal.integrated.profiles.linux": {
        "zsh": {

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ data/
# dotenv
.env

+# Rerankers
+.*_cache/
+
# Hypothesis
.hypothesis/


Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ RUN --mount=type=cache,target=/var/cache/apt/ \
    sh -c "$(curl -fsSL https://starship.rs/install.sh)" -- "--yes" && \
    usermod --shell /usr/bin/zsh user && \
    echo 'user ALL=(root) NOPASSWD:ALL' > /etc/sudoers.d/user && chmod 0440 /etc/sudoers.d/user
+RUN git config --system --add safe.directory '*'
USER user

# Install the development Python dependencies in the virtual environment.

README.md

Lines changed: 69 additions & 20 deletions
@@ -1,22 +1,36 @@
-[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/superlinear-ai/raglite) [![Open in GitHub Codespaces](https://img.shields.io/static/v1?label=GitHub%20Codespaces&message=Open&color=blue&logo=github)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=812973394&skip_quickstart=true)
+[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/superlinear-ai/raglite) [![Open in GitHub Codespaces](https://img.shields.io/static/v1?label=GitHub%20Codespaces&message=Open&color=blue&logo=github)](https://github.com/codespaces/new/superlinear-ai/raglite)

# 🥤 RAGLite

RAGLite is a Python package for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite.

## Features

-1. ❤️ Only lightweight and permissive open source dependencies (e.g., no [PyTorch](https://github.com/pytorch/pytorch), [LangChain](https://github.com/langchain-ai/langchain), or [PyMuPDF](https://github.com/pymupdf/PyMuPDF))
-2. 🧠 Choose any LLM provider with [LiteLLM](https://github.com/BerriAI/litellm), including local [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) models
-3. 💾 Either [PostgreSQL](https://github.com/postgres/postgres) or [SQLite](https://github.com/sqlite/sqlite) as a keyword & vector search database
-4. 🚀 Acceleration with Metal on macOS, and CUDA on Linux and Windows
-5. 📖 PDF to Markdown conversion on top of [pdftext](https://github.com/VikParuchuri/pdftext) and [pypdfium2](https://github.com/pypdfium2-team/pypdfium2)
-6. 🧬 Multi-vector chunk embedding with [late chunking](https://weaviate.io/blog/late-chunking) and [contextual chunk headings](https://d-star.ai/solving-the-out-of-context-chunk-problem-for-rag)
-7. ✂️ Optimal [level 4 semantic chunking](https://medium.com/@anuragmishra_27746/five-levels-of-chunking-strategies-in-rag-notes-from-gregs-video-7b735895694d) by solving a [binary integer programming problem](https://en.wikipedia.org/wiki/Integer_programming)
-8. 🌀 Optimal [closed-form linear query adapter](src/raglite/_query_adapter.py) by solving an [orthogonal Procrustes problem](https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem)
-9. 🔍 [Hybrid search](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) that combines the database's built-in keyword search ([tsvector](https://www.postgresql.org/docs/current/datatype-textsearch.html) in PostgreSQL, [FTS5](https://www.sqlite.org/fts5.html) in SQLite) with their native vector search extensions ([pgvector](https://github.com/pgvector/pgvector) in PostgreSQL, [sqlite-vec](https://github.com/asg017/sqlite-vec) in SQLite)
-10. ✍️ Optional: conversion of any input document to Markdown with [Pandoc](https://github.com/jgm/pandoc)
-11. ✅ Optional: evaluation of retrieval and generation performance with [Ragas](https://github.com/explodinggradients/ragas)
+##### Configurable
+
+- 🧠 Choose any LLM provider with [LiteLLM](https://github.com/BerriAI/litellm), including local [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) models
+- 💾 Choose either [PostgreSQL](https://github.com/postgres/postgres) or [SQLite](https://github.com/sqlite/sqlite) as a keyword & vector search database
+- 🥇 Choose any reranker with [rerankers](https://github.com/AnswerDotAI/rerankers), including multi-lingual [FlashRank](https://github.com/PrithivirajDamodaran/FlashRank) as the default
+
+##### Fast and permissive
+
+- ❤️ Only lightweight and permissive open source dependencies (e.g., no [PyTorch](https://github.com/pytorch/pytorch) or [LangChain](https://github.com/langchain-ai/langchain))
+- 🚀 Acceleration with Metal on macOS, and CUDA on Linux and Windows
+
+##### Unhobbled
+
+- 📖 PDF to Markdown conversion on top of [pdftext](https://github.com/VikParuchuri/pdftext) and [pypdfium2](https://github.com/pypdfium2-team/pypdfium2)
+- 🧬 Multi-vector chunk embedding with [late chunking](https://weaviate.io/blog/late-chunking) and [contextual chunk headings](https://d-star.ai/solving-the-out-of-context-chunk-problem-for-rag)
+- ✂️ Optimal [level 4 semantic chunking](https://medium.com/@anuragmishra_27746/five-levels-of-chunking-strategies-in-rag-notes-from-gregs-video-7b735895694d) by solving a [binary integer programming problem](https://en.wikipedia.org/wiki/Integer_programming)
+- 🔍 [Hybrid search](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) with the database's native keyword & vector search ([tsvector](https://www.postgresql.org/docs/current/datatype-textsearch.html)+[pgvector](https://github.com/pgvector/pgvector), [FTS5](https://www.sqlite.org/fts5.html)+[sqlite-vec](https://github.com/asg017/sqlite-vec)[^1])
+- 🌀 Optimal [closed-form linear query adapter](src/raglite/_query_adapter.py) by solving an [orthogonal Procrustes problem](https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem)
+
+##### Extensible
+
+- ✍️ Optional conversion of any input document to Markdown with [Pandoc](https://github.com/jgm/pandoc)
+- ✅ Optional evaluation of retrieval and generation performance with [Ragas](https://github.com/explodinggradients/ragas)
+
+[^1]: We use [PyNNDescent](https://github.com/lmcinnes/pynndescent) until [sqlite-vec](https://github.com/asg017/sqlite-vec) is more mature.

## Installing

@@ -57,10 +71,10 @@ pip install raglite[ragas]
### 1. Configuring RAGLite

> [!TIP]
-> 🧠 RAGLite extends [LiteLLM](https://github.com/BerriAI/litellm) with support for [llama.cpp](https://github.com/ggerganov/llama.cpp) models using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). To select a llama.cpp model (e.g., from [bartowski's collection](https://huggingface.co/collections/bartowski/recent-highlights-65cf8e08f8ab7fc669d7b5bd)), use a model identifier of the form `"llama-cpp-python/<hugging_face_repo_id>/<filename>@<n_ctx>"`, where `n_ctx` is an optional parameter that specifies the context size of the model.
+> 🧠 RAGLite extends [LiteLLM](https://github.com/BerriAI/litellm) with support for [llama.cpp](https://github.com/ggerganov/llama.cpp) models using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). To select a llama.cpp model (e.g., from [bartowski's collection](https://huggingface.co/bartowski)), use a model identifier of the form `"llama-cpp-python/<hugging_face_repo_id>/<filename>@<n_ctx>"`, where `n_ctx` is an optional parameter that specifies the context size of the model.

> [!TIP]
-> 💾 You can create a PostgreSQL database for free in a few clicks at [neon.tech](https://neon.tech) (not sponsored).
+> 💾 You can create a PostgreSQL database in a few clicks at [neon.tech](https://neon.tech).

First, configure RAGLite with your preferred PostgreSQL or SQLite database and [any LLM supported by LiteLLM](https://docs.litellm.ai/docs/providers/openai):

@@ -82,6 +96,27 @@ my_config = RAGLiteConfig(
)
```

+You can also configure [any reranker supported by rerankers](https://github.com/AnswerDotAI/rerankers):
+
+```python
+from rerankers import Reranker
+
+# Example remote API-based reranker:
+my_config = RAGLiteConfig(
+    db_url="postgresql://my_username:my_password@my_host:5432/my_database",
+    reranker=Reranker("cohere", lang="en", api_key=COHERE_API_KEY),
+)
+
+# Example local cross-encoder reranker per language (this is the default):
+my_config = RAGLiteConfig(
+    db_url="sqlite:///raglite.sqlite",
+    reranker=(
+        ("en", Reranker("ms-marco-MiniLM-L-12-v2", model_type="flashrank")),  # English
+        ("other", Reranker("ms-marco-MultiBERT-L-12", model_type="flashrank")),  # Other languages
+    )
+)
+```
+
### 2. Inserting documents

> [!TIP]
@@ -100,24 +135,38 @@ insert_document(Path("Special Relativity.pdf"), config=my_config)

### 3. Searching and Retrieval-Augmented Generation (RAG)

-Now, you can search for chunks with keyword search, vector search, or a hybrid of the two. You can also answer questions with RAG and the search method of your choice (`hybrid` is the default):
+Now, you can search for chunks with vector search, keyword search, or a hybrid of the two. You can also rerank the search results with the configured reranker. And you can use any search method of your choice (`hybrid_search` is the default) together with reranking to answer questions with RAG:

```python
# Search for chunks:
from raglite import hybrid_search, keyword_search, vector_search

prompt = "How is intelligence measured?"
-results_vector = vector_search(prompt, num_results=5, config=my_config)
-results_keyword = keyword_search(prompt, num_results=5, config=my_config)
-results_hybrid = hybrid_search(prompt, num_results=5, config=my_config)
+chunk_ids_vector, _ = vector_search(prompt, num_results=20, config=my_config)
+chunk_ids_keyword, _ = keyword_search(prompt, num_results=20, config=my_config)
+chunk_ids_hybrid, _ = hybrid_search(prompt, num_results=20, config=my_config)
+
+# Retrieve chunks:
+from raglite import retrieve_chunks
+
+chunks_hybrid = retrieve_chunks(chunk_ids_hybrid, config=my_config)
+
+# Rerank chunks:
+from raglite import rerank
+
+chunks_reranked = rerank(prompt, chunks_hybrid, config=my_config)

# Answer questions with RAG:
from raglite import rag

prompt = "What does it mean for two events to be simultaneous?"
-stream = rag(prompt, search=hybrid_search, config=my_config)
+stream = rag(prompt, config=my_config)
for update in stream:
    print(update, end="")
+
+# You can also pass a search method or search results directly:
+stream = rag(prompt, search=hybrid_search, config=my_config)
+stream = rag(prompt, search=chunks_reranked, config=my_config)
```

### 4. Computing and using an optimal query adapter
@@ -129,7 +178,7 @@ RAGLite can compute and apply an [optimal closed-form query adapter](src/raglite
from raglite import insert_evals, update_query_adapter

insert_evals(num_evals=100, config=my_config)
-update_query_adapter(config=my_config)
+update_query_adapter(config=my_config)  # From here, simply call vector_search to use the query adapter.
```

### 5. Evaluation of retrieval and generation

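As background for the hybrid search feature referenced in the README diff above, which cites reciprocal rank fusion (RRF), here is a minimal generic sketch of RRF over ranked lists of chunk ids. It is illustrative only and not RAGLite's implementation; the chunk ids and the customary constant `k = 60` are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists with RRF: score(id) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical keyword and vector search rankings fused into a single ranking:
fused = reciprocal_rank_fusion([["c1", "c2", "c3"], ["c2", "c4", "c1"]])
```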
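The configuration tip in the README diff above documents llama.cpp model identifiers of the form `"llama-cpp-python/<hugging_face_repo_id>/<filename>@<n_ctx>"`. A hedged sketch of what such a configuration might look like, assuming `RAGLiteConfig` accepts an `llm` parameter as in the full README; the repo id, filename pattern, and context size are illustrative, not recommendations:

```python
from raglite import RAGLiteConfig

# Hypothetical local llama.cpp model identifier following the documented form
# "llama-cpp-python/<hugging_face_repo_id>/<filename>@<n_ctx>". The repo id,
# filename pattern, and context size are illustrative assumptions.
my_config = RAGLiteConfig(
    db_url="sqlite:///raglite.sqlite",
    llm="llama-cpp-python/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/*Q4_K_M.gguf@4096",
)
```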
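The reranker configuration added in the README diff above builds on the rerankers library. A small sketch of that library used on its own, assuming its `Reranker(...).rank(query=..., docs=...)` API; the query and passages are made up:

```python
from rerankers import Reranker

# A FlashRank cross-encoder, as in the default English reranker configured above.
ranker = Reranker("ms-marco-MiniLM-L-12-v2", model_type="flashrank")

# Score a few candidate passages against a query (texts are made up).
results = ranker.rank(
    query="How is intelligence measured?",
    docs=[
        "The intelligence quotient (IQ) is a score derived from standardized tests.",
        "Special relativity relates measurements made in different inertial frames.",
    ],
)
print(results)  # Ranked results, most relevant passage first.
```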
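Finally, the query adapter step in the README diff above refers to a closed-form solution of an orthogonal Procrustes problem: the orthogonal matrix `A` minimizing `||A @ Q - T||_F` is `U @ Vt`, where `U, S, Vt` is the SVD of `T @ Q.T`. A generic NumPy sketch of that closed form on toy data (not RAGLite's `update_query_adapter` implementation):

```python
import numpy as np

# Toy embeddings: rows are embedding dimensions, columns are evaluation examples.
rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 100))  # Query embeddings.
T = rng.standard_normal((8, 100))  # Embeddings of the chunks relevant to each query.

# Closed-form orthogonal Procrustes solution.
U, _, Vt = np.linalg.svd(T @ Q.T)
A = U @ Vt

# The adapter A can then be applied to a new query embedding before vector search.
adapted_query = A @ rng.standard_normal(8)
```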