
Commit 2cfd014

Refactor text2sql based on ERAG (opea-project#1080)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
1 parent 90a8634 commit 2cfd014


20 files changed: +479 -421 lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 /comps/prompt_registry/ hoong.tee.yeoh@intel.com
 /comps/feedback_management/ hoong.tee.yeoh@intel.com
 /comps/chathistory/ yogesh.pandey@intel.com
-/comps/texttosql/ yogesh.pandey@intel.com
+/comps/text2sql/ yogesh.pandey@intel.com
 /comps/text2image/ xinyu.ye@intel.com
 /comps/reranks/ kaokao.lv@intel.com
 /comps/retrievers/ kaokao.lv@intel.com

.github/workflows/docker/compose/texttosql-compose.yaml renamed to .github/workflows/docker/compose/text2sql-compose.yaml

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 
 # this file should be run in the root of the repo
 services:
-  texttosql:
+  text2sql:
     build:
-      dockerfile: comps/texttosql/langchain/Dockerfile
-    image: ${REGISTRY:-opea}/texttosql:${TAG:-latest}
+      dockerfile: comps/text2sql/src/Dockerfile
+    image: ${REGISTRY:-opea}/text2sql:${TAG:-latest}

comps/cores/mega/constants.py

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ class ServiceType(Enum):
     TEXT2IMAGE = 16
     ANIMATION = 17
     IMAGE2IMAGE = 18
+    TEXT2SQL = 19
 
 
 class MegaServiceEndpoint(Enum):
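
The hunk above registers a new `ServiceType` member. As a minimal standalone sketch (it reproduces only the four members visible in this hunk and is not the full enum from `comps/cores/mega/constants.py`), the new member can be looked up by name or by value like any other Python `Enum` member:

```python
from enum import Enum


class ServiceType(Enum):
    """Partial copy for illustration; members outside this hunk are omitted."""

    TEXT2IMAGE = 16
    ANIMATION = 17
    IMAGE2IMAGE = 18
    TEXT2SQL = 19  # added by this commit


# Lookup by attribute, by name, and by value all yield the same member.
by_attr = ServiceType.TEXT2SQL
by_name = ServiceType["TEXT2SQL"]
by_value = ServiceType(19)
```

Because enum members are singletons, the three lookups return the identical object, so identity comparison (`is`) is safe when checking a service's type.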

comps/text2sql/deployment/docker_compose/README.md

Whitespace-only changes.

comps/texttosql/langchain/docker_compose_texttosql.yaml renamed to comps/text2sql/deployment/docker_compose/langchain.yaml

Lines changed: 3 additions & 3 deletions
@@ -32,9 +32,9 @@ services:
     volumes:
       - ./chinook.sql:/docker-entrypoint-initdb.d/chinook.sql
 
-  texttosql_service:
-    image: opea/texttosql:latest
-    container_name: texttosql_service
+  text2sql_service:
+    image: opea/text2sql:latest
+    container_name: text2sql_service
     ports:
       - "9090:8090"
     environment:

comps/text2sql/deployment/kubernetes/README.md

Whitespace-only changes.

comps/texttosql/langchain/Dockerfile renamed to comps/text2sql/src/Dockerfile

Lines changed: 4 additions & 4 deletions
@@ -21,13 +21,13 @@ COPY comps /home/user/comps
 
 RUN pip install --no-cache-dir --upgrade pip setuptools && \
     if [ ${ARCH} = "cpu" ]; then \
-      pip install --no-cache-dir --extra-index-url https://download.pytorch.org/whl/cpu -r /home/user/comps/texttosql/langchain/requirements.txt; \
+      pip install --no-cache-dir --extra-index-url https://download.pytorch.org/whl/cpu -r /home/user/comps/text2sql/src/requirements.txt; \
     else \
-      pip install --no-cache-dir -r /home/user/comps/texttosql/langchain/requirements.txt; \
+      pip install --no-cache-dir -r /home/user/comps/text2sql/src/requirements.txt; \
     fi
 
 ENV PYTHONPATH=$PYTHONPATH:/home/user
 
-WORKDIR /home/user/comps/texttosql/langchain/
+WORKDIR /home/user/comps/text2sql/src/
 
-ENTRYPOINT ["python", "main.py"]
+ENTRYPOINT ["python", "opea_text2sql_microservice.py"]

comps/text2sql/src/README.md

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
+# 🛢 Text-to-SQL Microservice
+
+In today's data-driven world, the ability to efficiently extract insights from databases is crucial. However, querying databases usually requires knowledge of SQL (Structured Query Language) and of the database schema, which can be a barrier for non-technical users. The Text-to-SQL microservice uses LLMs and agentic frameworks to bridge the gap between natural language and database queries. It is built on the LangChain/LangGraph frameworks.
+
+The microservice supports a range of use cases for businesses, researchers, and individuals. Users can generate queries from natural language questions to quickly retrieve relevant data from their databases. The service can also be integrated into chatbots, which can then answer questions based on the underlying data. It can further be used to build custom dashboards that let users visualize and analyze data through natural language.
+
+---
+
+## 🛠️ Features
+
+**Generate SQL queries from input text**: Transforms user-provided natural language into SQL queries, then executes them to retrieve data from SQL databases.
+
+---
+
+## ⚙️ Implementation
+
+The Text-to-SQL microservice can be implemented with various frameworks and supports various types of SQL databases.
+
+### 🔗 Using Text-to-SQL with the LangChain Framework
+
+The following guide provides setup instructions and details for the Text-to-SQL microservice using LangChain. In this configuration, we use PostgreSQL as the example database.
+
+---
+
#### 🚀 Start Microservice with Python(Option 1)
26+
27+
#### Install Requirements
28+
29+
```bash
30+
pip install -r requirements.txt
31+
```
32+
33+
#### Start PostgresDB Service
34+
35+
We will use [Chinook](https://github.com/lerocha/chinook-database) sample database as a default to test the Text-to-SQL microservice. Chinook database is a sample database ideal for demos and testing ORM tools targeting single and multiple database servers.
36+
37+
```bash
38+
export POSTGRES_USER=postgres
39+
export POSTGRES_PASSWORD=testpwd
40+
export POSTGRES_DB=chinook
41+
42+
cd comps/text2sql
43+
44+
docker run --name postgres-db --ipc=host -e POSTGRES_USER=${POSTGRES_USER} -e POSTGRES_HOST_AUTH_METHOD=trust -e POSTGRES_DB=${POSTGRES_DB} -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} -p 5442:5432 -d -v ./chinook.sql:/docker-entrypoint-initdb.d/chinook.sql postgres:latest
45+
```
46+
47+
#### Start TGI Service
48+
49+
```bash
50+
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
51+
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
52+
export TGI_PORT=8008
53+
54+
docker run -d --name="text2sql-tgi-endpoint" --ipc=host -p $TGI_PORT:80 -v ./data:/data --shm-size 1g -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${LLM_MODEL_ID} ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id $LLM_MODEL_ID
55+
```
56+
57+
#### Verify the TGI Service
58+
59+
```bash
60+
export your_ip=$(hostname -I | awk '{print $1}')
61+
curl http://${your_ip}:${TGI_PORT}/generate \
62+
-X POST \
63+
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
64+
-H 'Content-Type: application/json'
65+
```
66+
67+
#### Setup Environment Variables
68+
69+
```bash
70+
export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}"
71+
```
72+
73+
#### Start Text-to-SQL Microservice with Python Script
74+
75+
Start Text-to-SQL microservice with below command.
76+
77+
```bash
78+
python3 opea_text2sql_microservice.py
79+
```
80+
81+
---
82+
+### 🚀 Start Microservice with Docker (Option 2)
+
+#### Start PostgreSQL Database Service
+
+Please refer to the [Start PostgresDB Service](#start-postgresdb-service) section.
+
+#### Start TGI Service
+
+Please refer to the [Start TGI Service](#start-tgi-service) section.
+
+#### Set Up Environment Variables
+
+```bash
+export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}"
+```
+
+#### Build Docker Image
+
+```bash
+cd GenAIComps/
+docker build -t opea/text2sql:latest -f comps/text2sql/src/Dockerfile .
+```
+
+#### Run Docker with CLI (Option A)
+
+```bash
+export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}"
+
+docker run --runtime=runc --name="comps-langchain-text2sql" -p 9090:8080 --ipc=host -e llm_endpoint_url=${TGI_LLM_ENDPOINT} opea/text2sql:latest
+```
+
+#### Run via Docker Compose (Option B)
+
+- Set up the environment variables.
+
+```bash
+export TGI_LLM_ENDPOINT=http://${your_ip}:${TGI_PORT}
+export HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
+export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
+export POSTGRES_USER=postgres
+export POSTGRES_PASSWORD=testpwd
+export POSTGRES_DB=chinook
+```
+
+- Start the services.
+
+```bash
+docker compose -f comps/text2sql/deployment/docker_compose/langchain.yaml up
+```
+
+---
+
+### ✅ Invoke the Microservice
+
+The Text-to-SQL microservice exposes the following API endpoints:
+
+- Test Database Connection
+
+```bash
+curl --location http://${your_ip}:9090/v1/postgres/health \
+  --header 'Content-Type: application/json' \
+  --data '{"user": "'${POSTGRES_USER}'","password": "'${POSTGRES_PASSWORD}'","host": "'${your_ip}'", "port": "5442", "database": "'${POSTGRES_DB}'"}'
+```
+
+- Execute SQL Query from input text
+
+```bash
+curl http://${your_ip}:9090/v1/text2sql \
+  -X POST \
+  -d '{"input_text": "Find the total number of Albums.","conn_str": {"user": "'${POSTGRES_USER}'","password": "'${POSTGRES_PASSWORD}'","host": "'${your_ip}'", "port": "5442", "database": "'${POSTGRES_DB}'"}}' \
+  -H 'Content-Type: application/json'
+```
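
The two requests in the README's invoke section can also be assembled from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint paths (`/v1/postgres/health`, `/v1/text2sql`) and JSON field names shown in the curl examples; the helper names (`build_conn_str`, `build_health_request`, `build_text2sql_request`, `send`) are hypothetical and are not part of the commit.

```python
import json
import urllib.request


def build_conn_str(user, password, host, port="5442", database="chinook"):
    """Return the connection dictionary used by both curl examples."""
    return {
        "user": user,
        "password": password,
        "host": host,
        "port": port,  # kept as a string, as in the curl payloads
        "database": database,
    }


def _json_post(url, payload):
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_health_request(base_url, conn_str):
    """Request for the database connection test endpoint."""
    return _json_post(f"{base_url}/v1/postgres/health", conn_str)


def build_text2sql_request(base_url, input_text, conn_str):
    """Request for the text-to-SQL endpoint."""
    payload = {"input_text": input_text, "conn_str": conn_str}
    return _json_post(f"{base_url}/v1/text2sql", payload)


def send(request, timeout=60):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example values mirroring the README defaults; the host is a placeholder.
conn = build_conn_str("postgres", "testpwd", "localhost")
health_req = build_health_request("http://localhost:9090", conn)
query_req = build_text2sql_request(
    "http://localhost:9090", "Find the total number of Albums.", conn
)
```

Calling `send(query_req)` performs the actual HTTP call and requires the PostgreSQL, TGI, and Text-to-SQL services to be running. Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.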

comps/text2sql/src/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
