Skip to content

Commit 7d2c46c

Browse files
authored
Merge branch 'main' into main
2 parents 3f3289c + 3a6c7b7 commit 7d2c46c

21 files changed

+461
-7406
lines changed

.env.sample

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,3 +118,19 @@ PROMPTFLOW_RESPONSE_TIMEOUT=120
118118
PROMPTFLOW_REQUEST_FIELD_NAME=query
119119
PROMPTFLOW_RESPONSE_FIELD_NAME=reply
120120
PROMPTFLOW_CITATIONS_FIELD_NAME=documents
121+
# Chat with data: MongoDB database
122+
MONGODB_ENDPOINT=
123+
MONGODB_USERNAME=
124+
MONGODB_PASSWORD=
125+
MONGODB_DATABASE_NAME=
126+
MONGODB_COLLECTION_NAME=
127+
MONGODB_APP_NAME=
128+
MONGODB_INDEX_NAME=
129+
MONGODB_TOP_K=
130+
MONGODB_STRICTNESS=
131+
MONGODB_ENABLE_IN_DOMAIN=
132+
MONGODB_CONTENT_COLUMNS=
133+
MONGODB_FILENAME_COLUMN=
134+
MONGODB_TITLE_COLUMN=
135+
MONGODB_URL_COLUMN=
136+
MONGODB_VECTOR_COLUMNS=

README.md

Lines changed: 32 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ This repo contains sample code for a simple chat webapp that integrates with Azu
1010
- Elasticsearch index (preview)
1111
- Pinecone index (private preview)
1212
- Azure SQL Server (private preview)
13+
- Mongo DB (preview)
1314

1415
## Configure the app
1516

@@ -59,9 +60,6 @@ Please see the [section below](#add-an-identity-provider) for important informat
5960

6061
3. You can see the local running app at http://127.0.0.1:50505.
6162

62-
NOTE: You may find you need to set: MacOS: `export NODE_OPTIONS="--max-old-space-size=8192"` or Windows: `set NODE_OPTIONS=--max-old-space-size=8192` to avoid running out of memory when building the frontend.
63-
64-
6563
### Deploy with the Azure CLI
6664

6765
#### Create the Azure App Service
@@ -283,6 +281,34 @@ Note: RBAC assignments can take a few minutes before becoming effective.
283281
- `AZURE_OPENAI_EMBEDDING_NAME`: the name of your Ada (text-embedding-ada-002) model deployment on your Azure OpenAI resource.
284282
- `PINECONE_VECTOR_COLUMNS`: the vector columns in your index to use when searching. Join them with `|` like `contentVector|titleVector`.
285283

284+
#### Chat with your data using Mongo DB (Private Preview)
285+
286+
1. Update the `AZURE_OPENAI_*` environment variables as described in the [basic chat experience](#basic-chat-experience) above.
287+
288+
2. To connect to your data, you need to specify an Mongo DB database configuration. Learn more about [MongoDB](https://www.mongodb.com/).
289+
290+
3. Configure data source settings as described in the table below.
291+
292+
| App Setting | Required? | Default Value | Note |
293+
| --- | --- | --- | ------------- |
294+
|DATASOURCE_TYPE|Yes||Must be set to `MongoDB`|
295+
|MONGODB_CONNECTION_STRING|Yes||The connection string used to connect to your Mongo DB instance|
296+
|MONGODB_VECTOR_INDEX|Yes||The name of your Mongo DB vector index|
297+
|MONGODB_DATABASE_NAME|Yes||The name of your Mongo DB database|
298+
|MONGODB_CONTAINER_NAME|Yes||The name of your Mongo DB container|
299+
|MONGODB_TOP_K|No|5|The number of documents to retrieve when querying your search index.|
300+
|MONGODB_ENABLE_IN_DOMAIN|No|True|Limits responses to only queries relating to your data.|
301+
|MONGODB_STRICTNESS|No|3|Integer from 1 to 5 specifying the strictness for the model limiting responses to your data.|
302+
|MONGODB_CONTENT_COLUMNS|No||List of fields in your search index that contains the text content of your documents to use when formulating a bot response. Represent these as a string joined with "|", e.g. `"product_description|product_manual"`|
303+
|MONGODB_FILENAME_COLUMN|No|| Field from your search index that gives a unique identifier of the source of your data to display in the UI.|
304+
|MONGODB_TITLE_COLUMN|No||Field from your search index that gives a relevant title or header for your data content to display in the UI.|
305+
|MONGODB_URL_COLUMN|No||Field from your search index that contains a URL for the document, e.g. an Azure Blob Storage URI. This value is not currently used.|
306+
|MONGODB_VECTOR_COLUMNS|No||List of fields in your search index that contain vector embeddings of your documents to use when formulating a bot response. Represent these as a string joined with "|", e.g. `"product_description|product_manual"`|
307+
308+
MongoDB uses vector search by default, so ensure these settings are configured on your app:
309+
- `AZURE_OPENAI_EMBEDDING_NAME`: the name of your Ada (text-embedding-ada-002) model deployment on your Azure OpenAI resource.
310+
- `MONGODB_VECTOR_COLUMNS`: the vector columns in your index to use when searching. Join them with `|` like `contentVector|titleVector`.
311+
286312
#### Chat with your data using Azure SQL Server (Private Preview)
287313

288314
1. Update the `AZURE_OPENAI_*` environment variables as described in the [basic chat experience](#basic-chat-experience) above.
@@ -296,6 +322,9 @@ Note: RBAC assignments can take a few minutes before becoming effective.
296322
|DATASOURCE_TYPE|Yes||Must be set to `AzureSqlServer`|
297323
|AZURE_SQL_SERVER_CONNECTION_STRING|Yes||The connection string to use to connect to your Azure SQL Server instance|
298324
|AZURE_SQL_SERVER_TABLE_SCHEMA|Yes||The table schema for your Azure SQL Server table. Must be surrounded by double quotes (`"`).|
325+
|AZURE_SQL_SERVER_PORT||Not publicly available at this time.|The port to use to connect to your Azure SQL Server instance.|
326+
|AZURE_SQL_SERVER_DATABASE_NAME||Not publicly available at this time.|
327+
|AZURE_SQL_SERVER_DATABASE_SERVER||Not publicly available at this time.|
299328

300329
#### Chat with your data using Promptflow
301330

@@ -391,7 +420,6 @@ We recommend keeping these best practices in mind:
391420

392421
**A note on Azure OpenAI API versions**: The application code in this repo will implement the request and response contracts for the most recent preview API version supported for Azure OpenAI. To keep your application up-to-date as the Azure OpenAI API evolves with time, be sure to merge the latest API version update into your own application code and redeploy using the methods described in this document.
393422

394-
395423
## Contributing
396424

397425
This project welcomes contributions and suggestions. Most contributions require you to agree to a

app.py

Lines changed: 16 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -218,12 +218,22 @@ def prepare_model_args(request_body, request_headers):
218218

219219
for message in request_messages:
220220
if message:
221-
messages.append(
222-
{
223-
"role": message["role"],
224-
"content": message["content"]
225-
}
226-
)
221+
if message["role"] == "assistant" and "context" in message:
222+
context_obj = json.loads(message["context"])
223+
messages.append(
224+
{
225+
"role": message["role"],
226+
"content": message["content"],
227+
"context": context_obj
228+
}
229+
)
230+
else:
231+
messages.append(
232+
{
233+
"role": message["role"],
234+
"content": message["content"]
235+
}
236+
)
227237

228238
user_json = None
229239
if (MS_DEFENDER_ENABLED):

backend/settings.py

Lines changed: 100 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -182,7 +182,7 @@ def extract_embedding_dependency(self) -> Optional[dict]:
182182
"endpoint": self.embedding_endpoint,
183183
"authentication": {
184184
"type": "api_key",
185-
"api_key": self.embedding_key
185+
"key": self.embedding_key
186186
}
187187
}
188188
else:
@@ -625,24 +625,33 @@ class _AzureSqlServerSettings(BaseSettings, DatasourcePayloadConstructor):
625625
model_config = SettingsConfigDict(
626626
env_prefix="AZURE_SQL_SERVER_",
627627
env_file=DOTENV_PATH,
628-
extra="ignore"
628+
extra="ignore",
629+
env_ignore_empty=True
629630
)
630631
_type: Literal["azure_sql_server"] = PrivateAttr(default="azure_sql_server")
631632

632-
connection_string: str = Field(exclude=True)
633-
table_schema: str
633+
connection_string: Optional[str] = Field(default=None, exclude=True)
634+
table_schema: Optional[str] = None
634635
schema_max_row: Optional[int] = None
635636
top_n_results: Optional[int] = None
637+
database_server: Optional[str] = None
638+
database_name: Optional[str] = None
639+
port: Optional[int] = None
636640

637641
# Constructed fields
638642
authentication: Optional[dict] = None
639643

640644
@model_validator(mode="after")
641645
def construct_authentication(self) -> Self:
642-
self.authentication = {
643-
"type": "connection_string",
644-
"connection_string": self.connection_string
645-
}
646+
if self.connection_string:
647+
self.authentication = {
648+
"type": "connection_string",
649+
"connection_string": self.connection_string
650+
}
651+
elif self.database_server and self.database_name and self.port:
652+
self.authentication = {
653+
"type": "system_assigned_managed_identity"
654+
}
646655
return self
647656

648657
def construct_payload_configuration(
@@ -658,7 +667,84 @@ def construct_payload_configuration(
658667
"parameters": parameters
659668
}
660669

670+
671+
class _MongoDbSettings(BaseSettings, DatasourcePayloadConstructor):
672+
model_config = SettingsConfigDict(
673+
env_prefix="MONGODB_",
674+
env_file=DOTENV_PATH,
675+
extra="ignore",
676+
env_ignore_empty=True
677+
)
678+
_type: Literal["mongo_db"] = PrivateAttr(default="mongo_db")
661679

680+
endpoint: str
681+
username: str = Field(exclude=True)
682+
password: str = Field(exclude=True)
683+
database_name: str
684+
collection_name: str
685+
app_name: str
686+
index_name: str
687+
query_type: Literal["vector"] = "vector"
688+
top_k: int = Field(default=5, serialization_alias="top_n_documents")
689+
strictness: int = 3
690+
enable_in_domain: bool = Field(default=True, serialization_alias="in_scope")
691+
content_columns: Optional[List[str]] = Field(default=None, exclude=True)
692+
vector_columns: Optional[List[str]] = Field(default=None, exclude=True)
693+
title_column: Optional[str] = Field(default=None, exclude=True)
694+
url_column: Optional[str] = Field(default=None, exclude=True)
695+
filename_column: Optional[str] = Field(default=None, exclude=True)
696+
697+
698+
# Constructed fields
699+
authentication: Optional[dict] = None
700+
embedding_dependency: Optional[dict] = None
701+
fields_mapping: Optional[dict] = None
702+
703+
@field_validator('content_columns', 'vector_columns', mode="before")
704+
@classmethod
705+
def split_columns(cls, comma_separated_string: str) -> List[str]:
706+
if isinstance(comma_separated_string, str) and len(comma_separated_string) > 0:
707+
return parse_multi_columns(comma_separated_string)
708+
709+
return None
710+
711+
@model_validator(mode="after")
712+
def set_fields_mapping(self) -> Self:
713+
self.fields_mapping = {
714+
"content_fields": self.content_columns,
715+
"title_field": self.title_column,
716+
"url_field": self.url_column,
717+
"filepath_field": self.filename_column,
718+
"vector_fields": self.vector_columns
719+
}
720+
return self
721+
722+
@model_validator(mode="after")
723+
def construct_authentication(self) -> Self:
724+
self.authentication = {
725+
"type": "username_and_password",
726+
"username": self.username,
727+
"password": self.password
728+
}
729+
return self
730+
731+
def construct_payload_configuration(
732+
self,
733+
*args,
734+
**kwargs
735+
):
736+
self.embedding_dependency = \
737+
self._settings.azure_openai.extract_embedding_dependency()
738+
739+
parameters = self.model_dump(exclude_none=True, by_alias=True)
740+
parameters.update(self._settings.search.model_dump(exclude_none=True, by_alias=True))
741+
742+
return {
743+
"type": self._type,
744+
"parameters": parameters
745+
}
746+
747+
662748
class _BaseSettings(BaseSettings):
663749
model_config = SettingsConfigDict(
664750
env_file=DOTENV_PATH,
@@ -729,15 +815,20 @@ def set_datasource_settings(self) -> Self:
729815
elif self.base_settings.datasource_type == "AzureSqlServer":
730816
self.datasource = _AzureSqlServerSettings(settings=self, _env_file=DOTENV_PATH)
731817
logging.debug("Using SQL Server")
818+
819+
elif self.base_settings.datasource_type == "MongoDB":
820+
self.datasource = _MongoDbSettings(settings=self, _env_file=DOTENV_PATH)
821+
logging.debug("Using Mongo DB")
732822

733823
else:
734824
self.datasource = None
735825
logging.warning("No datasource configuration found in the environment -- calls will be made to Azure OpenAI without grounding data.")
736826

737827
return self
738828

739-
except ValidationError:
829+
except ValidationError as e:
740830
logging.warning("No datasource configuration found in the environment -- calls will be made to Azure OpenAI without grounding data.")
831+
logging.warning(e.errors())
741832

742833

743834
app_settings = _AppSettings()

0 commit comments

Comments
 (0)