Commit 8c4d02d

v2.6
1 parent cb7b7a3 commit 8c4d02d

3 files changed: 21 additions and 8 deletions

src/User_Manual/config.yaml

Lines changed: 13 additions & 6 deletions
@@ -26,11 +26,18 @@ AVAILABLE_MODELS:
 - jinaai/jina-embeddings-v2-base-en
 - jinaai/jina-embeddings-v2-small-en
 COMPUTE_DEVICE: cpu
+Compute_Device:
+  available:
+  - cuda
+  - cpu
+  database_creation: cpu
+  database_query: cpu
 EMBEDDING_MODEL_NAME:
-chunk_overlap: 200
-chunk_size: 600
 database:
-  contexts: 15
+  chunk_overlap: 200
+  chunk_size: 750
+  contexts: 10
+  device: null
   similarity: 0.9
 embedding-models:
   bge:
@@ -41,7 +48,7 @@ embedding-models:
 server:
   api_key: ''
   connection_str: http://localhost:1234/v1
-  model_max_tokens: 512
+  model_max_tokens: -1
   model_temperature: 0.1
   prefix: '[INST]'
   suffix: '[/INST]'
@@ -53,5 +60,5 @@ styles:
   text: 'background-color: #092327; color: light gray; font: 12pt "Segoe UI Historic";'
 transcriber:
   device: cpu
-  model: base.en
-  quant: float32
+  model: small.en
+  quant: int8
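The new `Compute_Device` block separates the device used for database creation from the one used for querying. As a rough sketch of how such a block could be consumed (the `resolve_device` helper and the in-memory `CONFIG` dict are hypothetical and only mirror the keys shown in the diff; this is not the project's actual loading code):

```python
# Hypothetical in-memory mirror of the Compute_Device block from config.yaml.
CONFIG = {
    "Compute_Device": {
        "available": ["cuda", "cpu"],
        "database_creation": "cpu",
        "database_query": "cpu",
    }
}


def resolve_device(config: dict, task: str) -> str:
    """Return the device configured for `task`, falling back to "cpu".

    `task` is expected to be "database_creation" or "database_query".
    A device that is not listed under "available" is ignored.
    """
    section = config.get("Compute_Device", {})
    requested = section.get(task, "cpu")
    available = section.get("available", ["cpu"])
    return requested if requested in available else "cpu"


if __name__ == "__main__":
    print(resolve_device(CONFIG, "database_creation"))
    print(resolve_device(CONFIG, "database_query"))
```

Falling back to `cpu` when the requested device is not listed is a design assumption made here for safety; the application may handle that case differently.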

src/User_Manual/settings.html

Lines changed: 2 additions & 1 deletion
@@ -116,7 +116,8 @@ <h3>Prefix and Suffix</h3>
 <h2>Embedding Models Settings</h2>
 <p>These settings apply only if you're using a model named <code>BGE</code> or <code>Instructor</code>. Tread carefully
 when adjusting these settings because it could hinder performance. You can search online on how to adjust these depending
-on the type of text being entered into the vector database.</p>
+on the type of text being entered into the vector database. Also, if you change the chunk size or overlap settings you must
+recreate the vector database for the changes to take effect.</p>

 <p>All other types of embedding models that my program uses don't require specialized settings.</p>
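The added sentence says the vector database must be recreated after changing the chunk size or overlap. A minimal sketch of why, assuming a naive character-based splitter (the `split_text` function is hypothetical and is not the splitter the program actually uses): the chunk boundaries, and therefore the stored embeddings, change with these two settings.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters. This is an
    illustrative character splitter only, not the project's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


if __name__ == "__main__":
    document = "".join(str(i % 10) for i in range(2000))
    old_chunks = split_text(document, chunk_size=600, chunk_overlap=200)
    new_chunks = split_text(document, chunk_size=750, chunk_overlap=200)
    # Different settings give different chunks, so embeddings stored
    # under the old settings no longer match the new chunking.
    print(len(old_chunks), len(new_chunks))
```

Because the database stores one embedding per chunk, vectors computed under the old settings describe text spans that no longer exist under the new ones.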

src/User_Manual/tips.html

Lines changed: 6 additions & 1 deletion
@@ -102,6 +102,12 @@ <h2 style="color: #f0f0f0;" align="center">Manage VRAM</h2>
 database and a Ctranslate2 Whisper model for the transcription functionality. Therefore, it is important manage your
 memory to achieve the best performance.</p>

+<p>If you are tight on VRAM, do not have LM Studio running while the vector database is being created. I highly
+recommend that you choose "cpu" for when querying the database and only use "cuda" or "mps" when creating the database.
+Creating the database takes 5,000 times more compute power and it's worth using gpu-acceleration (and hence VRAM).
+However, merely querying the database can be done on any CPU easily and when you select "CPU" you're using system RAM
+and not VRAM. The option to choose difference compute devices for database creation versus querying is a recent addition.
+
 <p>To save VRAM, unplug any secondary monitors from the GPU and plug them into graphics ports (e.g. HDMI or DisplayPort)
 coming directly from your motherboard. This will prevent these monitors from using your GPU. You will most likely want
 to keep your main monitor plugged in (e.g. for gaming).</p>
@@ -121,7 +127,6 @@ <h2 style="color: #f0f0f0;" align="center">Select an Appropriate Embedding Model
 <p><b>https://instructor-embedding.github.io/</b></p>
 <p><b>https://github.com/FlagOpen/FlagEmbedding</b></p>
 <p><b>https://huggingface.co/thenlper/gte-large</b></p>
-<p><b>https://huggingface.co/intfloat/multilingual-e5-large</b></p>
 <p><b>https://huggingface.co/jinaai/jina-embedding-l-en-v1</b></p>

 <h2 style="color: #f0f0f0;" align="center">Select the Appropriate Model Within LM Studio</h2>
