3 files changed: +21 −8 lines changed

File 1 of 3 (YAML configuration file):

@@ -26,11 +26,18 @@ AVAILABLE_MODELS:
   - jinaai/jina-embeddings-v2-base-en
   - jinaai/jina-embeddings-v2-small-en
 COMPUTE_DEVICE: cpu
+Compute_Device:
+  available:
+  - cuda
+  - cpu
+  database_creation: cpu
+  database_query: cpu
 EMBEDDING_MODEL_NAME:
-chunk_overlap: 200
-chunk_size: 600
 database:
-  contexts: 15
+  chunk_overlap: 200
+  chunk_size: 750
+  contexts: 10
+  device: null
   similarity: 0.9
 embedding-models:
   bge:
@@ -41,7 +48,7 @@ embedding-models:
 server:
   api_key: ''
   connection_str: http://localhost:1234/v1
-  model_max_tokens: 512
+  model_max_tokens: -1
   model_temperature: 0.1
   prefix: '[INST]'
   suffix: '[/INST]'
@@ -53,5 +60,5 @@ styles:
   text: 'background-color: #092327; color: light gray; font: 12pt "Segoe UI Historic";'
 transcriber:
   device: cpu
-  model: base.en
-  quant: float32
+  model: small.en
+  quant: int8
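
For reference, the new top-level Compute_Device section would presumably read as follows once this change is merged. The key names and default values are taken from the diff above; the exact nesting is inferred, since the rendered diff does not preserve indentation.

Compute_Device:
  available:        # devices the application reports as usable
  - cuda
  - cpu
  database_creation: cpu
  database_query: cpu

Presumably, database_creation and database_query are each expected to be set to one of the entries listed under available.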

File 2 of 3 (HTML user-guide page):

@@ -116,7 +116,8 @@ <h3>Prefix and Suffix</h3>
 <h2>Embedding Models Settings</h2>
 <p>These settings apply only if you're using a model named <code>BGE</code> or <code>Instructor</code>. Tread carefully
 when adjusting these settings because it could hinder performance. You can search online on how to adjust these depending
-on the type of text being entered into the vector database.</p>
+on the type of text being entered into the vector database. Also, if you change the chunk size or overlap settings you must
+recreate the vector database for the changes to take effect.</p>

 <p>All other types of embedding models that my program uses don't require specialized settings.</p>

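
For readers of this help page, the chunk settings the added sentence refers to appear to live under the database section of the YAML configuration after this change, with the new defaults from the diff above. A minimal illustration (comments are editorial, not part of the file):

database:
  chunk_overlap: 200   # new default carried over from the old top-level key
  chunk_size: 750      # raised from 600 in this pull request

If either value is changed, the vector database has to be recreated before the new chunking takes effect.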

File 3 of 3 (HTML user-guide page):

@@ -102,6 +102,12 @@ <h2 style="color: #f0f0f0;" align="center">Manage VRAM</h2>
 database and a Ctranslate2 Whisper model for the transcription functionality. Therefore, it is important manage your
 memory to achieve the best performance.</p>

+<p>If you are tight on VRAM, do not have LM Studio running while the vector database is being created. I highly
+recommend that you choose "cpu" when querying the database and only use "cuda" or "mps" when creating the database.
+Creating the database takes 5,000 times more compute power, so it's worth using GPU acceleration (and hence VRAM).
+However, merely querying the database can easily be done on any CPU, and when you select "cpu" you're using system RAM
+rather than VRAM. The option to choose different compute devices for database creation versus querying is a recent addition.</p>
+
 <p>To save VRAM, unplug any secondary monitors from the GPU and plug them into graphics ports (e.g. HDMI or DisplayPort)
 coming directly from your motherboard. This will prevent these monitors from using your GPU. You will most likely want
 to keep your main monitor plugged in (e.g. for gaming).</p>
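
A minimal sketch of the setup the added paragraph recommends, using the Compute_Device keys introduced in this pull request: create the database with GPU acceleration, then query it on the CPU. This is an illustration rather than the shipped default (the config diff above defaults both keys to cpu), and note that while the paragraph mentions "mps", only cuda and cpu are listed under available in that diff.

Compute_Device:
  available:
  - cuda
  - cpu
  database_creation: cuda   # heavy one-time work: worth spending VRAM on GPU acceleration
  database_query: cpu       # lightweight lookups: run on the CPU and use system RAM

The idea, per the paragraph above, is that the embedding model only occupies VRAM while the database is being created, leaving the GPU free (e.g. for LM Studio) at query time.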
@@ -121,7 +127,6 @@ <h2 style="color: #f0f0f0;" align="center">Select an Appropriate Embedding Model
 <p><b>https://instructor-embedding.github.io/</b></p>
 <p><b>https://github.com/FlagOpen/FlagEmbedding</b></p>
 <p><b>https://huggingface.co/thenlper/gte-large</b></p>
-<p><b>https://huggingface.co/intfloat/multilingual-e5-large</b></p>
 <p><b>https://huggingface.co/jinaai/jina-embedding-l-en-v1</b></p>

 <h2 style="color: #f0f0f0;" align="center">Select the Appropriate Model Within LM Studio</h2>