4 files changed, +37 −7 lines changed

@@ -25,10 +25,13 @@ AVAILABLE_MODELS:
 - jinaai/jina-embedding-t-en-v1
 - jinaai/jina-embeddings-v2-base-en
 - jinaai/jina-embeddings-v2-small-en
-COMPUTE_DEVICE: cuda
-EMBEDDING_MODEL_NAME: null
-chunk_overlap: 250
-chunk_size: 1500
+COMPUTE_DEVICE: cpu
+EMBEDDING_MODEL_NAME:
+chunk_overlap: 200
+chunk_size: 600
+database:
+  contexts: 15
+  similarity: 0.9
 embedding-models:
   bge:
     query_instruction: 'Represent this sentence for searching relevant passages:'
@@ -38,7 +41,7 @@ embedding-models:
 server:
   api_key: ''
   connection_str: http://localhost:1234/v1
-  model_max_tokens: -1
+  model_max_tokens: 512
   model_temperature: 0.1
   prefix: ' [INST]'
   suffix: ' [/INST]'
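The change of `model_max_tokens` from `-1` (no limit) to `512` caps how many tokens the LLM may generate per reply. A rough budget check like the one below illustrates why that cap matters once several retrieved chunks are attached to a question; the ~4-characters-per-token heuristic, the window size, and all names here are assumptions for illustration, not the project's actual code:

```python
def fits_context_window(question: str, contexts: list[str],
                        window_tokens: int = 4096,
                        max_new_tokens: int = 512) -> bool:
    """Rough check that prompt + reply fit the model's window,
    estimating ~4 characters per token."""
    prompt_chars = len(question) + sum(len(c) for c in contexts)
    est_prompt_tokens = prompt_chars // 4
    return est_prompt_tokens + max_new_tokens <= window_tokens

# Five 600-character chunks plus a short question fit comfortably:
print(fits_context_window("short question", ["x" * 600] * 5))  # True
```

With thirty or more such chunks the same check fails, which is exactly the "exceeds its maximum context limit" error the user guide warns about below.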
@@ -48,3 +51,7 @@ styles:
   frame: 'background-color: #161b22;'
   input: 'background-color: #2e333b; color: light gray; font: 13pt "Segoe UI Historic";'
   text: 'background-color: #092327; color: light gray; font: 12pt "Segoe UI Historic";'
+transcriber:
+  device: cpu
+  model: base.en
+  quant: float32
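The new defaults lower `chunk_size` to 600 characters and `chunk_overlap` to 200. A minimal sketch of what those two numbers mean for character-based chunking (a hypothetical helper, not the project's actual splitter):

```python
def chunk_text(text: str, chunk_size: int = 600, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks, where each chunk
    starts chunk_size - chunk_overlap characters after the previous one,
    so consecutive chunks share chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = chunk_text("x" * 1400)
print(len(chunks))     # 3
print(len(chunks[0]))  # 600
```

Each 600-character chunk repeats the last 200 characters of its predecessor, so a sentence cut at a chunk boundary still appears whole in the next chunk.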
@@ -156,6 +156,16 @@ <h3>Chunk Overlap</h3>
 it will automatically include, for example, the last 250 characters of the prior chunk. Feel free to experiment
 with this setting as well to get the best results!</p>

+<h2>Database Settings</h2>
+<p>The <code>Similarity</code> setting determines how similar to your question the results from the database must be in
+order to be sent to the LLM as "context." The closer the value is to <code>1</code>, the more similar a result must be,
+with a value of <code>1</code> meaning a verbatim match to your query. It's generally advised to leave this setting alone
+unless you notice that you're not getting a sufficient number of contexts.</p>
+
+<p>The <code>Contexts</code> setting is more fun to play with. Here you can control the number of chunks that will be
+forwarded to the LLM along with your question, for a response. HOWEVER, make sure to read my instructions above about how
+to ensure that the LLM does not exceed its maximum context limit; otherwise, it'll give an error.</p>
+
 <h2>Break in Case of Emergency</h2>
 <p>All of the settings are kept in a <code>config.yaml</code> file. If you accidentally change a setting you don't like or
 it's deleted or corrupted somehow, inside the "User Guide" folder I put a backup of the original file.</p>
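The two new database settings work together: results scoring below the `similarity` threshold are dropped, and at most `contexts` of the survivors are forwarded to the LLM. A minimal sketch of that filtering logic (the scoring function and all names are illustrative assumptions, not the project's actual retrieval code):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_contexts(query_vec, chunks, similarity=0.9, contexts=15):
    """Keep chunks at or above the threshold, best first, capped at `contexts`."""
    scored = sorted(((cosine_similarity(query_vec, vec), text) for vec, text in chunks),
                    reverse=True)
    return [text for score, text in scored if score >= similarity][:contexts]
```

This is why lowering `similarity` yields more (but looser) contexts, while raising `contexts` only helps if enough chunks clear the threshold in the first place.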
@@ -164,10 +164,15 @@ <h2 style="color: #f0f0f0;" align="center">Ask the Right Questions</h2>
 "What is the statute of limitations for defamation?" versus "What is the statute of limitations for a defamation
 action if the allegedly defamatory statement is in writing as opposed to verbal?" Experiment with how specific you are.</p>

-<p>Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a
+<p>My previous advice was not to ask multiple questions, but now that I've added an option to increase the number of
+"contexts" sent from the database to the LLM, that advice is less stringent. I now encourage you to ask longer-winded
+questions and even general descriptions of the kinds of information you're looking for (not strictly a question, you see).
+For reference, here are my prior instructions:</p>
+
+<p><i>Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a
 defamation action?" AND "Can the statute of limitations be tolled under certain circumstances?" at the same time. Instead,
 reformulate your question into something like: "What is the statute of limitations for defamation, and can it be tolled
-under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.</p>
+under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.</i></p>

 <h2 style="color: #f0f0f0;" align="center">Ensure Sufficient Context Length for the LLM</h2>
@@ -103,6 +103,14 @@ <h1>Whisper Quants</h1>
 <main>

+<h2>As of Version 2.5</h2>
+
+<p>ALL transcriber settings have been moved to the GUI so they can be changed dynamically and easily. Therefore, the
+instructions below about modifying scripts to change them no longer apply. ALSO, there is no need to worry about which
+quants are available on your CPU/GPU, because the program automatically detects compatible quants and only displays
+those! I'm leaving the instructions below unchanged, however, in order to get this release out; you can still reference
+them to understand what the different settings represent.</p>
+
 <h2>Changing Model Size and Quantization</h2>

 <p>The <code>base.en</code> model in <code>float32</code> format is used by default. To use a different model size,
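The "only display compatible quants" behavior described above can be pictured as filtering a fixed dropdown list against whatever the hardware reports as supported. In practice a library query such as CTranslate2's `get_supported_compute_types` can provide the supported set; the quant list and sets below are illustrative assumptions, not the program's actual detection code:

```python
# Display order for a hypothetical transcriber quant dropdown.
ALL_QUANTS = ["float32", "float16", "bfloat16", "int8_float16", "int8"]

def compatible_quants(supported):
    """Return only the quants the device reports as supported,
    preserving the dropdown's display order."""
    return [q for q in ALL_QUANTS if q in supported]

# e.g. a CPU that supports only full precision and 8-bit:
print(compatible_quants({"float32", "int8"}))  # ['float32', 'int8']
```

A GPU reporting `float16` support would additionally surface the half-precision options, which is why the visible choices differ per machine.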