
Commit 918b5cc

v2.6.3
1 parent b884519 commit 918b5cc

File tree

1 file changed: 2 additions, 43 deletions


src/User_Manual/tips.html

Lines changed: 2 additions & 43 deletions
@@ -117,34 +117,9 @@ <h2 style="color: #f0f0f0;" align="center">Manage VRAM</h2>
 and re-enable the graphics adapters on your motherboard. The name of the specific setting can vary so check your specific
 motherboard's documentation.</p>
 
-<h2 style="color: #f0f0f0;" align="center">Select an Appropriate Embedding Model</h2>
+<h2 style="color: #f0f0f0;" align="center">See The User Guide Embedding Models Button</h2>
 
-<p>Previously, the <code>instructor</code> models performed best IMHO. However, I have since corrected a mistake in
-my code and now recommend using the <code>BGE v1.5</code> models for 90% of use cases. They perform just as well and
-use less memory. Also, <code>all-mpnet-base-v2</code> is good and is low-memory. Here are some resources to read:</p>
-
-<p><b>https://www.sbert.net/docs/pretrained_models.html</b></p>
-<p><b>https://instructor-embedding.github.io/</b></p>
-<p><b>https://github.com/FlagOpen/FlagEmbedding</b></p>
-<p><b>https://huggingface.co/thenlper/gte-large</b></p>
-<p><b>https://huggingface.co/jinaai/jina-embedding-l-en-v1</b></p>
-
-<h2 style="color: #f0f0f0;" align="center">Select the Appropriate Model Within LM Studio</h2>
-
-<p>My program uses the embedding model to create the database and subsequently obtain "context" from it, which is
-then forwarded to the LLM within LM Studio along with your question, for an answer.
-The embedding model (not the LLM) is responsible for the quality of the context, and it is overwhelmingly this quality
-that determines the quality of the answer you get from LM Studio. Therefore, if VRAM is short, prioritize a higher
-quality embedding model over a larger LLM. Even a 7B model quantized to 8-bit can be overkill.</p>
-
-<p>This is the smallest model that still works decently IMHO, but my current overall favorite is Mistral:</p>
-
-<p><b>https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v0.3-GGUF</b></p>
-
-<p>There's one caveat: if the documents in your vector database are highly technical (e.g. medical or legal documents),
-a larger LLM might provide some benefit because of its increased vocabulary. Just experiment.</p>
-
-<p>Also, remember that my program only supports llama-based models that follow the llama "prompt format."</p>
+<p>See the new Embedding Models portion of the User Guide.</p>
 
 <h2 style="color: #f0f0f0;" align="center">Select the Appropriate Transcription Model and Quantization</h2>
 
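The advice removed above turns on how the embedding model retrieves "context": stored chunk vectors are ranked by similarity to the query vector, and the best-matching text is what gets forwarded to the LLM. A minimal sketch of that ranking step, using cosine similarity over hand-written toy vectors and hypothetical chunk text (a real run would get these vectors from an embedding model such as BGE v1.5, not from literals):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings": in a real pipeline these come from the embedding
# model (e.g. a BGE v1.5 model), not hand-written numbers.
chunks = {
    "Statutes of limitations vary by claim type.": [0.9, 0.1, 0.0],
    "GPU VRAM is shared between the model and the KV cache.": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedding of the user's question

# Rank chunks by similarity and keep the best one as "context".
best = max(chunks, key=lambda text: cosine(query_vec, chunks[text]))
```

This is why the removed text says the embedding model, not the LLM, dominates answer quality: if the ranking step surfaces the wrong chunk, no LLM can recover the right answer from it.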
@@ -163,22 +138,6 @@ <h2 style="color: #f0f0f0;" align="center">Load LM Studio After Creating Databas
 querying it; therefore, don't load a model into LM Studio until after creating the database.
 This will reduce the chance that you run out of VRAM when creating the database.</p>
 
-<h2 style="color: #f0f0f0;" align="center">Ask the Right Questions</h2>
-
-<p>Modify your question if you don't get a good answer. Sometimes there's a big difference between
-"What is the statute of limitations for defamation?" versus "What is the statute of limitations for a defamation
-action if the allegedly defamatory statement is in writing as opposed to verbal?" Experiment with how specific you are.</p>
-
-<p>My previous advice was to not ask multiple questions, but now that I've added an option to increase the number of
-"contexts" from the database to the LLM, this is less stringent. I now encourage you to ask longer-winded questions and even
-general descriptions of the types of information you're looking for (not strictly a question, you see). For reference, here
-are my prior instructions:</p>
-
-<p><i>Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a
-defamation action?" AND "Can the statute of limitations be tolled under certain circumstances?" at the same time. Instead,
-reformulate your question into something like: "What is the statute of limitations for a defamation and can it be tolled
-under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.</i></p>
-
 <h2 style="color: #f0f0f0;" align="center">Ensure Sufficient Context Length for the LLM</h2>
 
 <p>Rarely, the server log within LM Studio might give you an error stating that the context is too long. Increase the maximum
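The context-length errors mentioned in the last hunk arise because the retrieved context chunks and the question travel together in one request, so raising the number of "contexts" grows the prompt the LLM must fit. A sketch of assembling such a request for a local OpenAI-compatible endpoint (LM Studio serves one at http://localhost:1234/v1 by default; the exact message layout here is an illustrative assumption, not the program's actual prompt format):

```python
def build_payload(contexts, question, max_tokens=512):
    # Combine retrieved context chunks with the user's question into a
    # single chat-completion request body. Every extra context chunk
    # lengthens the prompt, which is what can exceed the LLM's
    # configured context length.
    context_block = "\n\n".join(contexts)
    return {
        "messages": [
            {"role": "system",
             "content": "Answer using only the context below.\n\n" + context_block},
            {"role": "user", "content": question},
        ],
        "max_tokens": max_tokens,
    }

payload = build_payload(
    ["Statutes of limitations vary by claim type."],
    "What is the statute of limitations for defamation?",
)
# Sending it requires a running local server, e.g.:
# import requests
# r = requests.post("http://localhost:1234/v1/chat/completions", json=payload)
```

If the server log reports the context is too long, either raise the model's maximum context length in LM Studio or retrieve fewer/shorter chunks.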

0 commit comments
