
Commit e2d4f64 (v2.0.3)
Parent: 2aca9cf

1 file changed: 4 additions, 4 deletions

src/User_Manual/vision.html

Lines changed: 4 additions & 4 deletions
@@ -113,7 +113,7 @@ <h2 style="color: #f0f0f0;" align="left">Which Vision Models Are Available?</h2>
 <p><code>llava</code> models were trailblazers in what they did and this program uses both the 7b and 13b sizes.
 <code>llava</code> models are based on the <code>llama2</code> architecture. <code>bakllava</code> is similar to
 <code>llava</code> except that it's architecture is based on <code>mistral</code> and only comes in the 7b variety.
-<code>cogvlm</code> has <u>18b parameters</u> but is my personal favorite because it produces the bset results by far. Its
+<code>cogvlm</code> has <u>18b parameters</u> but is my personal favorite because it produces the best results by far. Its
 accuracy is over 90% in the statements its summaries I've found whereas <code>bakllava</code> is only about 70% and
 <code>llava</code> is slightly lower than that (regardless of whether you use the 7b or 13b sizes).</p>

@@ -149,7 +149,7 @@ <h2 style="color: #f0f0f0;" align="center">How do I use the Vision Model?</h2>
 
 <p>The "loading" process takes very little time for documents but a relatively long time for images. "Loading" images involves
 creating the summaries for each image using the selected vision model. Make sure and test your vision model settings within
-the Tools Tab before committing to processing, for example, 100 images.</p>
+the Tools Tab before committing to processing 1000 images, for example.</p>
 
 <p>After both documents and images are "loaded" they are added to the vectorstore just the same as prior release of this
 program.</p>
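
For context on this hunk: "loading" images boils down to generating one vision-model summary per image and then writing those summaries into the vectorstore alongside the document text, which is why images take so much longer to load than documents. A minimal sketch of that flow in Python, assuming a hypothetical summarize_image() stand-in for the selected model and illustrative chromadb usage (none of these names come from the program itself):

from pathlib import Path

import chromadb

def summarize_image(image_path: str) -> str:
    # Hypothetical stand-in: in the real program the selected vision
    # model (llava, bakllava, or cogvlm) produces this summary.
    return f"Placeholder summary for {image_path}"

client = chromadb.Client()
collection = client.get_or_create_collection("images")

# "Loading" images: one summary per image, then into the vectorstore.
for image in Path("my_images").glob("*.png"):
    summary = summarize_image(str(image))
    collection.add(
        ids=[image.name],
        documents=[summary],  # the summary text is what gets embedded
        metadatas=[{"source": str(image)}],
    )
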
@@ -160,7 +160,7 @@ <h2 style="color: #f0f0f0;" align="center">How do I use the Vision Model?</h2>
 model settings.</p>
 
 <p>PRO TIP: Make sure and set your chunking settings to larger than the summaries that are provided by the vision model.
-Doing this prevents the summary for a particular image from EVER being split. In short, each and every chunk consist of the
+Doing this prevents the summary for a particular image from EVER being split. In short, each and every chunk consists of the
 <u>entire summary</u> provided by the vision model! This tends to be 400-800 chunk size depending on the vision model
 settings.</p>
 
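The PRO TIP above reduces to one constraint: the chunk size must exceed the longest summary the vision model emits, so the splitter can never cut a summary in two. A sketch of enforcing that constraint, assuming LangChain's RecursiveCharacterTextSplitter (the manual only says "chunking settings", so the splitter choice here is an assumption):

from langchain.text_splitter import RecursiveCharacterTextSplitter

summaries = ["one vision-model summary per image ..."]  # placeholder data

# Summaries tend to run 400-800 characters, so size chunks above the
# longest one; every chunk then holds an entire summary, never a fragment.
longest = max(len(s) for s in summaries)
splitter = RecursiveCharacterTextSplitter(chunk_size=max(800, longest), chunk_overlap=0)

for summary in summaries:
    chunks = splitter.split_text(summary)
    assert len(chunks) == 1  # no summary was ever split
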
@@ -176,7 +176,7 @@ <h2 style="color: #f0f0f0;" align="center">Can I Change What the Vision Model Do
 </ol>
 
 <p>You can go into these scripts and modify the question sent to the vision model, but make sure the prompt format remains
-the same. In future releases I will likely add the functionality to experiement with different questions within the
+the same. In future releases I will likely add the functionality to experiment with different questions within the
 grapical user interface to achieve better results.</p>
 
 </main>
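
On the last hunk: editing the question while preserving the prompt format might look like the snippet below. The template shown is the common llava-style chat format, not necessarily what these scripts use, and the variable names are illustrative; check the actual script before changing anything:

# Only the question text changes; the surrounding prompt format stays fixed.
QUESTION = "Describe this image in detail, including any text it contains."

# Common llava-style chat template; verify it matches the script you edit.
PROMPT_TEMPLATE = "USER: <image>\n{question} ASSISTANT:"

prompt = PROMPT_TEMPLATE.format(question=QUESTION)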

0 commit comments
