Releases: BBC-Esq/VectorDB-Plugin
v3.0.6 - SHOWTIME! (fixing again)
Turns out my feeble attempt to use unstructured to process both docx and doc files requires LibreOffice, which I don't want to require as a dependency.
Reverting to the former loader and removing support for .doc files.
v3.05 - SHOWTIME! (doc/docx)
Fixed loading of docx files...and added loading of .doc files as well. Switched to "Unstructured" doc/docx loader instead.
v3.04 - SHOWTIME! (new vision)
Added Salesforce vision model for quick and dirty image summaries for the vector database.
Created a mechanism to automatically back up the DB when it's created and restore it each time the program starts - obviating the issue of it sometimes not being persisted beyond one session.
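A minimal sketch of what such a backup-and-restore step could look like, assuming the database lives in a single folder. The function names and folder layout here are hypothetical and not taken from the project's scripts.

```python
import shutil
from pathlib import Path


def backup_db(db_dir: Path, backup_dir: Path) -> None:
    """Copy the freshly created database folder to a backup location."""
    if backup_dir.exists():
        shutil.rmtree(backup_dir)  # drop the stale backup first
    shutil.copytree(db_dir, backup_dir)


def restore_db(db_dir: Path, backup_dir: Path) -> bool:
    """Replace the working database with the backup, if one exists."""
    if not backup_dir.exists():
        return False  # nothing to restore (e.g. first run)
    if db_dir.exists():
        shutil.rmtree(db_dir)
    shutil.copytree(backup_dir, db_dir)
    return True
```

In this sketch, backup_db would be called right after the database is created and restore_db once at program start.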
Minor refactoring.
v3.0.3 - SHOWTIME! (bug fixes)
Fixed replace_pdf.py and updated instructions - make sure to read the updated instructions for Linux and Mac.
Disabled vision models for MacOS until I can figure out how to run them at a basic level on MacOS.
Several other bug fixes that most people wouldn't encounter...
Fixed a bug that threw an error if a user tried to add only images to the vector database without first adding a non-image file. In short, the Docs_for_DB folder wasn't being created, which caused the error. Now the Docs_for_DB and Images_for_DB folders are automatically created when the program starts if they don't already exist.
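A minimal sketch of the startup check described above. The folder names come from these notes; the function name and the base_dir parameter are hypothetical.

```python
from pathlib import Path


def ensure_input_folders(base_dir: Path) -> list[Path]:
    """Create the document and image folders under base_dir if missing."""
    folders = [base_dir / "Docs_for_DB", base_dir / "Images_for_DB"]
    for folder in folders:
        # exist_ok=True makes this safe to call on every startup
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```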
V3.0.2 - SHOWTIME!
ERRATA:
Re-release. LINUX AND MACOS users need to run python replace_pdf.py after all of the other installation instructions. See the updated instructions on the github readme.
CREDIT goes to cddigi for alerting me to the situation and finding a solution.
There is one problem reported by MacOS users thus far, possibly regarding the vision_llava_module.py and loader_vision_llava.py scripts. I don't have the hardware to test anything other than Windows + Nvidia, so please contact me if you have a genuine interest in this project and I'll help you troubleshoot and/or revise my scripts for Linux/Mac.
Also, if anyone gets an error similar to this, let me know ASAP please, but all issues regarding this specifically should be fixed now.
V3.0.1 - SHOWTIME!
ERRATA re v3.0.0 release. This now includes revised scripts to ensure that my custom pdf loader named pdf.py correctly replaces Langchain's original pdf.py. The change was necessary due to differing directory structures on Windows, Linux, and MacOS.
Contact me ASAP if it still doesn't work.
Requires LINUX AND MACOS users to run python replace_pdf.py after all of the other installation instructions. See the updated instructions on the github readme.
CREDIT goes to cddigi for alerting me to the situation and finding a solution. It would have been his pull request that solved it, but I rushed to get a fix out before reviewing his.
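A rough sketch of one OS-independent way to do this kind of file replacement: ask Python where a package is installed instead of hard-coding a directory. Both helper names are hypothetical, and this is not the actual contents of replace_pdf.py.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_dir(package_name: str) -> Path:
    """Return the install directory of a regular package on any OS."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{package_name} is not installed")
    # spec.origin points at the package's __init__.py
    return Path(spec.origin).parent


def replace_file(custom_file: Path, target_file: Path) -> Path:
    """Overwrite target_file with custom_file, keeping a one-time .bak copy."""
    backup = target_file.with_name(target_file.name + ".bak")
    if target_file.exists() and not backup.exists():
        shutil.copy2(target_file, backup)  # preserve the original once
    shutil.copy2(custom_file, target_file)
    return target_file
```

The location of pdf.py inside the Langchain package is not stated in these notes, so the caller would have to supply that relative path when building target_file.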
Added vision models that can put documents into the vector database.
Debugging regarding errors related to pynvml library on mac or other computers that were receiving errors relating to setting the compute device for transcription (when no nvidia gpu available).
Refactored multiple scripts.
Modified choose documents to allow selecting multiple image files as well.
Added testing for vision models.
Added numerous helpful messages.
If anyone gets an error similar to this:
Try replacing line 19 of gui_tabs_tools_transcribe with these two lines instead:
gpu_brand = compute_device_config.get('gpu_brand', '')
self.gpu_brand = gpu_brand.lower() if gpu_brand is not None else ''
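Why the two lines help, as a small standalone sketch: it assumes the error came from calling .lower() on a gpu_brand value stored as None, which these notes do not confirm. The function name is hypothetical.

```python
def normalize_gpu_brand(compute_device_config: dict) -> str:
    """Return the GPU brand in lowercase, or '' when it is missing or None."""
    gpu_brand = compute_device_config.get('gpu_brand', '')
    # .get() only falls back to '' when the key is absent; a key that is
    # present with the value None still returns None, hence the extra check.
    return gpu_brand.lower() if gpu_brand is not None else ''
```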
I think a person downloaded the wrong file or was using an old one, but if another person gets this error I may have to re-release...again...
v3.0.0 - SHOWTIME!!
WINDOWS, Linux, and MacOS users: take the extra initialize.py, replace_pdf.py, and setup.py files below and replace the ones in the src folder.
WINDOWS users: ONLY RUN the setup.py below once you've made the necessary replacements.
LINUX and MacOS users: FIRST RUN replace_pdf.py before trying to start the program. The updated instructions are also on the github readme.
Added vision models that can put documents into the vector database.
Debugging regarding errors related to pynvml library on mac or other computers that were receiving errors relating to setting the compute device for transcription (when no nvidia gpu available).
Refactored multiple scripts.
Modified choose documents to allow selecting multiple image files as well.
Added testing for vision models.
Added numerous helpful messages.
V2.7.5 - BARK re-RELEASE
Re-releasing due to some errors...Here's the synopsis of this re-release as previously stated!
Perfected the Bark model with options to choose from.
Corrected view database docs folder to work with all OS.
Corrected some bugs, still hunting a few others that someone added...
Major update to loaders to specify parameters.
Updated the readme regarding Bark and in general.
v2.7.3 - WINDOWS Bark Fun-ness!!
Marking this as Windows-only since I haven't vetted it for Linux/MacOS...So can't guarantee it'll even work on those systems. This release is marked as "funness" because it's not strictly necessary for using the vector database. If you want something more stable, stick to v2.7.2 until a fully-compatible release is done.
The main addition is using Bark models to speak the response from the LLM. You can only use this after a response has been received from the LLM - i.e. not when using the "test embeddings" checkbox.
Still debugging...so no GUI settings to change. SEE the inline instructions within the bark_module.py script itself on how to customize the size of the Bark models used, float32/float16, etc.
v2.7.2 - 80x fast PDF loading
Linux/Mac users MUST read the updated instructions on github for PDF loading to work at all.
Huge shoutout to "the beef" for implementing streaming.
PDFs now load 80x faster by using the PyMuPDF loader instead of pdfminer.six. HOWEVER, this requires a custom script of mine modifying Langchain's source code.
-Windows users: This is done automatically when running setup.py.
-Mac/Linux users: READ THE UPDATED INSTRUCTIONS ON HOW TO MOVE THE APPROPRIATE FILE.
"The beef" also added a button to view the docs in the database folder so you can more easily add/remove them. Remember, you must recreate the database afterwards.
ADDING AFTER RELEASE:
(1) new server_connector.py. Use this to replace the one in the official release in order to get citations along with the response from the LLM.
(2) adding corrected metrics_bar.py to fix a bug related to MacOS not working because the program tries to use an NVIDIA-specific library.
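A hedged sketch of the general pattern for this kind of fix: only use the NVIDIA library when it can actually be imported and initialised. This assumes the library in question is pynvml (mentioned elsewhere in these notes); the function names are hypothetical and this is not the actual metrics_bar.py code.

```python
def load_nvml():
    """Return the pynvml module if it is importable and initialises, else None."""
    try:
        import pynvml
        pynvml.nvmlInit()
    except Exception:
        # ImportError when the library is absent; an NVML error when no
        # NVIDIA driver or GPU is present (typical on MacOS).
        return None
    return pynvml


def gpu_metrics_available() -> bool:
    """True only when NVIDIA metrics can actually be queried."""
    return load_nvml() is not None
```

A metrics display built on this would skip the GPU readouts whenever gpu_metrics_available() returns False instead of crashing at import time.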