PrivateGPT is a robust tool for querying documents locally, with no internet connection required. Whether you're a seasoned researcher, a developer, or simply eager to explore document querying, PrivateGPT offers an efficient and secure way to meet your needs. To help you get started, we've created an accompanying YouTube video that provides a detailed, step-by-step installation demonstration!
Built on LangChain, PrivateGPT supports both GPT4All and llama.cpp models, making it a versatile choice for a wide range of applications.
PrivateGPT currently supports running on CPU only.
Clone the repository from https://github.com/imartinez/privateGPT. After cloning, go into the privateGPT folder in the terminal or open it in your code editor.
Run the following Python command to create a virtual environment:
python -m venv venv
Activating virtual env on Mac or Linux
source venv/bin/activate
Activating virtual env on Windows
venv\Scripts\activate
To deactivate the virtual environment, use the deactivate command:
deactivate
Install the dependencies for PrivateGPT. The required packages are listed in requirements.txt. If you want to install multiple packages at once or pin package versions for future use, you can keep the package names (with versions) in a text file; the file can have any name.
pip install -r requirements.txt
PrivateGPT supports GPT4All and LLaMA models. You can download GPT4All models from the links below. To download LLaMA models, go to https://huggingface.co/meta-llama and choose a model based on your computer's resources.
Place the downloaded model in the models folder inside PrivateGPT.
GPT4All models:
https://huggingface.co/nomic-ai/gpt4all-falcon-ggml/resolve/main/ggml-model-gpt4all-falcon-q4_0.bin
https://gpt4all.io/models/wizardlm-13b-v1.1-superhot-8k.ggmlv3.q4_0.bin
LLaMA models:
Llama-2-7b-hf
Llama-2-7b-chat
Create a new .env file and copy the values from example.env.
Config to run GPT4All
PERSIST_DIRECTORY=db
MODEL_TYPE=GPT4All
MODEL_PATH=models/wizardlm-13b-v1.1-superhot-8k.ggmlv3.q4_0.bin
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
MODEL_N_CTX=1000
MODEL_N_BATCH=8
TARGET_SOURCE_CHUNKS=4
Config to run LlamaCpp
PERSIST_DIRECTORY=db
MODEL_TYPE=LlamaCpp
MODEL_PATH=models/Llama-2-7b-hf.bin
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
MODEL_N_CTX=1000
MODEL_N_BATCH=8
TARGET_SOURCE_CHUNKS=4
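PrivateGPT reads these settings as environment variables at startup (it uses the python-dotenv package for this). As a minimal, illustrative sketch of what that loading step amounts to, here is a pure-Python parser for simple KEY=VALUE .env files, applied to the GPT4All config above:

```python
import os
import tempfile

def load_env_file(path):
    """Parse simple KEY=VALUE lines from a .env-style file into a dict.
    Illustrative only; PrivateGPT itself relies on python-dotenv."""
    settings = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings

# The GPT4All configuration from above, written to a temporary file.
env_text = """PERSIST_DIRECTORY=db
MODEL_TYPE=GPT4All
MODEL_PATH=models/wizardlm-13b-v1.1-superhot-8k.ggmlv3.q4_0.bin
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
MODEL_N_CTX=1000
MODEL_N_BATCH=8
TARGET_SOURCE_CHUNKS=4
"""

with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write(env_text)
    env_path = fh.name

config = load_env_file(env_path)
os.unlink(env_path)
print(config["MODEL_TYPE"], config["MODEL_N_CTX"])
```

All values arrive as strings, so numeric settings such as MODEL_N_CTX need an explicit int() conversion before use.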
source_documents is where you put all the files you want PrivateGPT to use. It comes with a default file; remove it and move all your files into the source_documents directory.
The supported source document extensions are:
.csv: CSV
.docx: Word Document
.doc: Word Document
.enex: EverNote
.eml: Email
.epub: EPub
.html: HTML File
.md: Markdown
.msg: Outlook Message
.odt: Open Document Text
.pdf: Portable Document Format (PDF)
.pptx: PowerPoint Document
.ppt: PowerPoint Document
.txt: Text file (UTF-8)
Run the command to ingest all the data; this will create a local vector database:
python ingest.py
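Conceptually, ingestion splits each document into overlapping text chunks, embeds them, and stores the vectors in the db directory. A simplified sketch of the chunking step is below; the chunk_size and overlap values are illustrative, not PrivateGPT's actual defaults (its real splitter comes from LangChain):

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks so that sentences spanning a
    chunk boundary still appear whole in at least one chunk.
    Illustrative values; not PrivateGPT's actual configuration."""
    chunks = []
    step = chunk_size - overlap  # advance less than chunk_size to overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "PrivateGPT lets you query your documents locally. " * 40
chunks = split_into_chunks(doc)
print(len(chunks), len(chunks[0]))
```

Each chunk would then be embedded (with the model named by EMBEDDINGS_MODEL_NAME) and written to the vector store under PERSIST_DIRECTORY.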
Run the command to ask questions:
python privateGPT.py
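When you ask a question, the script embeds it, retrieves the TARGET_SOURCE_CHUNKS most similar chunks from the vector store, and passes them to the model as context. Here is a toy illustration of that retrieval step, using word overlap as a stand-in for real embedding similarity (the chunk texts and scoring function are invented for the example):

```python
def similarity(query, chunk):
    """Toy stand-in for embedding similarity: fraction of query words
    that also appear in the chunk. PrivateGPT really compares embedding
    vectors in its Chroma store."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & c_words) / len(q_words)

def retrieve(query, chunks, target_source_chunks=4):
    """Return the top-k chunks most similar to the query,
    where k corresponds to the TARGET_SOURCE_CHUNKS setting."""
    ranked = sorted(chunks, key=lambda c: similarity(query, c), reverse=True)
    return ranked[:target_source_chunks]

# Invented example chunks standing in for ingested document text.
chunks = [
    "PrivateGPT runs entirely on your CPU",
    "LLaMA models are downloaded from Hugging Face",
    "The vector database is stored in the db directory",
    "GPT4All models work without a GPU",
    "Documents go in the source_documents folder",
]
top = retrieve("where are LLaMA models downloaded from", chunks, target_source_chunks=2)
print(top)
```

The retrieved chunks, together with your question, form the prompt sent to the local GPT4All or llama.cpp model, which is why answers cite passages from your own documents.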