Elsai VectorDB v1.0.0
The Elsai VectorDB package provides interfaces to vector databases such as ChromaDB and Pinecone for storing and retrieving document embeddings. Version 1.0.0 includes enhanced retriever integration.
Prerequisites
- Python >= 3.9
- .env file with the appropriate API keys and configuration variables
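As a quick sanity check before using the package, the sketch below reports which of the environment variables named in the component sections of this page are missing. The helper name missing_settings is hypothetical and not part of elsai-vectordb. It also assumes the values from your .env file have already been loaded into the process environment, for example by a tool such as python-dotenv.

```python
import os

# Variable names taken from the component sections of this page.
EXPECTED_VARIABLES = ("CHROMA_PERSIST_DIRECTORY", "PINECONE_API_KEY")


def missing_settings(env=None):
    """Return the expected variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```

Only the variable for the backend you actually use needs to be set, and either value can instead be passed directly to the corresponding constructor.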
Installation
To install the elsai-vectordb package:
pip install --index-url https://elsai-core-package.optisolbusiness.com/root/elsai-vectordb/ elsai-vectordb==1.0.0
Components
1. ChromaVectorDb
ChromaVectorDb is a wrapper around ChromaDB to manage local document embeddings with persistent storage.
from elsai_vectordb.chromadb import ChromaVectorDb

# Create the client. persist_directory can also be set through the
# CHROMA_PERSIST_DIRECTORY environment variable.
chroma_client = ChromaVectorDb(persist_directory="your_persist_directory")

# Create the collection if it does not exist yet
chroma_client.create_if_not_exists(collection_name="your_collection_name")

# Add a document together with its embedding and metadata
document = {
    "id": "001",
    "embeddings": [0.1, 0.2, 0.7],  # Example embedding vector
    "page_content": "This is a sample document.",
    "metadatas": {"source": "example_source", "file_id": "doc1"}
}
chroma_client.add_document(document=document, collection_name="your_collection_name")

# Retrieve documents for a query embedding
documents = chroma_client.retrieve_document(
    collection_name="your_collection_name",
    embeddings=[0.1, 0.2, 0.7],  # Query embedding
    files_id=["doc1"],  # File IDs to search (see "file_id" in metadatas above)
    k=5
)

# Get the collection object
collection = chroma_client.get_collection(collection_name="your_collection_name")

# Fetch the stored chunks for the given file IDs
chunks = chroma_client.fetch_chunks(collection_name="your_collection_name", files_id=["doc1"])

# Use ChromaDB as a retriever for RAG workflows
retrievers = chroma_client.as_retriever(
    collection_name="your_collection_name",
    embedding_model="your_embedding_model_instance"  # Placeholder: replace with your embedding model instance
)

# Delete the collection once it is no longer needed
chroma_client.delete_collection(collection_name="your_collection_name")
Required Environment Variables:
- CHROMA_PERSIST_DIRECTORY – Path to the directory where ChromaDB persists data locally. Needed only if persist_directory is not passed to the constructor.
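The document dictionaries passed to add_document in the example above share one shape, and a small helper can keep that shape consistent when many chunks are indexed. This is only a sketch: build_document is a hypothetical name, not part of elsai-vectordb, and the key set is inferred solely from the example on this page, so the library may accept or require other fields.

```python
from typing import Any, Dict, List


def build_document(doc_id: str, embedding: List[float], text: str,
                   source: str, file_id: str) -> Dict[str, Any]:
    """Build a dict with the keys used in the add_document example."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    return {
        "id": doc_id,
        "embeddings": [float(x) for x in embedding],
        "page_content": text,
        "metadatas": {"source": source, "file_id": file_id},
    }


# Same values as the sample document in the example above
doc = build_document("001", [0.1, 0.2, 0.7], "This is a sample document.",
                     source="example_source", file_id="doc1")
```

The resulting dictionary can then be passed as document=doc to add_document, exactly as in the example above.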
2. PineconeVectorDb
PineconeVectorDb integrates with Pinecone to manage vector search using cloud-hosted infrastructure.
from elsai_vectordb.pinecone import PineconeVectorDb

# Create the client for a given index
pinecone_client = PineconeVectorDb(
    index_name="testingindex",
    pinecone_api_key="pinecone_api_key",  # Or set in environment variable PINECONE_API_KEY
    dimension=1536  # Example dimension size
)

# Add a document to a namespace
pinecone_client.add_document(
    document={
        "id": "001",
        "embeddings": [0.1, 0.2, 0.7],  # Replace with a 1536-dimension vector
        "page_content": "This is a sample document.",
        "metadatas": {"source": "example_source", "file_id": "doc1"}
    },
    namespace="namespacename"
)

# Retrieve documents for a question embedding
results = pinecone_client.retrieve_document(
    namespace="namespacename",
    question_embedding=[0.1, 0.2, 0.7],  # Replace with a 1536-dimension vector
    files_id=["doc1"],  # File IDs to search (see "file_id" in metadatas above)
    k=5
)

# Use Pinecone as a retriever for RAG workflows
retrievers = pinecone_client.as_retriever(
    namespace="namespacename",
    embedding_model="your_embedding_model_instance"  # Placeholder: replace with your embedding model instance
)
Required Environment Variables:
- PINECONE_API_KEY – API key used to authenticate with Pinecone. Needed only if pinecone_api_key is not passed to the constructor.
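The index in the example above is created with dimension=1536, while the sample vectors have three elements and are placeholders. A length check before calling add_document or retrieve_document can surface such mismatches early. The helper check_dimension below is hypothetical and not part of elsai-vectordb; it only compares lengths and makes no call to Pinecone.

```python
from typing import Sequence

INDEX_DIMENSION = 1536  # Matches the dimension passed to PineconeVectorDb above


def check_dimension(embedding: Sequence[float],
                    expected: int = INDEX_DIMENSION) -> None:
    """Raise ValueError if the embedding length differs from the index dimension."""
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} values, index expects {expected}"
        )


check_dimension([0.0] * 1536)  # Correct length: passes silently

try:
    check_dimension([0.1, 0.2, 0.7])  # The 3-element placeholder vector
except ValueError as exc:
    print(exc)
```

Running the check on both document embeddings and question embeddings keeps the two sides of the search consistent with the index.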
