Elsai VectorDB
Package: elsai-vectordb v2.2.1
Interfaces for ChromaDB, Pinecone, Weaviate, and FAISS — with consistent add/retrieve/delete APIs and direct retriever integration for RAG workflows.
Installation
```bash
pip install --extra-index-url https://core-packages.elsai.ai/root/elsai-vectordb/ elsai-vectordb==2.2.1
```
Requirements: Python >= 3.9
Available stores
| Class | Import path | Backend |
|---|---|---|
| ChromaVectorDb | elsai_vectordb.chromadb | ChromaDB (local persistent) |
| PineconeVectorDb | elsai_vectordb.pinecone | Pinecone (cloud) |
| WeaviateVectorDb | elsai_vectordb.weaviate | Weaviate (local or cloud) |
| FaissVectorDb | elsai_vectordb.FAISS.faiss_vectordb | FAISS (local in-memory/persistent) |
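
The table above can be turned into a small lookup so that the backend is chosen from configuration rather than hard-coded imports. This is a minimal sketch and not part of the package: `STORE_REGISTRY`, `resolve_store`, and `load_store_class` are hypothetical names, and the short keys (`"chroma"`, `"faiss"`, and so on) are an assumption. Only the import paths and class names come from the table.

```python
import importlib

# Import paths and class names copied from the table above.
# The short keys are illustrative and are not defined by elsai-vectordb.
STORE_REGISTRY = {
    "chroma": ("elsai_vectordb.chromadb", "ChromaVectorDb"),
    "pinecone": ("elsai_vectordb.pinecone", "PineconeVectorDb"),
    "weaviate": ("elsai_vectordb.weaviate", "WeaviateVectorDb"),
    "faiss": ("elsai_vectordb.FAISS.faiss_vectordb", "FaissVectorDb"),
}


def resolve_store(name):
    """Return (module_path, class_name) for a backend key, case-insensitively."""
    key = name.strip().lower()
    if key not in STORE_REGISTRY:
        known = ", ".join(sorted(STORE_REGISTRY))
        raise ValueError(f"Unknown store {name!r}; expected one of: {known}")
    return STORE_REGISTRY[key]


def load_store_class(name):
    """Import the backend module lazily and return its store class."""
    module_path, class_name = resolve_store(name)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Because the import happens inside `load_store_class`, only the backend that is actually requested has to be importable at runtime.
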
ChromaVectorDb
Local persistent vector store backed by ChromaDB.
```python
from elsai_vectordb.chromadb import ChromaVectorDb

db = ChromaVectorDb(persist_directory="./chroma_db")
db.create_if_not_exists(collection_name="my_docs")

# Add a document
db.add_document(
    document={
        "id": "001",
        "embeddings": [0.1, 0.2, 0.7],
        "page_content": "This is a sample document.",
        "metadatas": {"source": "report.pdf", "file_id": "doc1"},
    },
    collection_name="my_docs",
)

# Retrieve: simple
results = db.retrieve_document(
    collection_name="my_docs",
    embeddings=[0.1, 0.2, 0.7],
    k=5,
)

# Retrieve: with filter (v2.1.0+)
results = db.retrieve_document(
    collection_name="my_docs",
    embeddings=[0.1, 0.2, 0.7],
    where={"file_id": {"$eq": "doc1"}},
    k=5,
)

# Complex filter
results = db.retrieve_document(
    collection_name="my_docs",
    embeddings=[0.1, 0.2, 0.7],
    where={"$and": [{"user_id": {"$eq": "123"}}, {"file_id": {"$in": ["doc1"]}}]},
    k=5,
)

# Retrieve chunks by file ID
chunks = db.fetch_chunks(collection_name="my_docs", files_id="doc1")
```
Constructor parameters:
| Parameter | Description |
|---|---|
| persist_directory | Directory path where ChromaDB stores data on disk |
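
The filtered retrieval calls above pass operator dictionaries (`$eq`, `$in`, `$and`) through the `where` argument. A small helper can assemble that shape from plain Python values. This is a sketch under assumptions: `build_where` is a hypothetical name, not a function of the package, and it uses only the three operators that appear in the examples above.

```python
def build_where(equals=None, any_of=None):
    """Build a filter dict in the `$eq` / `$in` / `$and` shape used above.

    equals: mapping of metadata field -> required value
    any_of: mapping of metadata field -> list of accepted values
    """
    clauses = []
    for field, value in (equals or {}).items():
        clauses.append({field: {"$eq": value}})
    for field, values in (any_of or {}).items():
        clauses.append({field: {"$in": list(values)}})

    if not clauses:
        return None  # no filtering requested
    if len(clauses) == 1:
        return clauses[0]  # a single condition needs no `$and` wrapper
    return {"$and": clauses}


where = build_where(equals={"user_id": "123"}, any_of={"file_id": ["doc1"]})
print(where)
```

The printed dictionary has the same structure as the complex filter shown earlier. When the helper returns `None`, it is probably safest to omit the `where` argument entirely, since this page does not say how `retrieve_document` treats `where=None`.
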
CRUD operations (v1.1.0+)
```python
# Update
db.update_document(
    document={"id": "001", "embeddings": [...], "page_content": "Updated text.", "metadatas": {...}},
    collection_name="my_docs",
)

# Delete by ID
db.delete_document(ids=["001", "002"], collection_name="my_docs")

# Delete by filter
db.delete_document(where={"file_id": "doc1"}, collection_name="my_docs")

# List / delete collections
collections = db.list_collections()
db.delete_collection(collection_name="my_docs")
```
As retriever for RAG (v2.0.0+)
```python
retriever = db.as_retriever(
    collection_name="my_docs",
    embedding_model=your_embedding_model,
)
```
Environment variables: CHROMA_PERSIST_DIRECTORY
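
This page names `CHROMA_PERSIST_DIRECTORY` but does not describe how the package reads it. One way to make the precedence explicit in application code is sketched below. `resolve_persist_directory` is a hypothetical helper, and the order (explicit argument, then environment variable, then a default) is an assumption rather than documented behaviour.

```python
import os
from pathlib import Path


def resolve_persist_directory(explicit=None, default="./chroma_db"):
    """Pick the storage directory: explicit argument, then env var, then default."""
    chosen = explicit or os.environ.get("CHROMA_PERSIST_DIRECTORY") or default
    return Path(chosen).expanduser()
```

The result can then be passed on as `ChromaVectorDb(persist_directory=str(resolve_persist_directory()))`.
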
PineconeVectorDb
Cloud-hosted vector search via Pinecone.
```python
from elsai_vectordb.pinecone import PineconeVectorDb

db = PineconeVectorDb(
    index_name="my-index",
    pinecone_api_key="your_key",
    dimension=1536,
)

# Add a document
db.add_document(
    document={
        "id": "001",
        "embeddings": [0.1, 0.2, ...],  # must match dimension
        "page_content": "Sample text.",
        "metadatas": {"file_id": "doc1"},
    },
    namespace="my-namespace",
)

# Retrieve with filter (v2.1.0+)
results = db.retrieve_document(
    namespace="my-namespace",
    question_embedding=[0.1, 0.2, ...],
    filter={"file_id": {"$eq": "doc1"}},
    k=5,
)

# Update / delete
db.update_document(document={...}, namespace="my-namespace")
db.delete_document(ids=["001"], namespace="my-namespace")
db.delete_document(filter={"file_id": "doc1"}, namespace="my-namespace")

# Namespaces
namespaces = db.list_namespaces()
db.delete_namespace(namespace="my-namespace")
```
Constructor parameters:
| Parameter | Description |
|---|---|
| index_name | Pinecone index name |
| pinecone_api_key | Pinecone API key |
| dimension | Vector dimensionality; must match your embedding model (e.g. 1536 for text-embedding-ada-002) |
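
Since `dimension` must match the embedding model, it can help to check each vector before it is sent to the index. The sketch below builds a document in the dictionary shape used in the examples above and rejects vectors of the wrong length. `make_document` is a hypothetical helper, not part of elsai-vectordb, and the NaN/infinity check is an extra precaution that this page does not require.

```python
import math


def make_document(doc_id, embedding, text, metadata, dimension):
    """Build a document dict in the shape used above, checking the vector first."""
    vector = [float(x) for x in embedding]
    if len(vector) != dimension:
        raise ValueError(
            f"Embedding for {doc_id!r} has {len(vector)} values; index expects {dimension}"
        )
    if not all(math.isfinite(x) for x in vector):
        raise ValueError(f"Embedding for {doc_id!r} contains NaN or infinite values")
    return {
        "id": doc_id,
        "embeddings": vector,
        "page_content": text,
        "metadatas": dict(metadata),
    }
```

Failing locally keeps a mismatch between the embedding model and the index from surfacing later as a remote error.
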
As retriever for RAG
```python
retriever = db.as_retriever(
    namespace="my-namespace",
    embedding_model=your_embedding_model,
)
```
Environment variables: PINECONE_API_KEY
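
Reading the key from `PINECONE_API_KEY` keeps it out of source code, unlike the literal `"your_key"` in the example above. The following sketch fails fast when the variable is missing and masks the key for logging. `require_env` and `mask_secret` are hypothetical helpers, and whether the package also reads the variable on its own is not stated on this page.

```python
import os


def require_env(name):
    """Return a required environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask_secret(value, visible=4):
    """Hide all but the last few characters, for safe logging."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

The value returned by `require_env("PINECONE_API_KEY")` can be passed as the `pinecone_api_key` constructor argument.
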
WeaviateVectorDb
Vector store backed by Weaviate — supports both local and cloud deployments.
```python
from elsai_vectordb.weaviate import WeaviateVectorDb

# Local Weaviate instance
db = WeaviateVectorDb(
    connection_type="local",
    host="localhost",
    port=8080,
    collection_name="Documents",
    schema={
        "content": "text",
        "source": "text",
        "file_id": "text",
    },
)

# Cloud Weaviate instance
db = WeaviateVectorDb(
    connection_type="cloud",
    host="https://your-instance.weaviate.network",
    port=443,
    collection_name="Documents",
    schema={
        "content": "text",
        "source": "text",
    },
)

# Add a document with its embedding vector
db.add_context(
    data={"content": "Sample document text.", "source": "report.pdf"},
    vector=[0.1, 0.2, 0.7, ...],
)

# Semantic search by vector
results = db.get_context_by_vector(query_vector=[0.1, 0.2, 0.7, ...], limit=5)

# Retrieve with filter
results = db.get_last_n_chats_by_filter(filter={"file_id": "doc1"}, limit=10)

# Delete by filter
db.delete_chats_by_filter(filter={"file_id": "doc1"})

# Single record operations
record = db.get_object_by_uuid(uuid="your-uuid", include_vector=True)
db.update_object_by_uuid(uuid="your-uuid", data={"content": "Updated text."})
db.delete_object_by_uuid(uuid="your-uuid")

# Collection and connection management
db.delete_collection()
db.close()
```
Constructor parameters:
| Parameter | Description |
|---|---|
| connection_type | "local" for a self-hosted instance or "cloud" for Weaviate Cloud |
| host | Server address (hostname or full URL for cloud) |
| port | Connection port (typically 8080 for local, 443 for cloud) |
| collection_name | Name of the data collection to use |
| schema | Dictionary mapping field names to their types (e.g. {"content": "text"}) |
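
Because `schema` maps field names to types, the `data` dictionary passed to `add_context` can be checked against it before insertion. The sketch below is one way to do that. `check_against_schema` is a hypothetical helper, and it validates only the `"text"` type, the only one that appears in the examples above. Missing fields are not reported, since this page does not say whether every schema field is required.

```python
def check_against_schema(data, schema):
    """Return a list of problems found when comparing `data` to `schema`.

    Only the "text" type from the examples above is checked; other type
    names are accepted as-is because they are not documented on this page.
    """
    problems = []
    for field in data:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    for field, type_name in schema.items():
        if field in data and type_name == "text" and not isinstance(data[field], str):
            problems.append(
                f"field {field} should be text, got {type(data[field]).__name__}"
            )
    return problems
```

An empty list means the record uses only declared fields with string values where `"text"` is expected.
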
As retriever for RAG
```python
retriever = db.as_retriever()
```
Returns a NativeWeaviateRetriever compatible with RAG workflows.
FaissVectorDb
Local vector store backed by FAISS — supports flat and IVF index types with configurable distance metrics.
```python
from elsai_vectordb.FAISS.faiss_vectordb import FaissVectorDb

db = FaissVectorDb(
    persist_directory="./faiss_db",
    dimension=1536,
    index_type="flat",  # "flat" for exact search, "ivf" for approximate
    metric="cosine",  # "cosine" or "l2"
)
db.create_if_not_exists(collection_name="my_docs")

# Add a single document
db.add_document(
    document={
        "id": "001",
        "embeddings": [0.1, 0.2, ...],
        "page_content": "Sample document text.",
        "metadatas": {"file_id": "doc1"},
    },
    collection_name="my_docs",
)

# Bulk insertion
db.add_documents(
    batch=[
        {"id": "002", "embeddings": [...], "page_content": "Doc 2.", "metadatas": {"file_id": "doc2"}},
        {"id": "003", "embeddings": [...], "page_content": "Doc 3.", "metadatas": {"file_id": "doc3"}},
    ],
    collection_name="my_docs",
)

# Retrieve, with optional file filter
results = db.retrieve_document(
    collection_name="my_docs",
    embeddings=[0.1, 0.2, ...],
    files_id="doc1",  # optional: restrict results to a specific file
    k=5,
)

# Fetch all chunks for a file
chunks = db.fetch_chunks(collection_name="my_docs", files_id="doc1")

# Update / delete
db.update_document(id="001", document={...}, collection_name="my_docs")
db.delete_document(id="001", collection_name="my_docs")

# Collections
collections = db.list_collections()
db.delete_collection(collection_name="my_docs")
```
Constructor parameters:
| Parameter | Description |
|---|---|
| persist_directory | Directory path where FAISS index files are stored |
| dimension | Vector dimensionality; must match your embedding model |
| index_type | "flat" for exact nearest-neighbour search; "ivf" for approximate (faster on large datasets) |
| metric | Distance metric: "cosine" or "l2" |
| nlist | Number of IVF clusters (only applies when index_type="ivf") |
| nprobe | Number of clusters searched per query (only applies when index_type="ivf"); higher values give better recall at the cost of speed |
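
The trade-off behind `index_type`, `nlist`, and `nprobe` can be seen in a toy pure-Python model. This is an illustration of the general inverted-file idea, not FAISS code and not a description of how `FaissVectorDb` is implemented. All function names are hypothetical, and the centroids are fixed by hand here, whereas FAISS learns them with k-means during index training.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(vectors, query, k):
    """Exact search: compare the query with every stored vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return ranked[:k]


def build_ivf(vectors, centroids):
    """Assign each vector to its nearest centroid (one bucket per cluster)."""
    buckets = [[] for _ in centroids]
    for i, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        buckets[nearest].append(i)
    return buckets


def ivf_search(vectors, centroids, buckets, query, k, nprobe):
    """Approximate search: scan only the `nprobe` clusters closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]


vectors = [[1, 0], [0, 1], [4.9, 0], [5.1, 0], [9, 0], [10, 1]]
centroids = [[0, 0], [10, 0]]  # nlist = 2, chosen by hand for the demo
buckets = build_ivf(vectors, centroids)
query = [4.8, 0]

print(flat_search(vectors, query, k=2))                             # [2, 3]
print(ivf_search(vectors, centroids, buckets, query, 2, nprobe=1))  # [2, 0]
print(ivf_search(vectors, centroids, buckets, query, 2, nprobe=2))  # [2, 3]
```

With `nprobe=1` the search scans only the first cluster and misses vector 3, which sits just across the cluster boundary. With `nprobe=2` it scans both clusters and matches the exact result, at the cost of comparing against every vector.
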
As retriever for RAG
```python
retriever = db.as_retriever(
    collection_name="my_docs",
    embedding_model=your_embedding_model,
)
```
Version history
| Version | Changes |
|---|---|
| 2.2.1 | Current stable release |
| 2.1.0 | Filter-based retrieve_document for ChromaDB and Pinecone |
| 2.0.0 | as_retriever() — seamless integration with elsai-retrievers |
| 1.1.0 | update_document, delete_document, list_collections, delete_collection |
| 1.0.0 | Initial release with ChromaDB, Pinecone, Weaviate |
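
Code that has to run against several releases can gate features on the minimum versions listed in the table. The sketch below encodes those thresholds. The feature keys and the `supports` helper are hypothetical, and `parse_version` assumes plain numeric versions such as `2.2.1`.

```python
# Minimum versions taken from the version history table above.
# The feature keys are illustrative labels, not identifiers from the package.
MIN_VERSION = {
    "crud": (1, 1, 0),
    "as_retriever": (2, 0, 0),
    "filtered_retrieve": (2, 1, 0),
}


def parse_version(text):
    """Turn '2.2.1' into (2, 2, 1); assumes plain numeric release versions."""
    return tuple(int(part) for part in text.split("."))


def supports(feature, installed_version):
    """True when the installed version is at least the feature's minimum."""
    return parse_version(installed_version) >= MIN_VERSION[feature]
```

The installed version string can be obtained from the standard library with `importlib.metadata.version("elsai-vectordb")`.
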