Elsai Graph Generator#
The Elsai Graph Generator package manages knowledge graphs in Neo4j. It handles database connections, graph storage, and vector-based similarity search over embeddings.
Prerequisites#
Python >= 3.10
Neo4j database instance (local or hosted)
Embedding service (Azure OpenAI, Amazon Bedrock, or Elsai Core)
.env file with Neo4j and embedding provider credentials
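As a minimal sketch of how the .env credentials might be read, the snippet below parses KEY=VALUE lines with the standard library and maps them to connection settings. The variable names NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD, and NEO4J_DATABASE, as well as the helper functions, are assumptions made for illustration; this page does not prescribe them.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def neo4j_settings(env: dict[str, str]) -> dict[str, str]:
    """Map the (hypothetical) variable names to Neo4j connection settings."""
    required = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"Missing required settings: {', '.join(missing)}")
    return {
        "uri": env["NEO4J_URI"],
        "user": env["NEO4J_USER"],
        "password": env["NEO4J_PASSWORD"],
        # Assumed default; adjust if your instance uses another database name.
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }


def load_neo4j_settings(path: str = ".env") -> dict[str, str]:
    """Read the file at `path` and return connection settings."""
    return neo4j_settings(parse_env_text(Path(path).read_text(encoding="utf-8")))
```

The returned dictionary uses the same keys as the Neo4jConnector keyword arguments shown in the Components section (uri, user, password, database), so it could be unpacked into that constructor.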
Installation#
To install the elsai-graph-generator package:
pip install --extra-index-url https://elsai-core-package.optisolbusiness.com/root/elsai-graph-generator/ elsai-graph-generator==0.1.0
Components#
1. Neo4jConnector#
The Neo4jConnector manages the connection to your Neo4j database. It handles opening connections, executing Cypher queries, and retrieving database statistics.
from elsai_graph_generator import Neo4jConnector
connector = Neo4jConnector(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="your-password",
    database="neo4j",
    verbose=True
)
# Connect to the database
success, message = connector.connect()
# Execute a custom query
success, message, results = connector.execute_query("MATCH (n) RETURN count(n) AS node_count")
# Get database statistics
stats = connector.get_database_stats()
Key Responsibilities:
Establishes and manages Neo4j sessions.
Executes Cypher queries safely.
Provides metadata about labels and relationship types.
Handles database cleanup and index creation.
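The connector calls shown above return tuples that begin with a success flag and a message. One way to avoid silently ignoring a failure is a small helper that raises when the flag is false. This is only a sketch: unwrap and GraphOperationError are hypothetical names that are not part of the package, and the tuple shapes are taken from the example above.

```python
class GraphOperationError(RuntimeError):
    """Raised when a (success, message, ...) result reports failure."""


def unwrap(result: tuple):
    """Return the payload of a (success, message[, payload]) tuple.

    Raises GraphOperationError carrying the message when success is falsy.
    Returns None for two-element results, such as the one from connect().
    """
    success, message, *payload = result
    if not success:
        raise GraphOperationError(message)
    return payload[0] if payload else None


# Stand-in tuples shaped like the results in the example above.
rows = unwrap((True, "Query executed", [{"node_count": 3}]))
connected = unwrap((True, "Connected"))
```

With a real connector, the same helper could wrap the calls directly, for example unwrap(connector.execute_query("MATCH (n) RETURN count(n) AS node_count")).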
2. GraphStorage#
The GraphStorage class is responsible for saving and managing graph data within Neo4j. It simplifies the process of storing complex entities and their relationships.
from elsai_graph_generator import GraphStorage
storage = GraphStorage(connector, verbose=True)
graph_data = {
    'nodes': [
        {'id': 'John', 'label': 'John', 'type': 'Person', 'description': 'Engineer'},
        {'id': 'Optisol', 'label': 'Optisol', 'type': 'Organization', 'description': 'Tech Company'}
    ],
    'edges': [
        {'from': 'John', 'to': 'Optisol', 'type': 'WORKS_AT', 'label': 'Works At'}
    ]
}
# Store the entire graph
success, message = storage.store_graph(graph_data, clear_existing=True)
# Add individual nodes or edges
storage.add_node({'id': 'C++', 'label': 'C++', 'type': 'Skill'})
storage.add_edge({'from': 'John', 'to': 'C++', 'type': 'HAS_SKILL'})
Key Responsibilities:
Store Graph: Batch stores nodes and relationships.
Add Node/Edge: Dynamically updates the graph with new elements.
Clear Database: Removes all data to reset the graph state.
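Because edges refer to nodes by id, it can help to check a graph dictionary before storing it. The following is a minimal sketch that assumes the 'nodes' and 'edges' layout from the example above. validate_graph is a hypothetical helper, and whether store_graph performs similar checks itself is not stated here.

```python
def validate_graph(graph_data: dict) -> list[str]:
    """Return a list of problems found in a nodes/edges dictionary."""
    problems: list[str] = []
    seen_ids: set[str] = set()

    # Every node needs a unique id so that edges can refer to it.
    for index, node in enumerate(graph_data.get("nodes", [])):
        node_id = node.get("id")
        if node_id is None:
            problems.append(f"node #{index} has no 'id'")
        elif node_id in seen_ids:
            problems.append(f"duplicate node id: {node_id!r}")
        else:
            seen_ids.add(node_id)

    # Every edge needs a type and two endpoints that exist as nodes.
    for index, edge in enumerate(graph_data.get("edges", [])):
        for endpoint in ("from", "to"):
            target = edge.get(endpoint)
            if target not in seen_ids:
                problems.append(
                    f"edge #{index} '{endpoint}' refers to unknown node {target!r}"
                )
        if not edge.get("type"):
            problems.append(f"edge #{index} has no 'type'")

    return problems
```

An empty list means the dictionary is internally consistent. A non-empty list can be logged or raised before calling store_graph, which matters most when clear_existing=True would wipe the existing data first.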
3. EmbeddingManager#
The EmbeddingManager generates and manages vector embeddings for graph entities, which makes similarity search over the knowledge graph possible.
from elsai_graph_generator import EmbeddingManager
# azure_embedding_model is an embedding model instance created elsewhere (not shown here).
# Azure OpenAI, elsai-embeddings, and Bedrock embedding models are supported.
embedding_manager = EmbeddingManager(connector, azure_embedding_model, verbose=True)
# Create a vector index in Neo4j
embedding_manager.create_vector_index("entity_embeddings")
# Generate embeddings for existing nodes
embedding_manager.embed_nodes(batch_size=10)
# Find similar nodes based on a text query
success, message, results = embedding_manager.find_similar_nodes(
    query_text="software engineer",
    top_k=5
)
Key Responsibilities:
Vector Indexing: Sets up Neo4j vector indexes for fast retrieval.
Embedding Generation: Converts text descriptions into vectors with the configured embedding model (for example, 1536-dimensional vectors with Azure OpenAI’s text-embedding-ada-002).
Similarity Search: Performs semantic searches to find related entities even without direct relationships.
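To illustrate what a similarity search over embeddings computes, here is a brute-force sketch in plain Python that ranks stored vectors by cosine similarity to a query vector. It is not the package's implementation: a Neo4j vector index typically answers such queries with an approximate nearest-neighbor search, and the 3-dimensional vectors and function names below are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for zero vectors in this sketch
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: list[float],
    embeddings: dict[str, list[float]],
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Return up to top_k (node_id, score) pairs, best match first."""
    scored = [
        (node_id, cosine_similarity(query, vector))
        for node_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings.
toy_embeddings = {
    "John": [0.9, 0.1, 0.0],
    "Optisol": [0.1, 0.9, 0.0],
    "C++": [0.7, 0.0, 0.7],
}
matches = top_k_similar([1.0, 0.0, 0.0], toy_embeddings, top_k=2)
```

Nodes rank highly here because their vectors point in a similar direction to the query, not because an edge connects them, which is why a semantic search can surface entities that have no direct relationship.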