Elsai Graph Generator

The Elsai Graph Generator package manages knowledge graphs in Neo4j. It handles database connections, graph storage, and vector-based similarity search over embeddings.

Prerequisites

  • Python >= 3.10

  • Neo4j database instance (local or hosted)

  • Embedding service (Azure OpenAI, Amazon Bedrock, or Elsai Core)

  • .env file with Neo4j and embedding provider credentials
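
The credentials in the .env file can be read with the standard library alone. The sketch below is a minimal illustration under stated assumptions: the variable names (NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD, NEO4J_DATABASE) and the helper functions are hypothetical and are not names the package is known to require.

```python
import os
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file; returns {} if the file is missing."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def neo4j_settings(env: dict[str, str]) -> dict[str, str]:
    """Collect connection settings; the variable names are assumptions."""
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env["NEO4J_PASSWORD"],  # no default: fail loudly if missing
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }


def load_settings(path: str = ".env") -> dict[str, str]:
    """Merge the file with the process environment; the environment wins."""
    return neo4j_settings({**parse_env_file(path), **os.environ})
```

The resulting dictionary has the same keys as the Neo4jConnector arguments used later on this page, so it could be passed as keyword arguments.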

Installation

To install the elsai-graph-generator package:

pip install --extra-index-url https://elsai-core-package.optisolbusiness.com/root/elsai-graph-generator/ elsai-graph-generator==0.1.0

Components

1. Neo4jConnector

The Neo4jConnector manages the connection to your Neo4j database. It handles opening connections, executing Cypher queries, and retrieving database statistics.

from elsai_graph_generator import Neo4jConnector

connector = Neo4jConnector(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="your-password",
    database="neo4j",
    verbose=True
)

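# Connect to the database; connect() returns a (success, message) pair
success, message = connector.connect()
if not success:
    raise RuntimeError(f"Neo4j connection failed: {message}")

# Execute a custom query; results holds the query output
success, message, results = connector.execute_query("MATCH (n) RETURN count(n) AS node_count")

# Get database statistics
stats = connector.get_database_stats()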

Key Responsibilities:

  • Establishes and manages Neo4j sessions.

  • Executes Cypher queries safely.

  • Provides metadata about labels and relationship types.

  • Handles database cleanup and index creation.
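
The calls above report failure through a (success, message) or (success, message, results) tuple rather than an exception, so an unchecked flag can hide errors. A small wrapper can convert that convention into exceptions. This is only a sketch: `unwrap` and `GraphOperationError` are hypothetical helpers, not part of the package.

```python
class GraphOperationError(RuntimeError):
    """Raised when a call reports success=False (hypothetical helper)."""


def unwrap(result: tuple):
    """Turn a (success, message[, payload]) tuple into a return value or an exception."""
    success, message, *rest = result
    if not success:
        raise GraphOperationError(message)
    # Two-element results carry no payload, so return the message instead
    return rest[0] if rest else message


# Intended use with the connector shown above:
# rows = unwrap(connector.execute_query("MATCH (n) RETURN count(n) AS node_count"))
```

With this in place, a failed query stops the script at the call site instead of passing along an empty result.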

2. GraphStorage

The GraphStorage class saves and manages graph data in Neo4j. It accepts nodes and edges as plain dictionaries and writes them as entities and relationships.

from elsai_graph_generator import GraphStorage

# Reuses the connected Neo4jConnector instance from the previous section
storage = GraphStorage(connector, verbose=True)

graph_data = {
    'nodes': [
        {'id': 'John', 'label': 'John', 'type': 'Person', 'description': 'Engineer'},
        {'id': 'Optisol', 'label': 'Optisol', 'type': 'Organization', 'description': 'Tech Company'}
    ],
    'edges': [
        {'from': 'John', 'to': 'Optisol', 'type': 'WORKS_AT', 'label': 'Works At'}
    ]
}

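# Store the entire graph. clear_existing=True is expected to remove the data
# already in the database first, so use it with care outside of test setups.
success, message = storage.store_graph(graph_data, clear_existing=True)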

# Add individual nodes or edges
storage.add_node({'id': 'C++', 'label': 'C++', 'type': 'Skill'})
storage.add_edge({'from': 'John', 'to': 'C++', 'type': 'HAS_SKILL'})

Key Responsibilities:

  • Store Graph: Stores nodes and relationships in bulk.

  • Add Node/Edge: Dynamically updates the graph with new elements.

  • Clear Database: Removes all data to reset the graph state.
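
Because the graph is passed as a plain dictionary, it can be checked before anything is written to Neo4j. The sketch below flags duplicate node ids and edges that point to unknown nodes, using the 'nodes'/'edges' layout from the example above. Whether the package performs such checks itself is not stated here; `validate_graph` is a hypothetical helper.

```python
def validate_graph(graph_data: dict) -> list[str]:
    """Return a list of problems found in a {'nodes': [...], 'edges': [...]} dictionary."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for node in graph_data.get("nodes", []):
        node_id = node.get("id")
        if node_id is None:
            problems.append(f"node without id: {node!r}")
        elif node_id in seen_ids:
            problems.append(f"duplicate node id: {node_id!r}")
        else:
            seen_ids.add(node_id)
    for edge in graph_data.get("edges", []):
        # Both endpoints must refer to a node declared in the same dictionary
        for endpoint in ("from", "to"):
            if edge.get(endpoint) not in seen_ids:
                problems.append(
                    f"edge {endpoint!r} refers to unknown node: {edge.get(endpoint)!r}"
                )
        if not edge.get("type"):
            problems.append(f"edge without type: {edge!r}")
    return problems
```

An empty list means the dictionary is internally consistent. Note that this check only covers one dictionary: an edge passed to add_edge may legitimately refer to a node that already exists in the database.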

3. EmbeddingManager

The EmbeddingManager generates and manages vector embeddings for graph entities, which enables similarity search within the knowledge graph.

from elsai_graph_generator import EmbeddingManager

# azure_embedding_model must be created beforehand. Azure OpenAI, elsai-embeddings,
# and Amazon Bedrock embedding models are supported.
embedding_manager = EmbeddingManager(connector, azure_embedding_model, verbose=True)

# Create a vector index in Neo4j
embedding_manager.create_vector_index("entity_embeddings")

# Generate embeddings for existing nodes
embedding_manager.embed_nodes(batch_size=10)

# Find similar nodes based on a text query
success, message, results = embedding_manager.find_similar_nodes(
    query_text="software engineer",
    top_k=5
)

Key Responsibilities:

  • Vector Indexing: Sets up Neo4j vector indexes for fast retrieval.

  • Embedding Generation: Converts text descriptions into vectors using the configured embedding model (for example, 1536-dimensional vectors with Azure OpenAI’s text-embedding-ada-002).

  • Similarity Search: Performs semantic searches to find related entities even without direct relationships.
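
To illustrate what a similarity search computes, the sketch below ranks a few toy vectors against a query vector in plain Python. It assumes cosine similarity, which is one of the similarity functions Neo4j vector indexes support; which function the package configures is not stated here. The three-dimensional vectors and the helper names are made up for illustration. Real embeddings are much longer, and Neo4j performs the ranking inside the index.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query: list[float], node_vectors: dict, top_k: int = 5) -> list:
    """Return the top_k (node_id, score) pairs, highest score first."""
    scored = [
        (node_id, cosine_similarity(query, vector))
        for node_id, vector in node_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical values, not real model output)
toy_vectors = {
    "John": [0.9, 0.1, 0.0],
    "Optisol": [0.1, 0.9, 0.0],
    "C++": [0.7, 0.0, 0.7],
}
ranking = top_k_similar([1.0, 0.0, 0.0], toy_vectors, top_k=2)
```

The ranking depends only on vector direction, which is why nodes with no relationship between them in the graph can still be returned as close matches.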