Elsai Utilities

The Elsai Utilities package provides helper classes for chunking and converting documents for retrieval-augmented generation (RAG) and vector database ingestion pipelines, along with a class for conversational analysis.

Prerequisites

  • Python >= 3.9

Installation

To install the elsai-utilities package:

pip install --index-url https://elsai-core-package.optisolbusiness.com/root/elsai-utilities/ elsai-utilities==0.1.0

Components

  1. The DocumentChunker class splits text into structured chunks in three ways: page-wise, by Markdown header, or recursively by character count.

    from elsai_utilities.splitters import DocumentChunker
    
    chunker = DocumentChunker()
    
    contents = "# This is the first page.\n\n## This is the second page.\n\n### This is the third page."
    
    # Page-wise chunking
    chunks = chunker.chunk_page_wise(contents=contents, file_name="example.md")
    
    # Markdown header-wise chunking
    markdown_wise_chunks = chunker.chunk_markdown_header_wise(
        text=contents,
        file_name="example.md",
        headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")],
        strip_headers=True
    )
    
    # Recursive character-wise chunking
    text = "This is a long piece of text that should be chunked recursively..."
    recursive_chunks = chunker.chunk_recursive(
        contents=text,
        file_name="example.md",
        chunk_size=50,
        chunk_overlap=10
    )
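
To make the roles of chunk_size and chunk_overlap in the call above concrete, here is a minimal sliding-window sketch. It is not the DocumentChunker implementation: the function name chunk_with_overlap is hypothetical, sizes are assumed to be counted in characters, and a recursive splitter would typically also try to break on natural boundaries such as paragraphs before falling back to fixed windows.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size windows that share chunk_overlap characters.

    Hypothetical helper for illustration only; not part of elsai-utilities.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


example_chunks = chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(example_chunks)
```

With these values each chunk repeats the last character of the previous one, which is the kind of shared context that overlap is meant to preserve across chunk boundaries.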
    
  2. The DocumentConverter class converts LlamaIndex documents into LangChain-compatible Document objects.

    from elsai_utilities.converters import DocumentConverter
    
    converter = DocumentConverter()
    
    llama_index_document = {
        "text_resource": {
            "text": "This is a sample text extracted from LlamaIndex."
        }
    }
    
    langchain_document = converter.llama_index_to_langchain_document(
        llama_index_document=llama_index_document,
        file_name="example.md"
    )
    
  3. The ConversationalIntelligence class provides conversational analysis: follow-up question generation, action item detection, and topic/intent classification.

    from elsai_utilities.conversational_intelligence import ConversationalIntelligence
    
    # Initialize with your LLM instance (e.g., ChatOpenAI, Claude, etc.).
    # `your_llm_instance` is a placeholder; create it before running this snippet.
    ci = ConversationalIntelligence(llm=your_llm_instance)
    
    # Generate follow-up questions
    followup_questions = ci.generate_followup_questions(
        user_question="What is machine learning?",
        answer="Machine learning is a subset of AI that enables computers to learn from data.",
        context=["Previous discussion about AI", "User is a beginner"],
        num_questions=3
    )
    
    # Safe version with fallback questions
    safe_questions = ci.generate_followup_questions_safe(
        user_question="What is machine learning?",
        answer="Machine learning is a subset of AI...",
        num_questions=3,
        fallback_questions=["Can you tell me more?", "What are some examples?"]
    )
    
    # Detect action items from conversation
    messages = [
        "John, can you prepare the quarterly report by Friday?",
        "Sure, I'll have it ready. Should I include the budget analysis?",
        "Yes, and make sure to highlight the key metrics."
    ]
    
    action_items = ci.detect_action_items(
        messages=messages,
        include_context=True,
        extract_priority=True,
        extract_assignee=True,
        extract_due_date=True,
        min_confidence=0.7
    )
    
    # Detect topics and intents
    topic_intent_result = ci.detect_topics_and_intents(
        messages=messages,
        detect_topics=True,
        detect_intents=True,
        min_confidence=0.6,
        max_topics=5,
        max_intents=3
    )
    
    # Detect only topics
    topics = ci.detect_topics_only(
        messages=messages,
        min_confidence=0.6,
        max_topics=5
    )
    
    # Detect only intents
    intents = ci.detect_intents_only(
        messages=messages,
        min_confidence=0.6,
        max_intents=3
    )
    
    # Comprehensive conversation analysis
    analysis = ci.analyze_conversation(
        messages=messages,
        include_followup=True,
        include_actions=True,
        include_topics=True,
        include_intents=True
    )
    
    # Get conversation summary
    summary = ci.get_conversation_summary(messages=messages)
    

    Follow-up Question Generation

      • Generates contextually relevant follow-up questions
      • Supports conversation context and history
      • Includes safe fallback mechanisms
      • Validates question format and content

    Action Item Detection

      • Extracts actionable tasks from conversations
      • Identifies assignees, due dates, and priorities
      • Provides confidence scoring
      • Supports context extraction

    Topic and Intent Detection

      • Identifies main conversation topics
      • Classifies user intents and purposes
      • Supports confidence thresholds
      • Provides keyword extraction and categorization

    Comprehensive Analysis

      • Combines all intelligence features
      • Provides conversation summaries
      • Offers flexible configuration options
      • Returns structured, validated results
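
The "safe fallback" idea behind generate_followup_questions_safe can be pictured with a small wrapper like the one below. This is only a sketch of the general pattern, assuming the safe variant returns the caller's fallback questions when generation fails or yields nothing usable; the helper name generate_with_fallback and its validation rules are hypothetical and are not taken from the library.

```python
from typing import Callable, List


def generate_with_fallback(
    generate: Callable[[], List[str]],
    fallback_questions: List[str],
    num_questions: int,
) -> List[str]:
    """Return generated questions, or the fallback list if generation fails.

    Hypothetical helper for illustration only; not part of elsai-utilities.
    """
    try:
        questions = generate()
    except Exception:
        # Any generation error (e.g. an unavailable LLM) triggers the fallback.
        return fallback_questions[:num_questions]

    # Keep only non-empty strings, trimmed of surrounding whitespace.
    valid = [q.strip() for q in questions if isinstance(q, str) and q.strip()]
    if not valid:
        return fallback_questions[:num_questions]
    return valid[:num_questions]


def failing_generator() -> List[str]:
    # Stand-in for an LLM call that raises.
    raise RuntimeError("LLM unavailable")


fallbacks = ["Can you tell me more?", "What are some examples?"]
from_failure = generate_with_fallback(failing_generator, fallbacks, num_questions=3)
from_success = generate_with_fallback(
    lambda: ["What is supervised learning?", "", "How is it used?"],
    fallbacks,
    num_questions=3,
)
print(from_failure)
print(from_success)
```

The design choice here is that the caller always receives a list of strings, so downstream code does not need its own error handling around question generation.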

Return Types

The ConversationalIntelligence component returns structured objects:

  • ActionItem: contains task, assignee, due_date, priority, context, and source_message.

  • DetectedTopic: contains name, confidence, keywords, category, and context.

  • DetectedIntent: contains intent_type, confidence, entities, intent_classification, context, and source_message.

  • TopicIntentResponse: contains lists of detected topics and intents.
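
As a rough mental model of these return types, the field lists above can be mirrored with plain dataclasses. Everything in this sketch is an assumption made for illustration: the "Sketch" class names, the field types and defaults, the topics and intents attribute names, and the filter_by_confidence helper. The actual classes in elsai-utilities may be defined differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActionItemSketch:
    # Field names follow the documentation; the types are assumptions.
    task: str
    assignee: Optional[str] = None
    due_date: Optional[str] = None
    priority: Optional[str] = None
    context: Optional[str] = None
    source_message: Optional[str] = None


@dataclass
class DetectedTopicSketch:
    name: str
    confidence: float
    keywords: List[str] = field(default_factory=list)
    category: Optional[str] = None
    context: Optional[str] = None


@dataclass
class DetectedIntentSketch:
    intent_type: str
    confidence: float
    entities: List[str] = field(default_factory=list)
    intent_classification: Optional[str] = None
    context: Optional[str] = None
    source_message: Optional[str] = None


@dataclass
class TopicIntentResponseSketch:
    # Attribute names "topics" and "intents" are assumed, not documented.
    topics: List[DetectedTopicSketch] = field(default_factory=list)
    intents: List[DetectedIntentSketch] = field(default_factory=list)


def filter_by_confidence(
    response: TopicIntentResponseSketch, min_confidence: float
) -> TopicIntentResponseSketch:
    """Keep only topics and intents at or above min_confidence (hypothetical helper)."""
    return TopicIntentResponseSketch(
        topics=[t for t in response.topics if t.confidence >= min_confidence],
        intents=[i for i in response.intents if i.confidence >= min_confidence],
    )


response = TopicIntentResponseSketch(
    topics=[
        DetectedTopicSketch(name="quarterly report", confidence=0.9, keywords=["report", "budget"]),
        DetectedTopicSketch(name="weather", confidence=0.4),
    ],
    intents=[
        DetectedIntentSketch(
            intent_type="request",
            confidence=0.8,
            source_message="John, can you prepare the quarterly report by Friday?",
        )
    ],
)
filtered = filter_by_confidence(response, min_confidence=0.6)
print([topic.name for topic in filtered.topics])
```

A threshold filter of this kind is one plausible reading of what the min_confidence argument does in the detection calls shown earlier.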