Elsai Utilities

The Elsai Utilities package provides helper classes for chunking and converting documents for use in retrieval-augmented generation (RAG) and vector database ingestion pipelines, along with a class for LLM-based conversation analysis.

Prerequisites

Python >= 3.9

Installation

To install the elsai-utilities package:

pip install --extra-index-url https://elsai-core-package.optisolbusiness.com/root/elsai-utilities/ elsai-utilities==0.1.0

Components

The DocumentChunker class splits text into structured chunks in three ways: page-wise, by Markdown header, and recursively by character count.

from elsai_utilities.splitters import DocumentChunker

chunker = DocumentChunker()

contents = "# This is the first page.\n\n## This is the second page.\n\n### This is the third page."

# Page-wise chunking
chunks = chunker.chunk_page_wise(contents=contents, file_name="example.md")

# Markdown header-wise chunking
markdown_wise_chunks = chunker.chunk_markdown_header_wise(
    text=contents,
    file_name="example.md",
    headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")],
    strip_headers=True
)

# Recursive character-wise chunking
text = "This is a long piece of text that should be chunked recursively..."
recursive_chunks = chunker.chunk_recursive(
    contents=text,
    file_name="example.md",
    chunk_size=50,
    chunk_overlap=10
)
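To illustrate how chunk_size and chunk_overlap interact, here is a minimal sketch of fixed-window chunking with overlap. It is not the implementation behind chunk_recursive, which this page does not describe and which may also split on separators; the function name sliding_window_chunks is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is a
    simplified stand-in, not the elsai-utilities implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "abcdefghij" * 10  # 100 characters
windows = sliding_window_chunks(sample, chunk_size=50, chunk_overlap=10)
```

With these values each window advances by 40 characters, so the last 10 characters of one chunk reappear at the start of the next.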

The DocumentConverter class converts LlamaIndex documents into LangChain-compatible Document objects.

from elsai_utilities.converters import DocumentConverter

converter = DocumentConverter()

llama_index_document = {
    "text_resource": {
        "text": "This is a sample text extracted from LlamaIndex."
    }
}

langchain_document = converter.llama_index_to_langchain_document(
    llama_index_document=llama_index_document,
    file_name="example.md"
)
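The conversion above can be pictured with the following sketch, which reads the nested text field from the dictionary shown and wraps it in a LangChain-style object. SimpleDocument is a hypothetical stand-in for LangChain's Document (which carries page_content and metadata), and storing the file name under a "source" metadata key is an assumption, not documented behavior of DocumentConverter.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_simple_document(llama_index_document: dict, file_name: str) -> SimpleDocument:
    """Extract text from a LlamaIndex-style dict and attach the file name."""
    # The "text_resource" -> "text" path mirrors the example dictionary above.
    text = llama_index_document.get("text_resource", {}).get("text", "")
    # Assumption: the file name is recorded as metadata under "source".
    return SimpleDocument(page_content=text, metadata={"source": file_name})


example = to_simple_document(
    {"text_resource": {"text": "This is a sample text extracted from LlamaIndex."}},
    file_name="example.md",
)
```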

The ConversationalIntelligence class analyzes conversations with an LLM: it generates follow-up questions, detects action items, classifies topics and intents, and summarizes conversations.

from elsai_utilities.conversational_intelligence import ConversationalIntelligence

# Initialize with your LLM instance (e.g., ChatOpenAI); your_llm_instance is a placeholder you must define first
ci = ConversationalIntelligence(llm=your_llm_instance)

# Generate follow-up questions
followup_questions = ci.generate_followup_questions(
    user_question="What is machine learning?",
    answer="Machine learning is a subset of AI that enables computers to learn from data.",
    context=["Previous discussion about AI", "User is a beginner"],
    num_questions=3
)

# Safe version with fallback questions
safe_questions = ci.generate_followup_questions_safe(
    user_question="What is machine learning?",
    answer="Machine learning is a subset of AI...",
    num_questions=3,
    fallback_questions=["Can you tell me more?", "What are some examples?"]
)

# Detect action items from conversation
messages = [
    "John, can you prepare the quarterly report by Friday?",
    "Sure, I'll have it ready. Should I include the budget analysis?",
    "Yes, and make sure to highlight the key metrics."
]

action_items = ci.detect_action_items(
    messages=messages,
    include_context=True,
    extract_priority=True,
    extract_assignee=True,
    extract_due_date=True,
    min_confidence=0.7
)

# Detect topics and intents
topic_intent_result = ci.detect_topics_and_intents(
    messages=messages,
    detect_topics=True,
    detect_intents=True,
    min_confidence=0.6,
    max_topics=5,
    max_intents=3
)

# Detect only topics
topics = ci.detect_topics_only(
    messages=messages,
    min_confidence=0.6,
    max_topics=5
)

# Detect only intents
intents = ci.detect_intents_only(
    messages=messages,
    min_confidence=0.6,
    max_intents=3
)

# Comprehensive conversation analysis
analysis = ci.analyze_conversation(
    messages=messages,
    include_followup=True,
    include_actions=True,
    include_topics=True,
    include_intents=True
)

# Get conversation summary
summary = ci.get_conversation_summary(messages=messages)

Follow-up Question Generation
- Generates contextually relevant follow-up questions
- Supports conversation context and history
- Includes safe fallback mechanisms
- Validates question format and content
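The fallback and validation behavior listed above can be sketched as a small wrapper. This is an illustration of the general pattern only; the function questions_with_fallback and its validation rule (non-empty strings ending in a question mark) are assumptions, not the logic of generate_followup_questions_safe.

```python
from typing import Callable


def questions_with_fallback(
    generate: Callable[[], list[str]],
    fallback_questions: list[str],
    num_questions: int = 3,
) -> list[str]:
    """Call generate(); return fallback questions on failure or empty output."""
    try:
        questions = generate()
    except Exception:
        # Any generation error falls back to the caller-supplied questions.
        return fallback_questions[:num_questions]

    # Assumed validation rule: keep non-empty strings that end with "?".
    valid = [
        q.strip()
        for q in questions
        if isinstance(q, str) and q.strip().endswith("?")
    ]
    return valid[:num_questions] if valid else fallback_questions[:num_questions]


def failing_generator() -> list[str]:
    raise RuntimeError("LLM call failed")


fallbacks = ["Can you tell me more?", "What are some examples?"]
recovered = questions_with_fallback(failing_generator, fallbacks)
```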

Action Item Detection
- Extracts actionable tasks from conversations
- Identifies assignees, due dates, and priorities
- Provides confidence scoring
- Supports context extraction

Topic and Intent Detection
- Identifies main conversation topics
- Classifies user intents and purposes
- Supports confidence thresholds
- Provides keyword extraction and categorization
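As a rough illustration of how a confidence threshold and a result cap such as min_confidence and max_topics might combine, consider the sketch below. ScoredTopic and filter_topics are hypothetical names, and both the inclusive comparison and the ranking by confidence are assumptions; the library's actual filtering is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class ScoredTopic:
    """Hypothetical candidate topic with a confidence score in [0, 1]."""
    name: str
    confidence: float


def filter_topics(
    candidates: list[ScoredTopic],
    min_confidence: float = 0.6,
    max_topics: int = 5,
) -> list[ScoredTopic]:
    """Drop low-confidence candidates, then keep the highest-scoring ones."""
    # Assumption: the threshold is inclusive.
    kept = [t for t in candidates if t.confidence >= min_confidence]
    # Assumption: results are ranked by descending confidence before capping.
    kept.sort(key=lambda t: t.confidence, reverse=True)
    return kept[:max_topics]


candidates = [
    ScoredTopic("quarterly report", 0.9),
    ScoredTopic("weather", 0.5),
    ScoredTopic("budget analysis", 0.7),
]
top_topics = filter_topics(candidates, min_confidence=0.6, max_topics=5)
```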

Comprehensive Analysis
- Combines all intelligence features
- Provides conversation summaries
- Offers flexible configuration options
- Returns structured, validated results

Return Types

The ConversationalIntelligence component returns structured objects:

- ActionItem: contains task, assignee, due_date, priority, context, and source_message.
- DetectedTopic: contains name, confidence, keywords, category, and context.
- DetectedIntent: contains intent_type, confidence, entities, intent_classification, context, and source_message.
- TopicIntentResponse: contains lists of detected topics and intents.
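To make these shapes concrete, the sketch below mirrors them as plain dataclasses. The field names on the first three classes come from the list above; every field type, every default value, and the attribute names topics and intents on TopicIntentResponse are assumptions. The real classes in elsai-utilities may be defined differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActionItem:
    task: str
    assignee: Optional[str] = None        # assumed optional
    due_date: Optional[str] = None        # assumed to be free text
    priority: Optional[str] = None
    context: Optional[str] = None
    source_message: Optional[str] = None


@dataclass
class DetectedTopic:
    name: str
    confidence: float
    keywords: list[str] = field(default_factory=list)
    category: Optional[str] = None
    context: Optional[str] = None


@dataclass
class DetectedIntent:
    intent_type: str
    confidence: float
    entities: list[str] = field(default_factory=list)  # assumed list of strings
    intent_classification: Optional[str] = None
    context: Optional[str] = None
    source_message: Optional[str] = None


@dataclass
class TopicIntentResponse:
    # Attribute names are assumptions; the source only says "lists of
    # detected topics and intents".
    topics: list[DetectedTopic] = field(default_factory=list)
    intents: list[DetectedIntent] = field(default_factory=list)


# Illustrative values taken from the example conversation on this page.
item = ActionItem(
    task="Prepare the quarterly report",
    assignee="John",
    due_date="Friday",
    source_message="John, can you prepare the quarterly report by Friday?",
)
response = TopicIntentResponse(
    topics=[DetectedTopic(name="quarterly report", confidence=0.9)],
)
```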