Elsai Utilities
The Elsai Utilities package provides helper classes for chunking and converting documents in retrieval-augmented generation (RAG) and vector database ingestion pipelines, along with a class for conversational analysis.
Prerequisites
Python >= 3.9
Installation
To install the elsai-utilities package:
pip install --index-url https://elsai-core-package.optisolbusiness.com/root/elsai-utilities/ elsai-utilities==0.1.0
Components
The DocumentChunker class splits text into structured chunks in three ways: page-wise, by Markdown header, and recursively by character.
    from elsai_utilities.splitters import DocumentChunker

    chunker = DocumentChunker()
    contents = "# This is the first page.\n\n## This is the second page.\n\n### This is the third page."

    # Page-wise chunking
    chunks = chunker.chunk_page_wise(contents=contents, file_name="example.md")

    # Markdown header-wise chunking
    markdown_wise_chunks = chunker.chunk_markdown_header_wise(
        text=contents,
        file_name="example.md",
        headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")],
        strip_headers=True
    )

    # Recursive character-wise chunking
    text = "This is a long piece of text that should be chunked recursively..."
    recursive_chunks = chunker.chunk_recursive(
        contents=text,
        file_name="example.md",
        chunk_size=50,
        chunk_overlap=10
    )
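To give a feel for what the chunk_size and chunk_overlap arguments control, here is a minimal standalone sketch. It is not the DocumentChunker implementation: it uses a plain fixed-size sliding window rather than a recursive, separator-aware strategy, and the function name sliding_window_chunks is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: a simple sliding window, not the library's algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


text = "This is a long piece of text that should be chunked recursively..."
chunks = sliding_window_chunks(text, chunk_size=50, chunk_overlap=10)
for chunk in chunks:
    print(repr(chunk))
```

Each window repeats the last chunk_overlap characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.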
The DocumentConverter class converts LlamaIndex documents into LangChain-compatible Document objects.
    from elsai_utilities.converters import DocumentConverter

    converter = DocumentConverter()
    llama_index_document = {
        "text_resource": {
            "text": "This is a sample text extracted from LlamaIndex."
        }
    }
    langchain_document = converter.llama_index_to_langchain_document(
        llama_index_document=llama_index_document,
        file_name="example.md"
    )
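Conceptually, the conversion maps the nested text of the LlamaIndex document onto the page_content and metadata fields that a LangChain Document carries. The sketch below illustrates that mapping with a stand-in dataclass so it runs without either library installed. SimpleDocument, to_simple_document, and the "file_name" metadata key are assumptions for illustration, not the converter's actual behavior.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def to_simple_document(llama_index_document: dict[str, Any], file_name: str) -> SimpleDocument:
    # Pull the text out of the nested "text_resource" mapping used in the example above.
    text = llama_index_document.get("text_resource", {}).get("text", "")
    # Storing the file name under "file_name" is an assumption made for this sketch.
    return SimpleDocument(page_content=text, metadata={"file_name": file_name})


doc = to_simple_document(
    {"text_resource": {"text": "This is a sample text extracted from LlamaIndex."}},
    file_name="example.md",
)
print(doc.page_content)
print(doc.metadata)
```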
The ConversationalIntelligence class provides comprehensive conversational analysis capabilities including follow-up question generation, action item detection, and topic/intent classification.
    from elsai_utilities.conversational_intelligence import ConversationalIntelligence

    # Initialize with your LLM instance (e.g., ChatOpenAI, Claude, etc.)
    ci = ConversationalIntelligence(llm=your_llm_instance)

    # Generate follow-up questions
    followup_questions = ci.generate_followup_questions(
        user_question="What is machine learning?",
        answer="Machine learning is a subset of AI that enables computers to learn from data.",
        context=["Previous discussion about AI", "User is a beginner"],
        num_questions=3
    )

    # Safe version with fallback questions
    safe_questions = ci.generate_followup_questions_safe(
        user_question="What is machine learning?",
        answer="Machine learning is a subset of AI...",
        num_questions=3,
        fallback_questions=["Can you tell me more?", "What are some examples?"]
    )

    # Detect action items from conversation
    messages = [
        "John, can you prepare the quarterly report by Friday?",
        "Sure, I'll have it ready. Should I include the budget analysis?",
        "Yes, and make sure to highlight the key metrics."
    ]
    action_items = ci.detect_action_items(
        messages=messages,
        include_context=True,
        extract_priority=True,
        extract_assignee=True,
        extract_due_date=True,
        min_confidence=0.7
    )

    # Detect topics and intents
    topic_intent_result = ci.detect_topics_and_intents(
        messages=messages,
        detect_topics=True,
        detect_intents=True,
        min_confidence=0.6,
        max_topics=5,
        max_intents=3
    )

    # Detect only topics
    topics = ci.detect_topics_only(
        messages=messages,
        min_confidence=0.6,
        max_topics=5
    )

    # Detect only intents
    intents = ci.detect_intents_only(
        messages=messages,
        min_confidence=0.6,
        max_intents=3
    )

    # Comprehensive conversation analysis
    analysis = ci.analyze_conversation(
        messages=messages,
        include_followup=True,
        include_actions=True,
        include_topics=True,
        include_intents=True
    )

    # Get conversation summary
    summary = ci.get_conversation_summary(messages=messages)
Follow-up Question Generation
- Generates contextually relevant follow-up questions
- Supports conversation context and history
- Includes safe fallback mechanisms
- Validates question format and content

Action Item Detection
- Extracts actionable tasks from conversations
- Identifies assignees, due dates, and priorities
- Provides confidence scoring
- Supports context extraction

Topic and Intent Detection
- Identifies main conversation topics
- Classifies user intents and purposes
- Supports confidence thresholds
- Provides keyword extraction and categorization

Comprehensive Analysis
- Combines all intelligence features
- Provides conversation summaries
- Offers flexible configuration options
- Returns structured, validated results
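The safe fallback and question validation listed under Follow-up Question Generation can be pictured with a small standalone sketch. It assumes a question counts as valid when it is a non-empty string ending in a question mark, and that fallbacks fill in when generation fails or yields too few valid questions. The helper names is_valid_question and generate_safely are hypothetical, and the library may apply different rules.

```python
from typing import Callable, Optional


def is_valid_question(candidate: object) -> bool:
    """Assumed validation rule: a non-empty string that ends with '?'."""
    return isinstance(candidate, str) and candidate.strip().endswith("?")


def generate_safely(
    generate: Callable[[], list],
    num_questions: int,
    fallback_questions: Optional[list] = None,
) -> list:
    """Call generate(), keep valid questions, and top up from the fallbacks."""
    fallback = list(fallback_questions or [])
    try:
        candidates = generate()
    except Exception:
        # Any generator failure (network error, parsing error) uses the fallbacks.
        return fallback[:num_questions]

    valid = [q.strip() for q in candidates if is_valid_question(q)]
    # Top up with fallbacks if too few generated questions passed validation.
    for question in fallback:
        if len(valid) >= num_questions:
            break
        if question not in valid:
            valid.append(question)
    return valid[:num_questions]


def failing_generator() -> list:
    # Simulates an LLM call that fails.
    raise RuntimeError("LLM unavailable")


fallbacks = ["Can you tell me more?", "What are some examples?"]
from_failure = generate_safely(failing_generator, 3, fallbacks)
from_partial = generate_safely(
    lambda: ["What is supervised learning?", "not a question", ""], 2, fallbacks
)
print(from_failure)
print(from_partial)
```

Passing the generator in as a callable keeps the sketch independent of any particular LLM client.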
Return Types
The ConversationalIntelligence component returns structured objects:
- ActionItem: contains task, assignee, due_date, priority, context, and source_message.
- DetectedTopic: contains name, confidence, keywords, category, and context.
- DetectedIntent: contains intent_type, confidence, entities, intent_classification, context, and source_message.
- TopicIntentResponse: contains lists of detected topics and intents.
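As a rough illustration of how a structured result and the min_confidence and max_topics arguments might interact, the sketch below defines a dataclass that mirrors the DetectedTopic field names and filters a list of them. The field types, the TopicSketch and select_topics names, the ordering by confidence, and the sample values are all assumptions for illustration. The library's real classes may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TopicSketch:
    """Mirrors the DetectedTopic field names; the types are assumptions."""
    name: str
    confidence: float
    keywords: list[str] = field(default_factory=list)
    category: str = ""
    context: str = ""


def select_topics(
    topics: list[TopicSketch], min_confidence: float, max_topics: int
) -> list[TopicSketch]:
    # Keep topics at or above the threshold, highest confidence first.
    kept = [t for t in topics if t.confidence >= min_confidence]
    kept.sort(key=lambda t: t.confidence, reverse=True)
    return kept[:max_topics]


# Made-up sample values, not real library output.
topics = [
    TopicSketch(name="budget analysis", confidence=0.71),
    TopicSketch(name="quarterly report", confidence=0.92, keywords=["report", "Friday"]),
    TopicSketch(name="small talk", confidence=0.30),
]
selected = select_topics(topics, min_confidence=0.6, max_topics=5)
print([t.name for t in selected])
```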