What is RAG?
Retrieval-Augmented Generation (RAG) is a technique that enhances Large Language Models (LLMs) by providing them with relevant information from external knowledge sources. Instead of relying solely on the model’s training data, RAG systems:
- Retrieve relevant information from a knowledge base
- Augment the user’s query with this context
- Generate an informed response using an LLM
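The three steps above can be illustrated with a toy, dependency-free sketch. This is not Mini RAG's API: the function names (`embed`, `retrieve`, `augment`, `generate`) are hypothetical, the "embedding" is a plain bag-of-words count rather than a neural model, and `generate` is a placeholder where a real LLM call would go.

```python
import math
import re
from collections import Counter

# Toy knowledge base; a real system would hold chunks of indexed documents.
KNOWLEDGE_BASE = [
    "Milvus is a vector database used for similarity search.",
    "Chunking splits long documents into smaller passages.",
    "Embeddings map text to vectors that capture meaning.",
]


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector (not a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Step 1: rank knowledge-base entries by similarity to the query."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]


def augment(query: str, context: list[str]) -> str:
    """Step 2: build a prompt that places the retrieved context before the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Step 3 placeholder: a real system would send the prompt to an LLM here."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


question = "What is Milvus used for?"
context = retrieve(question)
prompt = augment(question, context)
answer = generate(prompt)
```

The point of the sketch is the data flow: the model never sees the whole knowledge base, only the query plus the few passages that retrieval ranked highest.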
Mini RAG Architecture
Mini RAG follows a modular, pipeline-based architecture that makes it easy to understand, customize, and extend:
Core Components
Document Loader
Load and parse documents from multiple formats (PDF, DOCX, images, etc.)
Chunker
Split documents into optimal chunks for embedding and retrieval
Embedding Model
Convert text into vector embeddings for semantic search
Vector Store
Store and search embeddings using Milvus
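To make the role of each component concrete, here is a minimal sketch that chains simplified stand-ins for all four. Every name in it (`load_document`, `chunk_text`, `embed`, `InMemoryVectorStore`, `index_document`) is hypothetical and chosen for illustration; the real library uses Milvus for storage and a real embedding model, whereas this sketch uses a hashed bag-of-words vector and a brute-force in-memory list.

```python
import hashlib
import math
from dataclasses import dataclass, field


def load_document(raw: bytes) -> str:
    """Loader stand-in: decode plain text. Real loaders parse PDF, DOCX, etc."""
    return raw.decode("utf-8")


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Fixed-size character chunks that overlap so context spans boundaries."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text: str, dim: int = 64) -> list[float]:
    """Embedding stand-in: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


@dataclass
class InMemoryVectorStore:
    """Vector store stand-in: a Python list searched by brute force."""
    rows: list = field(default_factory=list)

    def insert(self, vector: list[float], text: str) -> None:
        self.rows.append((vector, text))

    def search(self, query_vector: list[float], top_k: int = 3) -> list[tuple[float, str]]:
        # Vectors are normalized, so the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(query_vector, vec)), text)
            for vec, text in self.rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


def index_document(raw: bytes, store: InMemoryVectorStore) -> int:
    """Load, chunk, embed, and store one document; return the chunk count."""
    chunks = chunk_text(load_document(raw))
    for chunk in chunks:
        store.insert(embed(chunk), chunk)
    return len(chunks)


store = InMemoryVectorStore()
document = (
    b"Mini RAG loads documents, splits them into chunks, "
    b"embeds each chunk, and stores the vectors for search."
)
count = index_document(document, store)
hits = store.search(embed("stores the vectors"), top_k=1)
```

Each stage only depends on the output type of the previous one (bytes, text, chunks, vectors), which is what makes the stages individually replaceable.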
The RAG Pipeline
1. Indexing Phase
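When you index a document, Mini RAG passes it through the core components in order: the document is loaded and parsed, split into chunks, converted to embeddings, and stored in the vector store.
2. Query Phase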
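When you query the system, Mini RAG retrieves the most relevant chunks from the vector store, augments the query with that context, and generates a response using an LLM.
Modular Design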
One of Mini RAG’s strengths is its modularity. You can:
- Use individual components
- Mix and match
- Build custom pipelines
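The sketch below shows one way this kind of modularity is commonly achieved in Python: components agree on a small interface, so any implementation can be used alone or swapped into a pipeline. The classes here (`Chunker`, `SentenceChunker`, `ParagraphChunker`, `Pipeline`) and the `length_embedding` function are hypothetical illustrations, not Mini RAG's actual classes.

```python
from typing import Callable, Protocol


class Chunker(Protocol):
    """Any object with a matching split() method counts as a chunker."""

    def split(self, text: str) -> list[str]: ...


class SentenceChunker:
    """Naive splitter: one chunk per period-terminated sentence."""

    def split(self, text: str) -> list[str]:
        return [s.strip() + "." for s in text.split(".") if s.strip()]


class ParagraphChunker:
    """One chunk per blank-line-separated paragraph."""

    def split(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


class Pipeline:
    """Custom pipeline assembled from whichever components are passed in."""

    def __init__(self, chunker: Chunker, embed: Callable[[str], list[float]]):
        self.chunker = chunker
        self.embed = embed

    def run(self, text: str) -> list[tuple[str, list[float]]]:
        return [(chunk, self.embed(chunk)) for chunk in self.chunker.split(text)]


def length_embedding(text: str) -> list[float]:
    """Trivial stand-in for an embedding model: a 1-dimensional vector."""
    return [float(len(text))]


# Use an individual component on its own.
sentences = SentenceChunker().split("One. Two.")

# Mix and match: same pipeline class, different chunker.
by_sentence = Pipeline(SentenceChunker(), length_embedding).run("One. Two.")
by_paragraph = Pipeline(ParagraphChunker(), length_embedding).run("A\n\nBB")
```

Because `Chunker` is a structural `Protocol`, a custom chunker does not need to inherit from anything; it only needs a compatible `split` method.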
Configuration-Based API
Mini RAG uses a clean, configuration-based API that organizes settings into logical groups.
Benefits
Better Organization
Related settings grouped together logically
Type Safety
Validated with Pydantic dataclasses
Easy Maintenance
Change one config without affecting others
Clear Code
Self-documenting configuration objects
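As an illustration of grouped, validated configuration, the sketch below uses the standard library's `dataclasses` with manual checks in `__post_init__` so that it runs without extra dependencies. This is a swapped-in technique: Mini RAG validates with Pydantic, which performs such checks declaratively. The class and field names (`ChunkingConfig`, `RetrievalConfig`, `PipelineConfig`, `chunk_size`, `chunk_overlap`, `top_k`) and their defaults are assumptions made for the example, not the library's real settings.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChunkingConfig:
    """Settings that only concern how text is split (hypothetical fields)."""
    chunk_size: int = 512
    chunk_overlap: int = 64

    def __post_init__(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")


@dataclass(frozen=True)
class RetrievalConfig:
    """Settings that only concern search (hypothetical fields)."""
    top_k: int = 5

    def __post_init__(self) -> None:
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")


@dataclass(frozen=True)
class PipelineConfig:
    """Top-level object that groups the per-component configs."""
    chunking: ChunkingConfig = field(default_factory=ChunkingConfig)
    retrieval: RetrievalConfig = field(default_factory=RetrievalConfig)


# Override one group; the other keeps its defaults.
config = PipelineConfig(retrieval=RetrievalConfig(top_k=10))
```

Invalid values fail at construction time, for example `ChunkingConfig(chunk_size=100, chunk_overlap=100)` raises `ValueError`, so a misconfiguration surfaces before any document is processed.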
Key Design Principles
Simplicity First
Mini RAG prioritizes ease of use. Get started with just a few lines of code, then customize as needed.
Production Ready
Built with production use cases in mind: error handling, retries, timeouts, observability, and comprehensive configuration.
Modular & Extensible
Use the full pipeline or individual components. Easy to extend with custom implementations.
Pythonic API
Clean, intuitive API that follows Python best practices and conventions.
Type Safe
Leverages Pydantic for data validation and type safety throughout the library.
Understanding the Response
When you query Mini RAG, you get a comprehensive response object.
Next Steps
Document Loading
Learn how to load documents from various formats
Chunking
Understand text chunking strategies
Embeddings
Explore embedding generation options
Vector Store
Master vector storage and search
