Overview
Query rewriting is an agentic feature that automatically generates multiple variations of your query to improve retrieval coverage. Different phrasings can retrieve different relevant chunks, leading to more comprehensive answers.
How It Works
When enabled, Mini RAG:
- Takes your original query
- Generates 2-3 variations using an LLM
- Embeds all queries (original + variations)
- Searches with each embedding
- Combines and deduplicates results
- Re-ranks if enabled
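The steps above can be sketched in a few lines. This is a minimal illustration, not Mini RAG's implementation: `rewrite_query` is a hypothetical stand-in for the LLM call and returns canned paraphrases, and `search` swaps embedding similarity for plain keyword overlap so the example runs without a model or a vector store. All names here are assumptions made for the sketch.

```python
import re
from typing import Dict, List, Set, Tuple

Chunk = Tuple[str, str]  # (chunk_id, text)

STOPWORDS = {"what", "is", "the", "are", "for", "a", "to", "in", "was", "this"}

CHUNKS: List[Chunk] = [
    ("c1", "The annual budget was approved in March."),
    ("c2", "Funding allocation for research increased this year."),
    ("c3", "Financial resources are limited for new hires."),
    ("c4", "The office moved to a new building."),
]


def tokens(text: str) -> Set[str]:
    """Lowercase words with a few stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def rewrite_query(query: str) -> List[str]:
    """Hypothetical stand-in for the LLM: returns canned paraphrases."""
    canned = {
        "what is the budget?": [
            "what is the funding allocation?",
            "what financial resources are available?",
        ]
    }
    return canned.get(query.lower().strip(), [])


def search(query: str, chunks: List[Chunk], top_k: int = 2) -> List[Tuple[str, float]]:
    """Toy retriever: score is the fraction of query words found in the chunk."""
    q_words = tokens(query)
    if not q_words:
        return []
    scored = []
    for chunk_id, text in chunks:
        overlap = len(q_words & tokens(text))
        if overlap:
            scored.append((chunk_id, overlap / len(q_words)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def retrieve_with_rewriting(query: str, chunks: List[Chunk], top_k: int = 2) -> List[Tuple[str, float]]:
    """Search with the original query plus its variations, then merge and deduplicate."""
    queries = [query] + rewrite_query(query)
    best: Dict[str, float] = {}
    for q in queries:
        for chunk_id, score in search(q, chunks, top_k):
            # Deduplicate by chunk id, keeping the best score seen so far.
            best[chunk_id] = max(score, best.get(chunk_id, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


print(search("What is the budget?", CHUNKS))
print(retrieve_with_rewriting("What is the budget?", CHUNKS))
```

With this toy data the original query only reaches the chunk that contains "budget", while the merged result also includes the chunks about funding allocation and financial resources.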
Quick Start
When to Use
Use Query Rewriting When
- Queries may be ambiguous
- Multiple phrasings possible
- Need comprehensive coverage
- Domain has varied terminology
- Users ask questions naturally
Skip Query Rewriting When
- Queries are very specific
- Speed is critical
- Cost is a concern
- Simple factual lookups
- Technical exact matches
Configuration
Enable/Disable
With Other Features
Examples
Example 1: Natural Questions
Example 2: Technical Terms
Example 3: Ambiguous Queries
Benefits
Improved Recall
Find more relevant content. Different phrasings retrieve different chunks, improving coverage:
- Original: Finds chunks with “budget”
- Variation 1: Finds chunks with “funding allocation”
- Variation 2: Finds chunks with “financial resources”
- Combined: More comprehensive results
Handle Ambiguity
Clarify vague queries. Each variation can target a different aspect of the query.
Domain Adaptation
Match domain terminology. The LLM can rephrase queries using domain-specific terms.
Natural Language
Handle conversational queries.
Performance Impact
Speed
| Configuration | Time | Impact |
|---|---|---|
| No rewriting | 100ms | Baseline |
| With rewriting | 250ms | +150ms (2-3 extra queries) |
Approximate sources of the added latency:
- LLM query rewriting: ~50ms
- Additional embeddings: ~30ms
- Extra searches: ~70ms per variation
Cost
Query rewriting uses your LLM to generate variations, so each rewritten query adds LLM usage on top of retrieval.
Quality
Typical improvements with query rewriting:
- Recall: +15-30% (finds more relevant chunks)
- Answer Quality: +10-20% (more comprehensive context)
- Edge Cases: +30-50% (handles ambiguous queries better)
Best Practices
Balance Speed vs Quality
Choose based on use case:
Combine with Reranking
Optimal pipeline:
- Query rewriting generates variations
- Each variation retrieves chunks
- Combine and deduplicate results
- Rerank for best quality
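The last two steps of this pipeline can be sketched as follows. The code assumes each variation has already returned a list of `(chunk_id, text)` hits. `word_overlap` is a placeholder scorer invented for this example. In practice the scoring function would be whatever reranker you have configured.

```python
from typing import Callable, Iterable, List, Tuple

Hit = Tuple[str, str]  # (chunk_id, text)


def merge_and_dedupe(result_lists: Iterable[List[Hit]]) -> List[Hit]:
    """Concatenate per-variation results, keeping the first copy of each chunk id."""
    seen = set()
    merged: List[Hit] = []
    for hits in result_lists:
        for chunk_id, text in hits:
            if chunk_id not in seen:
                seen.add(chunk_id)
                merged.append((chunk_id, text))
    return merged


def rerank(
    query: str,
    hits: List[Hit],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Hit]:
    """Order merged hits by score_fn(query, text), highest first, and keep top_k."""
    return sorted(hits, key=lambda hit: score_fn(query, hit[1]), reverse=True)[:top_k]


def word_overlap(query: str, text: str) -> float:
    """Placeholder scorer: number of lowercase words shared by query and text."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


per_variation = [
    [("a", "budget report 2024"), ("b", "team lunch menu")],
    [("a", "budget report 2024"), ("c", "annual budget and funding report")],
]
merged = merge_and_dedupe(per_variation)
top = rerank("annual budget report", merged, word_overlap, top_k=2)
print([chunk_id for chunk_id, _ in top])
```

Reranking against the original query, rather than against each variation, keeps the final ordering tied to what the user actually asked.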
Monitor Variations
Check what’s being generated:
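One library-agnostic way to do this is to wrap whatever function produces the variations and log its output. The sketch below assumes the rewriter is a plain callable that takes a query and returns a list of strings. `with_variation_logging` and the logger name are hypothetical names chosen for this example, not part of Mini RAG.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("query_rewriting")


def with_variation_logging(rewriter: Callable[[str], List[str]]) -> Callable[[str], List[str]]:
    """Wrap a rewriter so every call logs the query and the variations it produced."""

    def wrapped(query: str) -> List[str]:
        variations = rewriter(query)
        logger.info("query=%r produced %d variations", query, len(variations))
        for i, variation in enumerate(variations, start=1):
            logger.info("  variation %d: %s", i, variation)
        return variations

    return wrapped


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # A fake rewriter stands in for the LLM here.
    rewrite = with_variation_logging(lambda q: [q + " (rephrased)"])
    rewrite("What is the budget?")
```

Reviewing these logs for a sample of real queries is a quick way to spot variations that drift away from the original intent.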
A/B Testing
Compare with and without:
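A small evaluation harness makes the comparison concrete. The sketch below assumes you have a labelled set mapping each query to the ids of its relevant chunks, plus two retrieval callables (for example, one with rewriting disabled and one with it enabled) that return chunk ids. The function names are hypothetical.

```python
from typing import Callable, Dict, List, Set


def recall(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of relevant chunk ids that appear in the retrieved list."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def compare(
    eval_set: Dict[str, Set[str]],
    baseline: Callable[[str], List[str]],
    candidate: Callable[[str], List[str]],
) -> Dict[str, float]:
    """Average recall of two retrieval functions over the same labelled queries."""
    if not eval_set:
        raise ValueError("eval_set must contain at least one query")
    n = len(eval_set)
    return {
        "baseline": sum(recall(baseline(q), rel) for q, rel in eval_set.items()) / n,
        "candidate": sum(recall(candidate(q), rel) for q, rel in eval_set.items()) / n,
    }


# Toy stand-ins for "without rewriting" and "with rewriting".
eval_set = {"q1": {"a", "b"}, "q2": {"c"}}
without_rewriting = lambda q: ["a"] if q == "q1" else ["c"]
with_rewriting = lambda q: ["a", "b"] if q == "q1" else ["c"]
print(compare(eval_set, without_rewriting, with_rewriting))
```

Measuring latency and LLM cost over the same query set gives the other half of the trade-off.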
Comparison
Without Query Rewriting
With Query Rewriting
Advanced Usage
Inspect Query Variations
Custom Query Preprocessing
Combine with Metadata Filtering
Troubleshooting
Poor quality variations
Solution: The variations are generated by your LLM. Try:
- Use a more capable LLM
- Provide more context in queries
- Check your LLM configuration
Too slow
Solution: Disable query rewriting or optimize:
High costs
Solution: Query rewriting adds LLM calls. To reduce cost:
- Disable for simple queries
- Use a cheaper LLM
- Cache common queries
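Caching can be sketched with the standard library alone. The example assumes the rewriter is a plain callable from a query string to a list of variations. `cached_rewriter` is a hypothetical helper, not a Mini RAG function. Note that it normalizes case and whitespace before calling the rewriter, which may not suit case-sensitive queries.

```python
from functools import lru_cache
from typing import Callable, List, Tuple


def cached_rewriter(
    rewriter: Callable[[str], List[str]],
    maxsize: int = 1024,
) -> Callable[[str], Tuple[str, ...]]:
    """Return a rewriter that calls the underlying LLM once per normalized query."""

    @lru_cache(maxsize=maxsize)
    def _cached(normalized: str) -> Tuple[str, ...]:
        # Tuples are immutable, so a cached result cannot be modified by callers.
        return tuple(rewriter(normalized))

    def rewrite(query: str) -> Tuple[str, ...]:
        # Normalize case and whitespace so trivially different queries share an entry.
        return _cached(" ".join(query.lower().split()))

    return rewrite


calls = []


def fake_llm_rewriter(query: str) -> List[str]:
    """Stand-in for the LLM call that records how often it is invoked."""
    calls.append(query)
    return [query + "?"]


rewrite = cached_rewriter(fake_llm_rewriter)
rewrite("What is  X")
rewrite("what is x")
print(len(calls))
```

An in-process cache like this only helps within a single process. A shared cache would be needed if several workers serve queries.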
When to Disable
Consider disabling query rewriting when:
- Simple factual queries: “What is X?” where X is specific
- Speed is critical: Real-time systems with tight latency requirements
- Cost constraints: High query volume with budget limits
- Technical queries: Queries with specific technical terms
- Exact match needs: Looking for specific keywords or phrases
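If you toggle rewriting per request, some of these criteria can be approximated with a simple gate. The sketch below is purely illustrative: `should_rewrite` is a hypothetical helper, the regular expressions are rough guesses at what an exact-match or technical query looks like, and the 250 ms threshold simply reuses the with-rewriting figure from the Speed table.

```python
import re

# Rough patterns for queries that want exact matching (assumptions, tune for your data).
QUOTED_PHRASE = re.compile(r'"[^"]+"')
CODE_LIKE = re.compile(r"[A-Za-z]+_[A-Za-z_]+|\w+\(\)|[A-Z]{2,}-\d+")


def should_rewrite(query: str, latency_budget_ms: float) -> bool:
    """Illustrative gate: skip rewriting for latency-critical or exact-match queries."""
    if latency_budget_ms < 250:
        # Not enough headroom for the extra LLM call and searches.
        return False
    if QUOTED_PHRASE.search(query):
        # A quoted phrase suggests the user wants that exact wording.
        return False
    if CODE_LIKE.search(query):
        # Identifiers, function calls, or ticket ids are best matched verbatim.
        return False
    return True


print(should_rewrite("how do we handle money for next year", 500))
print(should_rewrite('find "net revenue"', 500))
```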
