Overview
LLM Reranker uses your configured language model to score and rerank retrieved chunks. It’s simple to set up and doesn’t require additional APIs.Configuration
Basic Usage
How It Works
The LLM reranker:- Receives the query and retrieved chunks
- Asks the LLM to score each chunk’s relevance (0-10)
- Reranks chunks by score
- Returns the top chunks
Prompt Example
Direct Usage
Complete Example
Configuration Options
The LLM reranker can be configured throughLLMConfig:
Performance
Speed
- Fast LLMs (gpt-3.5-turbo, gpt-4o-mini): 500-1000ms for 10 chunks
- Slower LLMs (gpt-4): 1000-2000ms for 10 chunks
- Local LLMs: Varies widely (500-5000ms)
Cost
Depends on your LLM pricing and number of chunks:Quality
- GPT-4: Excellent (comparable to Cohere)
- GPT-4o-mini: Very Good
- GPT-3.5-turbo: Good
- Local models: Varies
Best Practices
Use Lower Temperature
Use Lower Temperature
Consistent scoring requires lower temperature:
Fast LLMs for Reranking
Fast LLMs for Reranking
Use faster models for reranking:
Chunk Truncation
Chunk Truncation
Long chunks are automatically truncated to save tokens:
Balance top_k
Balance top_k
More chunks = more LLM tokens:
Advantages
✅ Simple Setup: No additional APIs needed✅ Uses Existing LLM: Leverages your configured model
✅ Good Quality: Especially with GPT-4/4o
✅ Flexible: Works with any OpenAI-compatible API
Limitations
❌ Token Cost: Uses LLM tokens for each reranking❌ Latency: Slower than specialized rerankers
❌ Consistency: Scoring can vary between runs
❌ Not Optimized: General LLM vs specialized reranker
When to Use
Use LLM reranker when:- ✅ You’re already using a good LLM (GPT-4, GPT-4o-mini)
- ✅ You want simple setup with no extra APIs
- ✅ You don’t need the absolute fastest reranking
- ✅ Token cost is acceptable
- ❌ You need maximum quality → Use Cohere
- ❌ You need maximum speed → Use Sentence Transformer with GPU
- ❌ You want to minimize LLM costs → Use Sentence Transformer
- ❌ You need local/private → Use Sentence Transformer
Comparison with Other Rerankers
| Feature | LLM | Cohere | Sentence Transformer |
|---|---|---|---|
| Quality | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Speed | ⚡⚡ | ⚡⚡⚡ | ⚡⚡⚡⚡ |
| Setup | ✅ Easy | ✅ Easy | ⚠️ Moderate |
| Cost | 💰💰 LLM tokens | 💰💰 API | 💰 Free |
| Privacy | ☁️ Cloud | ☁️ Cloud | 🔒 Local |
Troubleshooting
Inconsistent Scores
Inconsistent Scores
Lower the temperature:
Too Slow
Too Slow
Use a faster model:Or reduce chunks:
High Costs
High Costs
Consider alternatives:
- Sentence Transformer (local, free)
- Reduce
top_kto rerank fewer chunks - Use LLM reranking selectively
