Overview

Cohere Reranker uses Cohere's hosted reranking models to score retrieved documents by relevance to the query, providing state-of-the-art reranking quality.

Setup

Install Cohere

Cohere is included with Mini RAG:
uv add mini-rag
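
To confirm the dependency is available after installation, importing the reranker class (the same import used under Direct Usage below) should succeed:
python -c "from mini.reranker import CohereReranker; print('Cohere reranker available')"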

Get API Key

  1. Sign up at Cohere
  2. Get your API key from the dashboard
  3. Add to your environment:
COHERE_API_KEY=your-cohere-api-key
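
If the key lives in a .env file, it can be loaded at startup with python-dotenv, the same pattern used in the Complete Example below:
import os
from dotenv import load_dotenv

load_dotenv()  # reads COHERE_API_KEY from a local .env file
assert os.getenv("COHERE_API_KEY"), "COHERE_API_KEY is not set"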

Configuration

Basic Usage

from mini import AgenticRAG, RerankerConfig
import os

rag = AgenticRAG(
    vector_store=vector_store,
    embedding_model=embedding_model,
    reranker_config=RerankerConfig(
        type="cohere",
        kwargs={
            "api_key": os.getenv("COHERE_API_KEY")
        }
    )
)
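
The vector_store and embedding_model above are assumed to exist already; a minimal construction, following the Complete Example further down, would be:
import os
from mini import VectorStore, EmbeddingModel

# Milvus-backed vector store and default embedding model, as in the Complete Example
vector_store = VectorStore(
    uri=os.getenv("MILVUS_URI"),
    token=os.getenv("MILVUS_TOKEN"),
    collection_name="documents",
    dimension=1536
)
embedding_model = EmbeddingModel()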

Complete Configuration

reranker_config = RerankerConfig(
    type="cohere",
    kwargs={
        "api_key": os.getenv("COHERE_API_KEY"),
        "model": "rerank-english-v3.0",   # see Available Models below
        "max_chunks_per_doc": None        # optional limit on chunks scored per document
    }
)

Available Models

rerank-english-v3.0
  • Best for: English text
  • Quality: Highest
  • Speed: Fast
  • Recommended for: Production English applications

rerank-multilingual-v3.0
  • Best for: Multiple languages
  • Quality: High
  • Speed: Fast
  • Recommended for: International applications
  • Supports: English, French, Spanish, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and more

v2.0 models (legacy)
  • Best for: Legacy applications
  • Quality: Good
  • Speed: Fast
  • Recommended for: Existing integrations
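
For international content, select the multilingual model through the same kwargs field used above:
reranker_config = RerankerConfig(
    type="cohere",
    kwargs={
        "api_key": os.getenv("COHERE_API_KEY"),
        "model": "rerank-multilingual-v3.0"   # multilingual reranking
    }
)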

Direct Usage

Use the Cohere reranker directly:
import os

from mini.reranker import CohereReranker

# Initialize
reranker = CohereReranker(
    api_key=os.getenv("COHERE_API_KEY"),
    model="rerank-english-v3.0"
)

# Rerank documents
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of AI...",
    "Python is a programming language...",
    "Deep learning uses neural networks..."
]

results = reranker.rerank(query, documents, top_k=2)

for result in results:
    print(f"Score: {result.score:.3f}")
    print(f"Document: {result.document[:100]}...")

Complete Example

import os
from mini import (
    AgenticRAG,
    LLMConfig,
    RetrievalConfig,
    RerankerConfig,
    EmbeddingModel,
    VectorStore
)
from dotenv import load_dotenv

load_dotenv()

# Initialize RAG with Cohere reranking
rag = AgenticRAG(
    vector_store=VectorStore(
        uri=os.getenv("MILVUS_URI"),
        token=os.getenv("MILVUS_TOKEN"),
        collection_name="documents",
        dimension=1536
    ),
    embedding_model=EmbeddingModel(),
    llm_config=LLMConfig(model="gpt-4o-mini"),
    retrieval_config=RetrievalConfig(
        top_k=10,
        rerank_top_k=3,
        use_reranking=True
    ),
    reranker_config=RerankerConfig(
        type="cohere",
        kwargs={
            "api_key": os.getenv("COHERE_API_KEY"),
            "model": "rerank-english-v3.0"
        }
    )
)

# Index and query
rag.index_document("document.pdf")
response = rag.query("What is the main topic?")

print(response.answer)

Pricing

Cohere reranking pricing (as of 2024):
  • Search Units: Charged per 1000 search units
  • Search Unit: 1 query + 1 document to rerank
  • Example: 1 query with 10 documents = 10 search units
Typical costs:
  • Free tier: Limited searches
  • Paid: ~$1-2 per 1000 queries (10 docs each)
Check Cohere Pricing for current rates.
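
Under the search-unit model described above, costs can be estimated with simple arithmetic (the per-unit rate below is an illustrative assumption; use the figure from Cohere's pricing page):
# Rough cost estimate (illustrative numbers only)
queries_per_day = 1000
docs_per_query = 10                 # documents sent to the reranker per query
price_per_1000_units = 1.00         # assumed USD rate; check Cohere Pricing

search_units = queries_per_day * docs_per_query
daily_cost = search_units / 1000 * price_per_1000_units
print(f"{search_units} search units/day, about ${daily_cost:.2f}/day")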

Best Practices

Choose a model based on your content:
  • English only: Use rerank-english-v3.0
  • Multiple languages: Use rerank-multilingual-v3.0
  • Best quality: Always use v3.0 models
Retrieve more initially, rerank to fewer:
RetrievalConfig(
    top_k=15,        # Cast wide net
    rerank_top_k=3   # Keep only best
)
Handle rate limits gracefully:
import time

try:
    response = rag.query(question)
except Exception as e:
    if "rate_limit" in str(e).lower():
        # Handle rate limit
        time.sleep(1)
        response = rag.query(question)
    else:
        raise
Track API usage in the Cohere dashboard to manage costs.
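
For a local sanity check alongside the dashboard, a thin wrapper can tally search units per call (a sketch; MeteredReranker is a hypothetical helper and the rerank signature follows the Direct Usage example above):
class MeteredReranker:
    """Counts search units (one per document per query) for each rerank call."""

    def __init__(self, reranker):
        self.reranker = reranker
        self.search_units = 0

    def rerank(self, query, documents, **kwargs):
        self.search_units += len(documents)   # 1 query x N documents = N search units
        return self.reranker.rerank(query, documents, **kwargs)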

Advantages

  • Highest Quality: Best-in-class reranking
  • Fast: Low latency (~50-100ms)
  • Easy Setup: Simple API integration
  • Multilingual: Supports many languages
  • Maintained: Continuously improved by Cohere

Limitations

  • Cost: Requires API subscription
  • Cloud Only: Not available for local deployment
  • API Dependency: Requires internet connection
  • Rate Limits: Subject to API rate limits
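
Because the reranker depends on an external API, one mitigation is to fall back to plain vector retrieval when no key is configured (a sketch, assuming the use_reranking flag in RetrievalConfig controls whether the reranker is invoked):
import os
from mini import RetrievalConfig

# Only enable reranking when a Cohere key is actually available
retrieval_config = RetrievalConfig(
    top_k=10,
    rerank_top_k=3,
    use_reranking=bool(os.getenv("COHERE_API_KEY"))
)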

Troubleshooting

Ensure your API key is set:
import os
print(os.getenv("COHERE_API_KEY"))  # should print your key, not None
Implement backoff:
from time import sleep

for attempt in range(3):
    try:
        results = reranker.rerank(query, docs)
        break
    except Exception:
        if attempt == 2:
            raise              # give up after the final retry
        sleep(2 ** attempt)    # exponential backoff: 1s, then 2s
Use correct model names:
  • rerank-english-v3.0
  • rerank-multilingual-v3.0
  • rerank-v3 ❌ (incorrect)

See Also