ColPali
Experimental · 10 credits
Production Recommendation
This is a direct endpoint for development and testing. For production workloads, use the Data Intelligence Pipeline, which provides structured Data Packages with quality metrics, is async by default, and is covered by Enterprise SLAs.
Overview
ColPali combines vision and language models for searching documents where visual context matters. It is best suited to documents with charts, diagrams, tables, and complex formatting.
Key features:
- Vision-language embeddings (image + text)
- Document page understanding
- Layout-aware retrieval
- Works with queries and document images
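To make "layout-aware retrieval" concrete: ColPali-style models represent a page as many patch vectors and a query as many token vectors, and the ColPali paper scores a query against a page by late interaction (MaxSim). For each query vector, take the best-matching page vector, then sum those maxima. The sketch below shows that scoring in plain Python on toy vectors. It assumes this endpoint's embeddings are meant to be compared this way, which the page does not state, and `maxsim_score` and `rank_pages` are hypothetical helper names, not part of the Latence SDK.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: sum over query vectors of the best page match."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: Sequence[Vector], pages: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return page ids sorted from highest to lowest MaxSim score."""
    scores = {pid: maxsim_score(query_vecs, vecs) for pid, vecs in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-dimensional data standing in for real [patches, 128] embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[0.9, 0.1], [0.2, 0.8]],
    "page_b": [[0.1, 0.1], [0.3, 0.2]],
}
print(rank_pages(query, pages))
```

In practice the query vectors would come from a call with `is_query=true` and the page vectors from calls with `is_query=false`. For larger collections, a vectorized implementation (for example with NumPy) would be the natural next step.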
API Reference
https://api.latence.ai/api/v1/colpali/embed
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | | — | Query text (for is_query=true) |
| image | string | | — | Base64-encoded image data |
| is_query | boolean | | — | True for text queries, false for images |
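The parameters above can be assembled into a request body as follows. This is a minimal sketch using only the standard library: the field names and endpoint URL come from this page, while the HTTP method (POST), the JSON content type, the Bearer authorization header, and the rule that exactly one of `text` or `image` is sent are all assumptions. `build_payload` and `build_request` are hypothetical helpers. The request is prepared but not sent.

```python
import base64
import json
import urllib.request
from typing import Optional

ENDPOINT = "https://api.latence.ai/api/v1/colpali/embed"


def build_payload(text: Optional[str] = None, image_bytes: Optional[bytes] = None) -> dict:
    """Build a request body from exactly one of text or raw image bytes."""
    if (text is None) == (image_bytes is None):
        raise ValueError("provide exactly one of text or image_bytes")
    if text is not None:
        # Text queries are sent with is_query=true.
        return {"text": text, "is_query": True}
    # Images are sent as Base64-encoded data with is_query=false.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"image": encoded, "is_query": False}


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP request."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",  # assumed HTTP method
    )


query_payload = build_payload(text="Find invoices from 2024")
request = build_request(query_payload, api_key="YOUR_API_KEY")
print(request.full_url, query_payload)
```

Sending the prepared request would be a call to `urllib.request.urlopen(request)`. Check the authentication scheme against your account documentation before relying on the header shown here.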
Response Fields
| Field | Type | Description |
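|---|---|---|
| embeddings | array | Embedding vectors as float arrays |
| shape | array | Dimensions of the embeddings, [patches, 128] |
| encoding_format | string | Encoding of the embedding values ("float" in the example below) |
| success | boolean | Whether the request succeeded |
| usage | object | Usage details; `credits` is the number of credits charged |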
Response Example
```json
{
  "embeddings": [[...], [...], ...],
  "shape": [196, 128],
  "encoding_format": "float",
  "success": true,
  "usage": { "credits": 1.0 }
}
```
Code Examples
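Before storing vectors from a response like the one in the Response Example, it can help to confirm that `embeddings` agrees with the reported `shape`. The sketch below does that on a small hand-written payload. `check_embedding_response` is a hypothetical helper, and the sample body only mirrors the field names shown on this page. It is not real API output.

```python
import json


def check_embedding_response(body: str) -> list:
    """Parse a response body and verify embeddings match the reported shape."""
    data = json.loads(body)
    if not data.get("success"):
        raise RuntimeError("request did not succeed")
    rows, dim = data["shape"]
    embeddings = data["embeddings"]
    if len(embeddings) != rows:
        raise ValueError(f"expected {rows} vectors, got {len(embeddings)}")
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError(f"expected every vector to have {dim} values")
    return embeddings


# Hand-written sample with the same fields as the response example.
sample = json.dumps({
    "embeddings": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "shape": [3, 2],
    "encoding_format": "float",
    "success": True,
    "usage": {"credits": 1.0},
})
vectors = check_embedding_response(sample)
print(len(vectors), len(vectors[0]))
```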
```python
from latence import Latence

client = Latence(api_key="YOUR_API_KEY")

# Text query embedding
result = client.experimental.colpali.embed(
    text="Find invoices from 2024",
    is_query=True
)

# Or embed a document image from file
result = client.experimental.colpali.embed(
    image_path="/path/to/document_page.png",
    is_query=False  # For indexing documents
)

print(result.embeddings)  # Float arrays
print(result.shape)       # [patches, 128]
```
Explore Tutorials & Notebooks
Deep-dive examples and interactive notebooks in our GitHub repository
Looking for production-grade processing?
The Data Intelligence Pipeline chains services automatically and returns structured Data Packages.