Embeddings¶
Convert text into numerical vectors for semantic search.
Quick start¶
By default, Lumen uses simple Numpy-based embeddings:
import lumen.ai as lmai
ui = lmai.ExplorerUI(data='penguins.csv')
ui.servable()
For better semantic search, use OpenAI:
import lumen.ai as lmai
from lumen.ai.embeddings import OpenAIEmbeddings
from lumen.ai.vector_store import DuckDBVectorStore
vector_store = DuckDBVectorStore(embeddings=OpenAIEmbeddings())
ui = lmai.ExplorerUI(data='penguins.csv', vector_store=vector_store)
ui.servable()
See Vector Stores for how to use embeddings with storage backends.
Providers¶
OpenAI¶
from lumen.ai.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small") # Fast
# embeddings = OpenAIEmbeddings(model="text-embedding-3-large") # Higher quality
Setup: set the OPENAI_API_KEY environment variable so the embeddings client can authenticate.
See LLM Providers for more on OpenAI configuration.
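For example, a minimal sketch that sets the key from Python (the placeholder value is illustrative; you can equally export OPENAI_API_KEY in your shell):
import os

# Placeholder key; the embeddings client picks up OPENAI_API_KEY from the environment
os.environ["OPENAI_API_KEY"] = "..."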
Azure OpenAI¶
from lumen.ai.embeddings import AzureOpenAIEmbeddings
embeddings = AzureOpenAIEmbeddings(
api_key='...',
endpoint='https://your-resource.openai.azure.com/'
)
See LLM Providers - Azure OpenAI for authentication details.
HuggingFace¶
Run locally with Sentence Transformers:
from lumen.ai.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
model="ibm-granite/granite-embedding-107m-multilingual",
device="cpu" # or "cuda" for GPU
)
Install: requires the sentence-transformers package (pip install sentence-transformers); model weights are downloaded from the Hugging Face Hub on first use.
Llama.cpp¶
Run GGUF models locally:
from lumen.ai.embeddings import LlamaCppEmbeddings
embeddings = LlamaCppEmbeddings(
model_kwargs={
"default": {
"repo_id": "Qwen/Qwen3-Embedding-4B-GGUF",
"filename": "Qwen3-Embedding-4B-Q4_K_M.gguf",
}
}
)
See LLM Providers - Llama.cpp for model configuration.
Numpy (default)¶
Hash-based embeddings for prototyping:
- ✅ No API calls, works offline
- ⚠️ Lower quality than neural embeddings
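To make the default explicit, a minimal sketch (assuming the hash-based embeddings are exposed as NumpyEmbeddings in lumen.ai.embeddings):
import lumen.ai as lmai
from lumen.ai.embeddings import NumpyEmbeddings  # assumed name of the default hash-based embeddings
from lumen.ai.vector_store import DuckDBVectorStore

# Works offline with no API key, but retrieval quality is lower than neural embeddings
vector_store = DuckDBVectorStore(embeddings=NumpyEmbeddings())
ui = lmai.ExplorerUI(data='penguins.csv', vector_store=vector_store)
ui.servable()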
Configuration¶
Chunk size¶
Control how documents are split:
vector_store = DuckDBVectorStore(
embeddings=OpenAIEmbeddings(),
chunk_size=512, # Smaller chunks = more precise
)
Guidelines:
- Small (256-512): Precise answers, higher cost
- Medium (1024): Balanced (default)
- Large (2048): Broader context, lower cost
See Vector Stores - Chunk size for implementation details.
Exclude metadata¶
Prevent fields from being embedded:
vector_store = DuckDBVectorStore(
embeddings=OpenAIEmbeddings(),
excluded_metadata=['file_size', 'upload_date']
)
When embeddings are used¶
Embeddings power three features:
Document search - Queries find semantically similar text:
See Vector Stores - Searching for query examples.
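For illustration only, a minimal standalone sketch; the add/query method names and item layout here are assumptions, and the Vector Stores guide has the authoritative API:
from lumen.ai.embeddings import OpenAIEmbeddings
from lumen.ai.vector_store import DuckDBVectorStore

vector_store = DuckDBVectorStore(embeddings=OpenAIEmbeddings())

# Index a couple of text snippets (item layout assumed: "text" plus optional "metadata")
vector_store.add([
    {"text": "Gentoo penguins have the longest tail of all penguin species.", "metadata": {"source": "notes"}},
    {"text": "Adelie penguins breed along the Antarctic coastline.", "metadata": {"source": "notes"}},
])

# A semantic query matches on meaning rather than exact keywords
results = vector_store.query("Which penguins live in Antarctica?", top_k=2)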
Table discovery - Tools find relevant tables:
from lumen.ai.tools import IterativeTableLookup
tool = IterativeTableLookup(tables=['customers', 'orders', 'products'])
See Tools - Built-in tools for tool configuration.
Contextual augmentation - Chunks get context descriptions:
See Vector Stores - Contextual augmentation for details on situate.
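A minimal sketch of enabling it, assuming situate is exposed as a boolean parameter on the vector store (check Vector Stores - Contextual augmentation for the exact usage):
from lumen.ai.embeddings import OpenAIEmbeddings
from lumen.ai.vector_store import DuckDBVectorStore

# situate=True asks an LLM to prepend a short description of the surrounding
# document to each chunk before it is embedded, so it requires LLM access and
# incurs additional API calls (parameter name assumed here)
vector_store = DuckDBVectorStore(
    embeddings=OpenAIEmbeddings(),
    situate=True,
)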
Best practices¶
Match embeddings to data (see the sketch after this list):
- English-only → sentence-transformers/all-MiniLM-L6-v2
- Multilingual → ibm-granite/granite-embedding-107m-multilingual
- Best quality → text-embedding-3-large
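As a sketch, instantiating each of these with the provider classes shown above:
from lumen.ai.embeddings import HuggingFaceEmbeddings, OpenAIEmbeddings

# English-only: small, fast local model
english = HuggingFaceEmbeddings(model="sentence-transformers/all-MiniLM-L6-v2")

# Multilingual: local multilingual model
multilingual = HuggingFaceEmbeddings(model="ibm-granite/granite-embedding-107m-multilingual")

# Best quality: API-based OpenAI model
best = OpenAIEmbeddings(model="text-embedding-3-large")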
Optimize chunk size:
- FAQ/short answers → 256-512 tokens
- General documents → 1024 tokens (default)
- Long-form content → 2048 tokens
Use situate selectively:
- Enable for technical docs, books, research papers
- Disable for simple content (FAQs, short articles)
- Requires LLM access (uses additional API calls)
See also¶
- Vector Stores - Storage and retrieval using embeddings
- LLM Providers - Configure API keys and models
- Tools - Built-in tools that use embeddings