LLM Providers¶
Configure which AI model powers Lumen.
Quick start¶
Set your API key and launch. Lumen auto-detects the provider from environment variables.
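For example, a minimal launch script (the key value is a placeholder; in practice, export the variable in your shell instead):

```python
import os
import lumen.ai as lmai

# Placeholder key; prefer exporting OPENAI_API_KEY in your shell.
os.environ["OPENAI_API_KEY"] = "sk-..."

ui = lmai.ExplorerUI(data='penguins.csv')  # provider inferred from the environment
ui.servable()
```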
Different models per agent¶
Use cheap models for simple tasks, powerful models for complex tasks:
```python
import lumen.ai as lmai

model_config = {
    "default": {"model": "gpt-5-mini"},  # Cheap for most agents
    "sql": {"model": "gpt-5.2"},         # Powerful for SQL
    "vega_lite": {"model": "gpt-5.2"},   # Powerful for charts
    "analyst": {"model": "gpt-5.2"},     # Powerful for analysis
}

llm = lmai.llm.OpenAI(model_kwargs=model_config)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```
Agent names map to model types: SQLAgent → "sql", VegaLiteAgent → "vega_lite", etc.
Configure temperature¶
Lower temperature = more deterministic. Higher = more creative.
```python
model_config = {
    "sql": {
        "model": "gpt-5.2",
        "temperature": 0.1,  # Deterministic SQL
    },
    "chat": {
        "model": "gpt-5-mini",
        "temperature": 0.4,  # Natural conversation
    },
}
```
Recommended ranges: 0.1 (SQL) to 0.4 (chat).
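As with per-agent models, pass the dict when constructing the provider, e.g. `llm = lmai.llm.OpenAI(model_kwargs=model_config)`.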
Provider setup¶
OpenAI¶
Prerequisites:
- Lumen AI installed in your Python environment
- An OpenAI API Key from the OpenAI Dashboard
Default models:
- `default`: `gpt-4.1-mini`
- `sql`: `gpt-4.1-mini`
- `vega_lite`: `gpt-4.1-mini`
- `edit`: `gpt-4.1-mini`
- `ui`: `gpt-4.1-nano`
Popular models:
- `gpt-5.2` - Best model for coding and agentic tasks
- `gpt-5-mini` - Faster, cost-efficient for well-defined tasks
- `gpt-5-nano` - Fastest, most cost-efficient
- `gpt-4.1` - Smartest non-reasoning model
Environment variables:
- `OPENAI_API_KEY`: Your OpenAI API key (required)
- `OPENAI_ORGANIZATION`: Your OpenAI organization ID (optional)
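You can also pass the key explicitly instead of relying on the environment; a minimal sketch (the key shown is a placeholder):

```python
import lumen.ai as lmai

llm = lmai.llm.OpenAI(
    api_key="sk-...",  # placeholder; prefer the OPENAI_API_KEY environment variable
    model_kwargs={"default": {"model": "gpt-5-mini"}},
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```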
Anthropic¶
Prerequisites:
- Lumen AI installed in your Python environment
- An Anthropic API Key from the Anthropic Console
Default models:
- `default`: `claude-haiku-4-5`
- `edit`: `claude-sonnet-4-5`
Popular models:
- `claude-sonnet-4-5` - Smart model for complex agents and coding
- `claude-haiku-4-5` - Fastest with near-frontier intelligence
- `claude-opus-4-5` - Premium model with maximum intelligence
Environment variables:
- `ANTHROPIC_API_KEY`: Your Anthropic API key (required)
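To select Anthropic models explicitly from Python, a sketch along these lines should work, assuming the provider class is named `AnthropicAI` (verify the exact name in `lumen.ai.llm` for your version):

```python
import lumen.ai as lmai

# Assumption: the Anthropic provider class is `AnthropicAI`.
llm = lmai.llm.AnthropicAI(
    model_kwargs={
        "default": {"model": "claude-haiku-4-5"},
        "edit": {"model": "claude-sonnet-4-5"},
    }
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```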
Google Gemini¶
Prerequisites:
- Lumen AI installed in your Python environment
- A Google AI API Key from Google AI Studio
Default models:
- `default`: `gemini-2.5-flash` (best price-performance with thinking)
- `edit`: `gemini-2.5-pro` (state-of-the-art thinking model)
Environment variables:
- `GEMINI_API_KEY`: Your Google AI API key (required)
Popular models:
- `gemini-2.5-pro` - State-of-the-art thinking model
- `gemini-2.5-flash` - Best price-performance with thinking
- `gemini-2.5-flash-lite` - Lightweight with thinking capabilities
- `gemini-2.0-flash` - Latest general-purpose model
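Because Lumen auto-detects the provider from environment variables, a minimal Gemini session only needs the key set (placeholder value shown):

```python
import os
import lumen.ai as lmai

# Placeholder key; prefer exporting GEMINI_API_KEY in your shell.
os.environ["GEMINI_API_KEY"] = "..."

ui = lmai.ExplorerUI(data='penguins.csv')  # Gemini detected from the environment
ui.servable()
```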
Mistral¶
Prerequisites:
- Lumen AI installed in your Python environment
- A Mistral API Key from the Mistral Dashboard
Default models:
- `default`: `mistral-small-latest`
- `edit`: `mistral-medium-latest`
Environment variables:
- `MISTRAL_API_KEY`: Your Mistral API key (required)
Popular models:
- `mistral-small-latest` - Cost-effective for general tasks
- `mistral-large-latest` - Advanced reasoning and complex queries
- `ministral-8b-latest` - Lightweight edge model
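As with the other providers, models can be configured from Python; this sketch assumes the provider class is named `MistralAI` (check `lumen.ai.llm` for the exact name in your version):

```python
import lumen.ai as lmai

# Assumption: the Mistral provider class is `MistralAI`.
llm = lmai.llm.MistralAI(
    model_kwargs={
        "default": {"model": "mistral-small-latest"},
        "edit": {"model": "mistral-medium-latest"},
    }
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```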
Azure OpenAI¶
Prerequisites:
- Lumen AI installed in your Python environment
- An Azure OpenAI Service resource from Azure Portal
- Azure Inference API Key and Endpoint URL from your Azure OpenAI resource
Environment variables:
- `AZUREAI_ENDPOINT_KEY`: Your Azure API key (required)
- `AZUREAI_ENDPOINT_URL`: Your Azure endpoint URL (required)
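Alternatively, you can authenticate with Azure Active Directory instead of an API key by passing a bearer-token provider through `model_kwargs`: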
```python
import lumen.ai as lmai
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

llm = lmai.llm.AzureOpenAI(
    api_version="...",
    endpoint="...",
    model_kwargs={
        "default": {
            "model": "gpt4o-mini",  # your Azure deployment name
            "azure_ad_token_provider": token_provider
        }
    }
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```
Ollama (local)¶
Prerequisites:
- Lumen AI installed in your Python environment
- Ollama installed from ollama.com
- At least one model pulled locally
Default models:
- `default`: `qwen3:8b`
Setup Ollama:
Download and run the installer from ollama.com, then:
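```bash
# Pull the default model (any model from the table below also works)
ollama pull qwen3:8b
```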
Recommended models:
| Use Case | Model | Notes |
|---|---|---|
| General purpose | `qwen3:8b` | Default - comprehensive capabilities |
| | `llama3.3:70b` | State of the art 70B performance |
| | `gemma3:12b` | Google's efficient model |
| Coding | `qwen3-coder:30b` | Specialized for code generation |
| | `qwen2.5-coder:7b` | Smaller coding model |
| Lightweight | `gemma3:12b` | High-performing, efficient |
| | `phi4:14b` | Microsoft's lightweight SOTA |
| Reasoning | `deepseek-r1:7b` | Advanced reasoning model |
| Latest | `llama4:latest` | Cutting edge model |
Run Lumen:
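A sketch, assuming the local provider class is named `Ollama` (verify in `lumen.ai.llm`) and an Ollama server is running with the model already pulled:

```python
import lumen.ai as lmai

# Assumption: the Ollama provider class is `Ollama`.
llm = lmai.llm.Ollama(model_kwargs={"default": {"model": "qwen3:8b"}})
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```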
Llama.cpp (local)¶
Prerequisites:
- Lumen AI installed in your Python environment
- Llama.cpp installed - Installation Guide
- Decent hardware (modern GPU, ARM Mac, or high-core CPU)
Default models:
- `default`: `Qwen/Qwen2.5-Coder-7B-Instruct-GGUF`
- `reasoning`: Uses default if not specified
First Run Downloads Model
The first time you use Llama.cpp, it will download the specified model, which may take some time depending on model size and your internet connection.
Run Lumen:
```python
import lumen.ai as lmai

# DeepSeek R1 with custom settings
llm = lmai.llm.LlamaCpp(
    model_kwargs={
        "default": {
            "repo_id": "bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF",
            "filename": "DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf",
            "chat_format": "qwen",
            "n_ctx": 131072
        }
    }
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```
For Larger Models
For working with larger models, consider using the Llama.cpp server with OpenAI-compatible endpoints for better performance.
LiteLLM (multi-provider)¶
Prerequisites:
- Lumen AI installed in your Python environment
- `litellm` package: `pip install litellm`
- API keys for providers you want to use
Default models:
- `default`: `gpt-4.1-mini` (cost-effective for general tasks)
- `edit`: `anthropic/claude-sonnet-4-5` (advanced reasoning)
- `sql`: `gpt-4.1-mini` (SQL query generation)
Environment variables:
Set environment variables for any provider you want to use:
- `OPENAI_API_KEY` - For OpenAI models
- `ANTHROPIC_API_KEY` - For Anthropic models
- `GEMINI_API_KEY` - For Google models
- `AZURE_API_KEY` + `AZURE_API_BASE` - For Azure
- `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` - For AWS Bedrock
- `COHERE_API_KEY` - For Cohere models
Route between providers:
```python
import lumen.ai as lmai

# Mix different providers for different tasks
llm = lmai.llm.LiteLLM(
    model_kwargs={
        "default": {"model": "gpt-4.1-mini"},              # OpenAI
        "edit": {"model": "anthropic/claude-sonnet-4-5"},  # Anthropic
        "sql": {"model": "gpt-4.1-mini"}                   # OpenAI
    }
)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```
Supported providers:
OpenAI • Anthropic • Google Gemini • Azure • AWS Bedrock • Cohere • Hugging Face • Vertex AI • And 100+ more
Model String Formats

- OpenAI: `"gpt-4.1-mini"`, `"gpt-4.1-nano"`, `"gpt-5-mini"`, `"gpt-4o"`
- Anthropic: `"anthropic/claude-sonnet-4-5"`, `"anthropic/claude-haiku-4-5"`, `"anthropic/claude-opus-4-1"`
- Google: `"gemini/gemini-2.5-flash"`, `"gemini/gemini-2.5-pro"`
- Mistral: `"mistral/mistral-medium-latest"`, `"mistral/mistral-small-latest"`
- Azure: `"azure/your-deployment-name"`
- Cohere: `"command-r-plus"`
See LiteLLM providers for complete list.
OpenAI-compatible endpoints¶
Prerequisites:
- Lumen AI installed in your Python environment
- Server endpoint URL of your OpenAI API-compliant server
- API key (if required by your endpoint)
Environment variables:
- `OPENAI_API_BASE_URL`: Your custom endpoint URL
- `OPENAI_API_KEY`: Your API key (if required)
Local Llama.cpp Server
For a local OpenAI-compatible server, see the Llama.cpp server documentation.
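A minimal sketch pointing Lumen at a compatible server via the environment variables above (the URL and key are placeholders; adjust to your server):

```python
import os
import lumen.ai as lmai

# Placeholders: set to your server's address and auth requirements.
os.environ["OPENAI_API_BASE_URL"] = "http://localhost:8080/v1"
os.environ["OPENAI_API_KEY"] = "sk-placeholder"

ui = lmai.ExplorerUI(data='penguins.csv')
ui.servable()
```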
Model types¶
Agent class names convert to model types automatically:
| Agent | Model type |
|---|---|
| SQLAgent | sql |
| VegaLiteAgent | vega_lite |
| ChatAgent | chat |
| AnalystAgent | analyst |
| AnalysisAgent | analysis |
| (others) | default |
Conversion rule: remove "Agent" suffix, convert to snake_case.
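For illustration, the conversion can be sketched like this (Lumen's internal implementation may differ in details):

```python
import re

def model_type(agent_class_name: str) -> str:
    # Strip the "Agent" suffix, then snake_case the remainder; the regex
    # splits only at lowercase/digit-to-uppercase boundaries, so acronyms
    # like "SQL" stay intact.
    base = agent_class_name.removesuffix("Agent")
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", base).lower()

assert model_type("SQLAgent") == "sql"
assert model_type("VegaLiteAgent") == "vega_lite"
assert model_type("AnalystAgent") == "analyst"
```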
Additional model types:
- `edit` - Used when fixing errors
- `ui` - Used for UI initialization
Troubleshooting¶
"API key not found" - Set environment variable or pass api_key= in Python.
Wrong model used - Model type names must be snake_case: "sql" not "SQLAgent".
High costs - Use gpt-5-mini or claude-haiku-4-5 for default, reserve gpt-5.2 or claude-sonnet-4-5 for critical tasks (sql, vega_lite, analyst).
Slow responses - Local models are slower than cloud APIs. Use cloud providers when speed matters.
Best practices¶
Use powerful models for critical tasks:
- `sql` - SQL generation needs strong reasoning
- `vega_lite` - Visualizations need design understanding
- `analyst` - Analysis needs statistical knowledge
Use efficient models elsewhere:
- `default` - Simple tasks work well with `gpt-5-mini` or `claude-haiku-4-5`
- `chat` - Conversation works with smaller models
Set temperature by task:
- 0.1 for SQL (deterministic)
- 0.3-0.4 for analysis and chat
- 0.5-0.7 for creative tasks
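Putting these guidelines together, a configuration might look like this sketch (OpenAI provider and model names taken from the examples above):

```python
import lumen.ai as lmai

model_config = {
    "default": {"model": "gpt-5-mini", "temperature": 0.3},  # efficient default
    "chat": {"model": "gpt-5-mini", "temperature": 0.4},     # natural conversation
    "sql": {"model": "gpt-5.2", "temperature": 0.1},         # deterministic SQL
    "vega_lite": {"model": "gpt-5.2", "temperature": 0.3},   # chart design
    "analyst": {"model": "gpt-5.2", "temperature": 0.3},     # statistical analysis
}
llm = lmai.llm.OpenAI(model_kwargs=model_config)
ui = lmai.ExplorerUI(data='penguins.csv', llm=llm)
ui.servable()
```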
Test before deploying:
- Different models behave differently. Test with real queries.