Analyses¶
Analyses are functions that run deterministically with code instead of LLM generation.
Like tools or hooks, analyses give you control over specific calculations. Use them when you need reliable, repeatable results—financial metrics, scientific formulas, or statistical tests that must be calculated exactly the same way every time.
Quick example¶
import lumen.ai as lmai
import pandas as pd
import panel as pn
class SummaryStats(lmai.Analysis):
"""Calculate summary statistics."""
columns = ["value"] # (1)!
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data
stats = df.describe()
return pn.widgets.Tabulator(stats)
# Pass analyses directly to ExplorerUI
ui = lmai.ExplorerUI(
data='data.csv',
analyses=[SummaryStats]
)
ui.servable()
- Analysis only runs when data has a "value" column
When the data has a "value" column, a suggestion button appears for the analysis. Users can click it or ask "Show me summary statistics" to run it.
When to use analyses¶
Use analyses for:
- Domain-specific calculations (wind speed from components, ROI formulas)
- Statistical tests (t-tests, ANOVA, correlation)
- Custom visualizations (domain-specific charts)
- Repeated calculations across datasets
Don't use analyses for:
- One-off queries (just ask in chat)
- Simple aggregations (SQL handles these)
- Generic operations (built-in agents cover common tasks)
Basic structure¶
import param
import lumen.ai as lmai
class MyAnalysis(lmai.Analysis):
"""What this analysis does."""
autorun = param.Boolean(default=True) # (1)!
columns = param.List(default=["col1", "col2"]) # (2)!
def __call__(self, pipeline, *args, **kwargs):
# Your calculation logic
df = pipeline.data
result = df["col1"] * df["col2"]
# Return any Panel viewable
return result.hvplot(kind="line")
- Whether to run automatically when conditions are met
- Required columns - analysis only applies when these exist
Complete example: Wind analysis¶
- Deterministic calculation - same inputs always give same outputs
- Only runs when data has time, u, and v columns
Now users can ask "Analyze wind patterns" and get both charts automatically.
Real-world example: Weather data explorer¶
For a complete real-world example, see the Weather Data AI Explorer tutorial. It demonstrates:
- Building a
SkewTAnalysisclass that creates atmospheric Skew-T diagrams - Using template overrides to add meteorological domain knowledge
- Combining analyses with custom agents for specialized applications
- Creating configurable parameters for analysis customization
The tutorial walks through building a domain-specific data exploration application from start to finish.
Control when analyses run¶
class ExpensiveAnalysis(lmai.Analysis):
"""Complex calculation."""
columns = param.List(default=["data"])
autorun = param.Boolean(default=False) # (1)!
def __call__(self, pipeline, *args, **kwargs):
# Expensive computation
return result
- Only runs when user explicitly requests it
Set autorun = False for analyses that take a long time or should only run on explicit request.
Add configurable parameters¶
import param
class ThresholdAnalysis(lmai.Analysis):
"""Filter data by threshold."""
columns = param.List(default=["value"])
threshold = param.Number(default=50) # (1)!
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data
filtered = df[df["value"] > self.threshold]
return filtered.hvplot(y="value", kind="hist")
- User can adjust the threshold value through UI controls
Multiple analyses¶
ui = lmai.ExplorerUI(
data='data.csv',
analyses=[
WindAnalysis,
TemperatureAnalysis,
PressureAnalysis,
]
)
ui.servable()
Lumen automatically picks the right analysis based on available columns and the user's question.
Common patterns¶
class CorrelationAnalysis(lmai.Analysis):
"""Show correlation between variables."""
columns = param.List(default=["x", "y"])
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data
correlation = df["x"].corr(df["y"])
plot = df.hvplot.scatter(
x="x", y="y",
title=f"Correlation: {correlation:.2f}"
)
return plot
class ROIAnalysis(lmai.Analysis):
"""Calculate return on investment."""
columns = param.List(default=["revenue", "cost"])
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data.copy()
df["roi"] = ((df["revenue"] - df["cost"]) / df["cost"]) * 100
return df.hvplot.hist(y="roi", title="ROI Distribution")
import pandas as pd
import panel as pn
class QualityCheck(lmai.Analysis):
"""Check for missing and duplicate data."""
columns = param.List(default=[]) # Works with any columns
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data
report = pd.DataFrame([{
"total_rows": len(df),
"missing_values": df.isnull().sum().sum(),
"duplicates": df.duplicated().sum(),
}])
return pn.widgets.Tabulator(report)
Return types¶
Analyses can return any Panel viewable:
def __call__(self, pipeline, *args, **kwargs):
df = pipeline.data
# HoloViews/hvPlot charts
return df.hvplot(x="time", y="value")
# Panel widgets
return pn.widgets.Tabulator(df)
# Matplotlib figures
fig, ax = plt.subplots()
ax.plot(df["x"], df["y"])
return pn.pane.Matplotlib(fig)
# Multiple components
return pn.Column(chart, table, summary)
Troubleshooting¶
Analysis never runs¶
Check that required columns exist in your data with exact names (case-sensitive).
def __call__(self, pipeline, *args, **kwargs):
print("Available columns:", pipeline.data.columns.tolist())
# Your logic
Analysis fails silently¶
Add error handling:
def __call__(self, pipeline, *args, **kwargs):
try:
df = pipeline.data
# Your logic
return result
except Exception as e:
return pn.pane.Alert(f"Analysis failed: {e}", alert_type="danger")
Wrong columns¶
Column names are case-sensitive. Check your data:
# If data has "Value" (capital V)
columns = param.List(default=["value"]) # ❌ Won't match
columns = param.List(default=["Value"]) # ✅ Matches
Best practices¶
Keep analyses focused. One analysis should perform one calculation or produce one type of output.
Declare required columns. Use param.List(default=[...]) to specify what columns your analysis needs.
Use param for configurability. Add param.Number, param.Integer, etc. for values users might want to adjust.
Return Panel viewables. Use pn.widgets.Tabulator, hvplot, pn.pane.Matplotlib, or pn.Column to combine multiple outputs.
Handle missing data. Check for NaN values and handle them appropriately in your calculations.
Test with real data. Different datasets may have different distributions or edge cases.
Document what it does. Write clear docstrings explaining the calculation and when to use it.
Working with global context¶
You can add domain knowledge that all agents see by setting template overrides on the base Actor class:
global_context = """
{{ super() }}
Domain knowledge:
- Inversions occur when temperature increases with altitude
- Standard lapse rate is 6.5°C per km
"""
lmai.actor.Actor.template_overrides = {"main": {"global": global_context}}
ui = lmai.ExplorerUI(
data='weather.csv',
analyses=[MyAnalysis]
)
ui.servable()
This adds the context to all agents' prompts, helping them understand your domain.
Per-agent customization with analyses¶
Combine analyses with agent-specific template overrides:
This creates a specialized meteorology assistant with domain-specific analyses and terminology.