Interactive Pipelines#
The plots built up over the first few tutorials were already highly interactive in the web browser, with interactive tools and events in each plot, and sometimes Bokeh widgets created by HoloViews automatically to select a specific categorical value from a groupby. However, when you are exploring a dataset, the automatic interactivity may not be enough: you may want to add interactive control over some other aspect of how the data is selected, generated, or visualized. Luckily, HoloViz makes it almost trivially easy to add any interactive behavior you want, so that you or your users can explore any parameter or setting in your code.
Though we haven’t yet needed to discuss it, the Bokeh widgets generated by hvPlot were configured using the Panel library from HoloViz, which focuses on widgets, layouts, and constructing standalone HoloViz applications. In this notebook, we’ll use Panel’s support for the “reactive expressions” from the underlying Param library to add just about any interactivity you can imagine. E.g. if you want to control filtering or selecting or aggregation in your Pandas or GeoPandas or Dask DataFrame, or Xarray Dataset, you simply wrap that object in pn.rx(). pn.rx() returns a reactive object (e.g. a reactive Pandas DataFrame) that can be used almost exactly the same as the original object (e.g. calling regular Pandas methods) but whose output (e.g. a DataFrame view) will be re-computed every time one of the reactive pipeline inputs changes. The inputs are typically widgets (e.g. a drop-down list) that replace values you would usually hard-code and manually update to observe how they affect the output. When such a reactive object is displayed in a notebook, it includes the widgets that you have used together with the regular output, and the output is automatically updated appropriately.
Panel widgets#
Before using rx() we will need some widgets. You can use ipywidgets if you want, but to keep things consistent with hvPlot’s widgets, here we will be using Bokeh widgets from Panel. Let’s first get ahold of a Panel widget to see how they work. Here, let’s create a Panel floating-point number slider to specify an earthquake magnitude between zero and nine:
import pathlib
import holoviews as hv
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
import panel as pn
pn.extension(throttled=True)
mag_slider = pn.widgets.FloatSlider(name='Minimum Magnitude', start=0, end=9, value=6)
mag_slider
The widget is a JavaScript object, but there are bidirectional connections between JS and Python that let us see and change the value of this slider using its value parameter:
mag_slider.value
6
mag_slider.value = 7
Exercise#
Try moving the slider around and rerunning the mag_slider.value cell above to access the current slider value. As you can see, you can easily get the value of any widget to use in subsequent cells, but you’d need to re-run any cell that accesses that value for it to get updated. Getting output to update automatically is where .rx comes in.
Panel .rx#
.rx expressions provide an easy way to connect widgets directly into an expression you want to control.
First, let’s read in our data:
%%time
df = pd.read_parquet(pathlib.Path('../data/earthquakes-projected.parq'))
df.index = df.index.tz_localize(None)
CPU times: user 2.15 s, sys: 284 ms, total: 2.44 s
Wall time: 1.72 s
Now, let’s do a little filtering that we might want to control with such a widget, such as selecting the highest-magnitude events:
WEB_MERCATOR_LIMITS = (-20037508.342789244, 20037508.342789244)
df2 = df[['mag', 'depth', 'latitude', 'longitude', 'place', 'type']][df['northing'] < WEB_MERCATOR_LIMITS[1]]
df2[df2['mag'] > 5].head()
time | mag | depth | latitude | longitude | place | type |
---|---|---|---|---|---|---|
2000-01-31 07:25:59.740 | 5.4 | 33.0 | 38.114 | 88.604 | southern Xinjiang, China | earthquake |
2000-01-29 08:13:10.730 | 5.4 | 60.7 | -8.633 | 111.137 | Java, Indonesia | earthquake |
2000-01-29 02:53:54.890 | 5.1 | 100.0 | 4.857 | 126.259 | Kepulauan Talaud, Indonesia | earthquake |
2000-01-28 22:57:51.700 | 5.6 | 83.4 | -9.691 | 118.764 | Sumbawa region, Indonesia | earthquake |
2000-01-28 22:42:26.250 | 5.5 | 10.0 | -1.347 | 89.083 | South Indian Ocean | earthquake |
What if, instead of ‘5’, we want the output above always to reflect the current value of mag_slider? It’s trivially easy to do that with .rx, passing in a widget almost anywhere we want in a pipeline:
rdf = pn.rx(df2)
rdf[rdf['mag'] > mag_slider].head()
(Compare the previous cell to the one before it, to see that 5 has been replaced with mag_slider.) pn.rx wraps around nearly any Python object (such as the Pandas DataFrame used here) while letting you provide Panel widgets almost anywhere you’d otherwise be using a number, string, list, tuple, or other Python literal or variable. Basically, if Panel has a widget for it, you can use it!
The way it works is that the rx object no longer immediately calls the methods and operations you apply to it; instead it stores a copy of your pipeline and dynamically replays the pipeline whenever that widget changes. rx supports just about any output you might want to get out of such a pipeline, such as text or numbers:
rdf[rdf['mag'] > mag_slider].shape
Or Matplotlib plots:
rdf[rdf['mag'] > mag_slider].plot(y='depth', kind='hist', bins=np.linspace(0, 50, 51), figsize=(6, 3))
Each time you drag the widget and release it (thanks to throttled=True above), the reactive expression replays the pipeline and updates the output shown.
Of course, .rx also supports .hvplot(), here with a new copy of a widget so that it will be independent of the other cells above:
mag_slider2 = pn.widgets.FloatSlider(name='Minimum magnitude', start=0, end=9, value=6)
rdf[rdf['mag'] > mag_slider2].hvplot(y='depth', kind='hist', bins=np.linspace(0, 50, 51))
You can see that the depth distribution varies dramatically as you vary the minimum magnitude, with the lowest-magnitude events apparently only detectable at shallow depths. There also seems to be some artifact at depth 10, which is the largest bin regardless of the filtering for all but the largest magnitudes.
Date widgets#
A .rx() pipeline can contain any number of widgets, including any from the Panel reference gallery. For instance, let’s make a widget to specify a date range covering the dates found in this data:
date = pn.widgets.DatetimeRangeSlider(name='Date', start=df.index[0], end=df.index[-1])
date
Now we can access the value of this slider:
date.value
(Timestamp('2000-01-31 23:52:00.619000'),
Timestamp('2018-12-01 00:00:13.284000'))
As this widget is specifying a range, this time the value is returned as a tuple. If you prefer, you can get the components of the tuple directly via the value_start and value_end parameters respectively:
f'Start is at {date.value_start} and the end is at {date.value_end}'
'Start is at 2000-01-31 23:52:00.619000 and the end is at 2018-12-01 00:00:13.284000'
Once again, try specifying different ranges with the widgets and rerunning the cell above.
Now let’s use this widget to expand our expression to filter by date as well as magnitude:
mag = pn.widgets.FloatSlider(name='Minimum magnitude', start=0, end=9, value=6)
filtered = rdf[
(rdf['mag'] > mag) &
(rdf.index >= date.param.value_start) &
(rdf.index <= date.param.value_end)]
filtered.head()
You can now use either the magnitude or the date range (or both) to filter the data, and the output will update. Note that here you want to move the start date of the range slider rather than the end; otherwise, you may not see the table change because the earthquakes are displayed in date order.
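The point about moving the start rather than the end of the range can be checked on a tiny example. The sketch below uses a handful of made-up events (hypothetical dates and magnitudes, assumed to be sorted by ascending date) and a hypothetical filtered_head helper that mimics filtering followed by .head(2):

```python
from datetime import datetime

# Hypothetical events in ascending date order: (time, magnitude).
events = [
    (datetime(2000, 1, 5), 6.4),
    (datetime(2005, 3, 28), 8.6),
    (datetime(2011, 3, 11), 9.1),
    (datetime(2015, 9, 16), 8.3),
    (datetime(2018, 1, 23), 7.9),
]


def filtered_head(min_mag, start, end, n=2):
    """Apply both filters, then keep the first n rows, like .head(n)."""
    kept = [(t, m) for t, m in events if m > min_mag and start <= t <= end]
    return kept[:n]


full = filtered_head(6, datetime(2000, 1, 1), datetime(2018, 12, 31))
earlier_end = filtered_head(6, datetime(2000, 1, 1), datetime(2016, 1, 1))
later_start = filtered_head(6, datetime(2006, 1, 1), datetime(2018, 12, 31))

print(full == earlier_end)  # moving the end leaves the first rows unchanged
print(full == later_start)  # moving the start changes which rows come first
```

Pulling the end date earlier only removes rows from the bottom of the filtered result, so the first rows shown stay the same, whereas moving the start date removes rows from the top.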
Exercise#
For readability, six columns were chosen before displaying the DataFrame. Have a look at df.columns and pick a different set of columns for display, and try filtering on one of them.
Functions as inputs#
Quite often the data structure you want to explore in a pipeline may itself be the outcome of another pipeline. It may for instance be a Pandas DataFrame created by extracting and transforming the output of a database or an API call, or it could be the dynamic output of some simulation or pre-processing. You can wrap function calls with pn.rx, but Panel provides a more explicit way to bind reactive arguments to functions, called pn.bind. With bind you can start with an arbitrary custom function that returns the data structure you want to explore, and then bind widgets to that function’s arguments. Then when one of those widgets changes, the function will get called to get the updated output.
To keep this example self-contained, we’ll illustrate this process using a simple function that filters the earthquakes dataset by event type and returns a DataFrame. Of course, this function could include any computation that returns a DataFrame, including selecting data files on disk or making a query to a database.
def input_function(event_type):
df2 = df[['mag', 'depth', 'latitude', 'longitude', 'place', 'type']]
return df2[df2['type'] == event_type]
We can then create a Panel Select widget with a few options and bind it to the input_function. Calling .rx() on the bound object is what allows it to be used in a reactive pipeline, as we previously did with rdf.
event_types = pn.widgets.Select(options=['earthquake', 'quarry blast', 'explosion', 'ice quake'])
inputrx = pn.bind(input_function, event_types).rx()
inputrx[inputrx['mag'] > mag].head(2)
.rx() and HoloViews#
.rx() lets you work naturally with the compositional HoloViews plots provided by .hvplot(). Here, let’s combine such plots using the HoloViews + operator:
mag_hist = filtered.hvplot(y='mag', kind='hist', width=300)
depth_hist = filtered.hvplot(y='depth', kind='hist', width=300)
mag_hist + depth_hist
These are the same two histograms we saw earlier, but now we can filter them on data dimensions like time that aren’t even explicitly shown in the plot, using the Panel widgets.
Filtering earthquakes on a map#
To display the earthquakes on a map, we will first create a subset of the data to make it quick to update without needing Datashader:
subset_df = df[
(df.northing < WEB_MERCATOR_LIMITS[1]) &
(df.mag > 4) &
(df.index >= '2017-01-01') &
(df.index <= '2018-01-01')]
Now we can make a new reactive DataFrame from this new subselection:
subset = pn.rx(subset_df)
And now we can declare our widgets and use them to filter the reactive DataFrame as before:
date_subrange = pn.widgets.DatetimeRangeSlider(
name='Date', start=subset_df.index[0], end=subset_df.index[-1])
mag_subrange = pn.widgets.FloatSlider(name='Magnitude', start=3, end=9, value=3)
filtered_subrange = subset[
(subset.mag > mag_subrange) &
(subset.index >= date_subrange.param.value_start) &
(subset.index <= date_subrange.param.value_end)]
Now we can plot the earthquakes on an ESRI tilesource, including the filtering widgets as follows:
geo = filtered_subrange.hvplot(
'easting', 'northing', color='mag', kind='points',
xaxis=None, yaxis=None, responsive=True, min_height=500, tiles='ESRI')
geo
Terminating reactive expressions#
The examples above all illustrate cases where you can display the output of .rx() and not worry about its type, since eventually you’ll get an object of the type you expect (a number, Matplotlib plot, HoloViews object, etc.) when it comes time to display something. But sometimes the actual return type from your expression does matter, and until it’s displayed, a reactive expression is an rx object rather than a DataFrame or a HoloViews object or whatever your expression evaluates to:
type(geo)
param.reactive.rx
Panel will always render the rx object dynamically, but because a reactive expression could return literally anything in Python, such rendering is not always efficient. What if only a small part of the result is changing, and you want only that bit updated, not redrawing the entire figure?
Or what if you need to work with some part of the reactive pipeline for your own purposes, e.g. to feed it to some function or object that does not understand rx objects?
For both of these reasons you may want to resolve the expression into an underlying object, which we call “termination” of the expression.
For instance, let’s create magnitude and depth histograms on this subset of the data as in an earlier notebook and see if we can enable linked selections on them:
mag_subhist = filtered_subrange.hvplot(y='mag', kind='hist', responsive=True, min_height=200)
depth_subhist = filtered_subrange.hvplot(y='depth', kind='hist', responsive=True, min_height=200)
combined = mag_subhist + depth_subhist
pn.Row(combined, height=400)
Note that this looks like a HoloViews layout with some widgets, because the expression evaluates to such a layout when displayed, but this object is not a HoloViews object. Instead it is still an rx object:
type(combined)
param.reactive.rx
link_selections does not currently understand rx objects, and so it will raise an exception when given one. Luckily, a HoloViews DynamicMap does understand rx, so if we need a HoloViews Layout for calling link_selections, we can build a layout from the constituent objects by passing each expression to a HoloViews DynamicMap:
layout = hv.DynamicMap(mag_subhist) + hv.DynamicMap(depth_subhist)
layout
This is now a HoloViews object, so we can use it with link_selections:
print(type(layout))
ls = hv.link_selections.instance()
ls(hv.DynamicMap(mag_subhist) + hv.DynamicMap(depth_subhist))
<class 'holoviews.core.layout.Layout'>
You can use the box selection tool to see how selections compare between these plots. However, you will note that the separate widgets are not automatically displayed for terminated expressions like this. To address this, we can obtain the widgets from the Panel ReactiveExpr pane:
pn.param.ReactiveExpr(filtered_subrange).widgets
Lastly, we can also pass expressions directly to Panel components, assuming the value returned by the expression matches the type expected by Panel. As an example, we can compute the maximum magnitude of our filtered data and pass it to a Number indicator:
pn.indicators.Number(value=filtered_subrange.mag.max(), name='Magnitude')
Similarly, we could pass an expression returning a DataFrame to the Tabulator widget or an expression returning a Matplotlib plot to a Matplotlib pane. Having a particular component resolve our expressions allows Panel to efficiently re-render the component.
Conclusion#
Using the techniques above, you can build up a collection of plots and other outputs with Panel widgets to control individual bits of computation and display.
What if you want to collect these pieces and put them together into a standalone app or dashboard? The next tutorial will show you how to do so!