Plotting made easy with hvPlot: 0.11 release

release
hvplot
Release announcement for hvPlot 0.11, including: DuckDB integration, automatic lat/lon conversion on tiled maps, subcoordinate-y axis support, and more!
Author

Maxime Liquet

Published

September 27, 2024

What is hvPlot?

hvPlot is an open-source library that offers powerful high-level functionality for data exploration and visualization that doesn’t require you to learn a new API. You can get powerful interactive and compositional Bokeh, Matplotlib, or Plotly plots by simply replacing .plot with .hvplot. hvPlot makes all the analytical power of the HoloViz ecosystem available, using the APIs you already know.

New release!

We are very pleased to announce the 0.11 release of hvPlot! We’ll describe the main changes, including:

  • New integration: DuckDB!
  • Automatic latitude/longitude conversion when displaying a tile map
  • Support for displaying subcoordinate y-axis
  • New hover options: hover_tooltips and hover_formatters
  • Optimized Pandas index support
  • Fixing “No output in jupyter”
  • Update of the minimum version of the dependencies

As usual the full change log is available on GitHub.

Many thanks to @Azaya89, @liufeimath and @philipc2 for their first contributions, to @iuryt for contributing again, and to the maintainers @ahuang11, @hoxbro, @maximlt and @philippjfr!


You can install hvPlot with pip install hvplot, or with conda install hvplot (or conda install conda-forge::hvplot) if you are using Anaconda.


🌟 An easy way to support hvPlot is to give it a star on GitHub! 🌟

New integration: DuckDB!

hvPlot has added DuckDB to the long list of libraries it integrates with. Thanks Andrew!

Install DuckDB with pip install duckdb or conda install conda-forge::python-duckdb and import hvplot.duckdb to enable the integration. .hvplot() supports DuckDB DuckDBPyRelation and DuckDBPyConnection objects. In the example below, we create a DuckDB in-memory connection, build a relation from a Pandas DataFrame to keep things simple, and plot it with .hvplot.line(...).

import duckdb
import numpy as np
import pandas as pd
import hvplot.duckdb  # noqa 

df_pandas = pd.DataFrame(np.random.randn(1000, 4), columns=list('ABCD')).cumsum()
connection = duckdb.connect(':memory:')
relation = duckdb.from_df(df_pandas, connection=connection)
relation.to_view("example_view");
relation.describe()
┌─────────┬─────────────────────┬────────────────────┬────────────────────┬────────────────────┐
│  aggr   │          A          │         B          │         C          │         D          │
│ varchar │       double        │       double       │       double       │       double       │
├─────────┼─────────────────────┼────────────────────┼────────────────────┼────────────────────┤
│ count   │              1000.0 │             1000.0 │             1000.0 │             1000.0 │
│ mean    │ -10.924991575731047 │ 0.8319325462565168 │ 31.451518655996168 │ 12.928098370709069 │
│ stddev  │  11.870831403141286 │  9.514072462296758 │ 10.788623199269546 │ 16.225677822936902 │
│ min     │   -38.4522539834605 │ -25.74895137984497 │ 0.9040969837197796 │  -15.6946781879896 │
│ max     │   9.983881228798655 │ 17.031471683448828 │   49.7826331906542 │  38.97909613503948 │
│ median  │   -8.13940182008442 │ 3.0560169324925086 │ 34.386376905116876 │ 15.738919164749506 │
└─────────┴─────────────────────┴────────────────────┴────────────────────┴────────────────────┘
relation.hvplot.line(y=['A', 'B', 'C', 'D'])

DuckDBPyRelation is a bit more optimized because it handles column subsetting directly within DuckDB before the data is converted to a pd.DataFrame. So, it’s a good idea to use the connection.sql() method when possible, which gives you a DuckDBPyRelation, instead of connection.execute(), which returns a DuckDBPyConnection.

sql_expr = "SELECT * FROM example_view WHERE A > 0 AND B > 0"
connection.sql(sql_expr).hvplot.line(y=['A', 'B'], hover_cols=["C"])  # subsets A, B, C

Automatic latitude/longitude conversion when displaying a tile map

A pretty common situation when dealing with geographic data is to have the data expressed in terms of latitude/longitude, typically GPS coordinates (e.g. (52.520008°, 13.404954°) for Berlin). To display this data on a tile map (think Google Maps), it needs to be projected to the Pseudo-Mercator projection, the de facto standard for Web mapping applications (e.g. (6894701.26m, 1492232.65m) for Berlin). Up until this release, you could perform that projection by:

  • installing GeoViews and setting geo=True, or
  • projecting the data yourself with a utility available in HoloViews (from holoviews.util.transform import lon_lat_to_easting_northing)

With this release, when you set tiles, hvPlot projects latitude/longitude (EPSG:4326 / WGS84) to easting/northing (EPSG:3857 / Pseudo-Mercator) coordinates without additional package dependencies, provided it detects that the values fall within the expected latitude/longitude ranges. This automatic projection can be disabled with projection=False. Find out more in the Geographic Guide.

import hvplot.pandas  # noqa
from bokeh.sampledata.airport_routes import airports

airports.head(2)
   AirportID                        Name           City        Country  IATA  ICAO   Latitude    Longitude  Altitude  Timezone  DST                 TZ     Type       source
0       3411  Barter Island LRRS Airport  Barter Island  United States   BTI  PABA  70.134003  -143.582001         2        -9    A  America/Anchorage  airport  OurAirports
1       3413  Cape Lisburne LRRS Airport  Cape Lisburne  United States   LUR  PALU  68.875099  -166.110001        16        -9    A  America/Anchorage  airport  OurAirports
airports.hvplot.points('Longitude', 'Latitude', tiles=True, color='red', alpha=0.2)
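For readers curious about what the projection involves, here is a minimal standard-library sketch of the WGS84 to Pseudo-Mercator formula, together with a simple range check. The function names are hypothetical, and the range check is only an assumption about what "expected latitude/longitude ranges" means; hvPlot's actual detection logic may differ.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by EPSG:3857


def looks_like_lon_lat(lons, lats):
    """Heuristic range check (an assumption, not hvPlot's exact logic)."""
    return (all(-180 <= lon <= 180 for lon in lons)
            and all(-90 <= lat <= 90 for lat in lats))


def lon_lat_to_web_mercator(lon, lat):
    """Project one WGS84 point (degrees) to easting/northing (metres).

    Not defined at the poles, where the northing diverges.
    """
    easting = EARTH_RADIUS_M * math.radians(lon)
    northing = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat) / 2)
    )
    return easting, northing


berlin_lon, berlin_lat = 13.404954, 52.520008
if looks_like_lon_lat([berlin_lon], [berlin_lat]):
    easting, northing = lon_lat_to_web_mercator(berlin_lon, berlin_lat)
    print(f"easting={easting:.2f} m, northing={northing:.2f} m")
```

The printed values should land close to the Berlin coordinates quoted earlier, which are listed there in (northing, easting) order to mirror (latitude, longitude).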

Support for displaying subcoordinate y-axis

hvPlot enables you to create overlays where each element has its own distinct y-axis subcoordinate system (added in HoloViews 1.18.0). To activate this feature, which automatically distributes overlay elements along the y-axis, set the subcoordinate_y keyword to True. This feature is particularly useful for analysing multiple timeseries.

import numpy as np
import hvplot.pandas  # noqa
from bokeh.sampledata.sea_surface_temperature import sea_surface_temperature as sst

sst = sst.assign(locations=np.random.choice(['loc1', 'loc2', 'loc3', 'loc4'], size=len(sst)))
sst.head(2)
                           temperature  locations
time
2016-02-15 00:00:00+00:00        4.929       loc3
2016-02-15 00:30:00+00:00        4.887       loc4
sst.hvplot(by='locations', subcoordinate_y=True)

Try zooming in the plot above: the y-axis wheel-zoom applies to each curve’s respective sub-coordinate y-axis rather than to the global coordinate frame.

subcoordinate_y also accepts a dictionary of related options. For example, set subcoordinate_y={'subcoordinate_scale': 2} to increase the scale of each sub-plot so that each curve’s vertical range overlaps 50% with its adjacent elements, which allows creating simple ridge plots. Let us know in this GitHub issue if you’d be interested in a more extensive API to generate ridge plots.

df = pd.DataFrame({'value': np.random.randn(200), 'cat': list("ABCD") * 50})
df['value'] += df['cat'].map(ord)
df.hvplot.kde(by='cat', y='value', subcoordinate_y={'subcoordinate_scale': 1.5}, legend=False, color="gray", hover=False)

More information about subcoordinate y-axis plots can be found in HoloViews’ customizing plots guide and in its gallery.

New hover options: hover_tooltips and hover_formatters

The hover_tooltips and hover_formatters keywords have been added to complement hover and hover_cols. To customize the Bokeh hover tool, hvPlot users previously had to import and configure the HoverTool model from Bokeh’s API. With these two new options, added in HoloViews 1.19.0, you can now customize the hover tool directly without any additional import. Find out more about the values accepted by these options in HoloViews’ Plotting with Bokeh guide.

import hvplot.pandas  # noqa
from bokeh.sampledata.periodic_table import elements

elements.head(2)
   atomic number  symbol      name  atomic mass      CPK  electronic configuration  electronegativity  atomic radius  ion radius  van der Waals radius  ...     EA  standard state  bonding type  melting point  boiling point  density      metal  year discovered  group  period
0              1       H  Hydrogen      1.00794  #FFFFFF                       1s1                2.2           37.0         NaN                 120.0  ...  -73.0             gas      diatomic           14.0           20.0  0.00009   nonmetal             1766      1       1
1              2      He    Helium     4.002602  #D9FFFF                       1s2                NaN           32.0         NaN                 140.0  ...    0.0             gas        atomic            NaN            4.0  0.00000  noble gas             1868     18       1

2 rows × 21 columns

elements.sort_values('metal').hvplot.points(
    'electronegativity', 'density', by='metal',
    hover_cols=['name', 'symbol', 'CPK'],
    hover_tooltips=[
        'name',
        ('Symbol', '@symbol'),
        ('CPK', '$color[hex, swatch]:CPK'),
        ('Density',  '@density{%.2e}'),
    ],
    hover_formatters={
        '@{density}': 'printf',
    }
)

Optimized Pandas index support

HoloViews 1.19.0 came with optimizations around how Pandas DataFrame indexes are handled: it no longer calls .reset_index() internally, which was hurting memory usage and speed. Following HoloViews, hvPlot’s code base was adapted accordingly, making sure that in most cases .reset_index() is not called. This also improved the handling of wide datasets. No pretty plot for this enhancement 😊 But the change touched some deeper parts of the two code bases, so we wanted everyone to be aware of it and to report any issues.

Fixing “No output in jupyter”

An unfortunately too common issue when using hvPlot in a Jupyter Notebook was that plots would sometimes just not show up no matter how hard you tried, even after following the classic procedure: notebook cleaning + notebook saving + browser tab hard refresh 😔 The import hvplot.<integration> mechanism is a convenient way to let users avoid running the HoloViews/Panel extensions themselves (e.g. hv.extension('bokeh')). However, since Python imports are cached, only the first import actually embeds the extension JavaScript code. If you re-run the cell(s) containing import hvplot.pandas (or some other integration), the JavaScript is no longer available, and on subsequent reloads/re-runs of the notebook plots may not appear.

In this release, hvPlot adds an IPython hook which simply deletes the imported modules before every cell execution. This is a big hammer but the best we could find! Don’t hesitate to provide us feedback if you encounter any issue related to this change.
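The caching behaviour behind this issue can be reproduced with the standard library alone. The snippet below is a minimal sketch, not hvPlot's actual hook: fake_integration is a hypothetical module whose import-time side effect (appending to a log file) stands in for embedding the extension JavaScript, and removing it from sys.modules stands in for what a pre-cell hook could do.

```python
import importlib
import sys
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
log_file = workdir / "loads.log"

# Hypothetical stand-in for an integration module: its body has a side effect.
(workdir / "fake_integration.py").write_text(
    "from pathlib import Path\n"
    "with open(Path(__file__).with_name('loads.log'), 'a') as f:\n"
    "    f.write('x')\n"
)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()


def load_count():
    """Number of times the module body has actually been executed."""
    return len(log_file.read_text()) if log_file.exists() else 0


importlib.import_module("fake_integration")  # first import: body runs
first = load_count()

importlib.import_module("fake_integration")  # cached: body does not run again
second = load_count()

del sys.modules["fake_integration"]          # forget the cached module
importlib.import_module("fake_integration")  # body runs again
third = load_count()

print(first, second, third)
```

The second import is served from the cache in sys.modules, so the side effect only happens again once the cached entry has been removed.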

Update of the minimum version of the dependencies

This regular maintenance practice had not been done in a while. Most notably, hvPlot now depends on holoviews>=1.19.0 (compared to >=1.11.0 previously) to ensure its users benefit from all the new features HoloViews has made available in recent years. Additionally, hvPlot 0.11 requires Python 3.9 or above. For more details, check the diff of the Pull Request that implemented this change.
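If you want to check an environment against these two minimums before upgrading, a small standard-library script is enough. This is a sketch under simplifying assumptions: version_tuple, meets_minimum and check_environment are hypothetical helpers, and the parser is deliberately naive (it keeps only the leading numeric components, so a pre-release such as 1.19.0rc1 is treated like 1.19.0).

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)
MIN_HOLOVIEWS = (1, 19, 0)


def version_tuple(version, length=3):
    """Naive parser: keep the leading numeric components, padded with zeros."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    parts += [0] * (length - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum tuple."""
    return version_tuple(installed) >= minimum


def check_environment():
    """Report whether Python and HoloViews satisfy the minimums above."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    try:
        report["holoviews"] = meets_minimum(
            metadata.version("holoviews"), MIN_HOLOVIEWS
        )
    except metadata.PackageNotFoundError:
        report["holoviews"] = None  # HoloViews is not installed
    return report


print(check_environment())
```

In practice pip and conda enforce these constraints at install time, so a check like this is mainly useful for diagnosing an existing environment.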


Join us on GitHub, Discourse or Discord to help us improve hvPlot. Happy plotting 😊
