Release announcement for hvPlot 0.11, including: DuckDB integration, automatic lat/lon conversion on tiled maps, subcoordinate-y axis support, and more!
Author
Maxime Liquet
Published
September 27, 2024
What is hvPlot?
hvPlot is an open-source library that offers powerful high-level functionality for data exploration and visualization that doesn’t require you to learn a new API. You can get powerful interactive and compositional Bokeh, Matplotlib, or Plotly plots by simply replacing .plot with .hvplot. hvPlot makes all the analytical power of the HoloViz ecosystem available, using the APIs you already know.
New release!
We are very pleased to announce the 0.11 release of hvPlot! We’ll describe the main changes, including:
New integration: DuckDB!
Automatic latitude/longitude conversion when displaying a tile map
Support for displaying subcoordinate y-axis
New hover options: hover_tooltips and hover_formatters
Optimized Pandas index support
Fixing “No output in jupyter”
Update of the minimum version of the dependencies
As usual the full change log is available on GitHub.
Many thanks to @Azaya89, @liufeimath and @philipc2 for their first contributions, to @iuryt for contributing again, and to the maintainers @ahuang11, @hoxbro, @maximlt and @philippjfr!
You can install hvPlot with pip install hvplot, or with conda install hvplot (or conda install conda-forge::hvplot) if you are using Anaconda.
🌟 An easy way to support hvPlot is to give it a star on GitHub! 🌟
New integration: DuckDB!
Install DuckDB with pip install duckdb or conda install conda-forge::python-duckdb and import hvplot.duckdb to enable the integration. .hvplot() supports DuckDBPyRelation and DuckDBPyConnection objects. In the example below, we create a DuckDB in-memory connection (from a Pandas DataFrame to make it simple) and just plot it with .hvplot.line(...).
DuckDBPyRelation is a bit more optimized because it handles column subsetting directly within DuckDB before the data is converted to a pd.DataFrame. So, it’s a good idea to use the connection.sql() method when possible, which gives you a DuckDBPyRelation, instead of connection.execute(), which returns a DuckDBPyConnection.
sql_expr = "SELECT * FROM example_view WHERE A > 0 AND B > 0"
connection.sql(sql_expr).hvplot.line(y=['A', 'B'], hover_cols=["C"])  # subsets A, B, C
Automatic latitude/longitude conversion when displaying a tile map
A common situation when dealing with geographic data is to have the data expressed as latitude/longitude, typically GPS coordinates (e.g. (52.520008°, 13.404954°) for Berlin). To display this data on a tile map (think Google Maps), it needs to be projected to Pseudo-Mercator, the de facto standard projection for Web mapping applications (e.g. (6894701.26m, 1492232.65m) for Berlin). Up until this release, you could perform that projection by:
projecting the data yourself with a utility available in HoloViews (from holoviews.util.transform import lon_lat_to_easting_northing)
With this release, when you set tiles, hvPlot projects latitude/longitude (EPSG:4326 / WGS84) to easting/northing (EPSG:3857 / Pseudo-Mercator) coordinates without additional package dependencies, provided it detects that the values fall within the expected latitude/longitude ranges. This automatic projection can be disabled with projection=False. Find out more in the Geographic Guide.
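To make the projection concrete, here is a minimal sketch of the standard spherical Web Mercator formulas. It is not hvPlot's internal implementation, and the function name lon_lat_to_web_mercator is hypothetical. Applied to the Berlin coordinates quoted above, it yields values close to the easting/northing pair given in the text.

```python
import math

# Sphere radius used by EPSG:3857 (equal to the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def lon_lat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to (easting, northing) in meters."""
    easting = EARTH_RADIUS_M * math.radians(lon_deg)
    northing = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return easting, northing


# Berlin: latitude 52.520008°, longitude 13.404954°
easting, northing = lon_lat_to_web_mercator(13.404954, 52.520008)
print(f"easting={easting:.2f} m, northing={northing:.2f} m")
```

Note the argument order: the function takes longitude first, whereas the coordinates in the text are written as (latitude, longitude) and (northing, easting).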
Support for displaying subcoordinate y-axis
hvPlot enables you to create overlays where each element has its own distinct y-axis subcoordinate system (added in HoloViews 1.18.0). To activate this feature, which automatically distributes overlay elements along the y-axis, set the subcoordinate_y keyword to True. This is particularly useful for analysing multiple timeseries.
import numpy as np
import hvplot.pandas  # noqa
from bokeh.sampledata.sea_surface_temperature import sea_surface_temperature as sst

sst = sst.assign(locations=np.random.choice(['loc1', 'loc2', 'loc3', 'loc4'], size=len(sst)))
sst.head(2)
time                       temperature  locations
2016-02-15 00:00:00+00:00  4.929        loc3
2016-02-15 00:30:00+00:00  4.887        loc4
sst.hvplot(by='locations', subcoordinate_y=True)
Try zooming in on the plot above: the y-axis wheel-zoom applies to each curve’s respective sub-coordinate y-axis, rather than to the global coordinate frame.
subcoordinate_y also accepts a dictionary of related options. For example, set subcoordinate_y={'subcoordinate_scale': 2} to increase the scale of each sub-plot so that each curve’s vertical range overlaps 50% with its adjacent elements, which allows creating simple ridge plots. Let us know in this GitHub issue if you’d be interested in a more extensive API to generate ridge plots.
New hover options: hover_tooltips and hover_formatters
The hover_tooltips and hover_formatters keywords have been added to complement hover and hover_cols. To customize the Bokeh hover tool, hvPlot users previously had to import and configure the HoverTool model from Bokeh’s API. With these two new options, added in HoloViews 1.19.0, you can now customize the hover tool directly, without any additional import. Find out more about the values accepted by these options in HoloViews’ Plotting with Bokeh guide.
Optimized Pandas index support
HoloViews 1.19.0 came with optimizations to how Pandas DataFrame indexes are handled: it no longer calls .reset_index() internally, which was affecting memory usage and speed. Following HoloViews, hvPlot’s code base was adapted accordingly, making sure that in most cases .reset_index() is not called. This also improved the handling of wide datasets. No pretty plot for this enhancement 😊 But it’s a change that touched some deeper parts of the two code bases, so we wanted everyone to be aware of it and to report any issues.
Fixing “No output in jupyter”
An unfortunately too common issue when using hvPlot in a Jupyter Notebook was that sometimes the plots would just not show up no matter how hard you tried, even after following the classic procedure: notebook cleaning + notebook saving + browser tab hard refresh 😔 The import hvplot.<integration> mechanism is a convenient way to spare users from having to load the HoloViews/Panel extensions themselves (e.g. hv.extension('bokeh')). However, since Python imports are cached, only the first import actually embeds the extension JavaScript code. If you re-run the cell(s) containing import hvplot.pandas (or some other integration), the JavaScript is no longer available, and on subsequent reloads/re-runs of the notebook plots may not appear.
In this release, hvPlot adds an IPython hook which simply deletes the imported modules before every cell execution. This is a big hammer but the best we could find! Don’t hesitate to provide us feedback if you encounter any issue related to this change.
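The general technique can be sketched as follows. This illustrates the idea only and is not hvPlot's actual code: the helper names purge_modules and install_hook are hypothetical, and the sketch relies on IPython's pre_run_cell event to drop cached modules so that the next import statement executes the module again.

```python
import sys


def purge_modules(prefix):
    """Remove `prefix` and its submodules from the import cache.

    Returns the list of module names that were removed.
    """
    removed = [
        name for name in list(sys.modules)
        if name == prefix or name.startswith(prefix + ".")
    ]
    for name in removed:
        del sys.modules[name]
    return removed


def install_hook(prefix):
    """Purge `prefix` before every cell execution; return True if registered."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False  # not running inside an IPython session

    def _pre_run_cell(info=None):
        purge_modules(prefix)

    shell.events.register("pre_run_cell", _pre_run_cell)
    return True


# Demonstration with a standard-library module outside of any notebook.
import colorsys  # noqa: E402,F401

print(purge_modules("colorsys"))
```

After the purge, a new import colorsys statement executes the module from scratch, which is the behaviour needed to re-embed the extension JavaScript on every re-run of an import cell.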
Update of the minimum version of the dependencies
This regular maintenance practice had not been done in a while. Most notably, hvPlot now depends on holoviews>=1.19.0 (compared to >=1.11.0 previously) to ensure its users benefit from all the new features HoloViews has made available in recent years. Additionally, hvPlot 0.11 requires Python 3.9 or above. For more details, check the diff of the Pull Request that implemented this change.