{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Geographic Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import hvplot.pandas # noqa\n", "import hvplot.xarray # noqa\n", "import xarray as xr\n", "\n", "from bokeh.sampledata.airport_routes import airports" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "hvPlot can display geographic data and can be used to add contextualisation layers:\n", "\n", "- Overlay the data over a tiled web map\n", "- Overlay the data over geographic/administrative features\n", "- Plot the data in a specific projection\n", "\n", "Many, but not all, of these features require [GeoViews](https://geoviews.org) to be installed, which is the geographic extension of HoloViews built on [Cartopy](https://scitools.org.uk/cartopy) and [pyproj](https://pyproj4.github.io/pyproj). You can install GeoViews with `pip install geoviews` or `conda install geoviews`.\n", "\n", "Before plotting geographical data, you should know in which [coordinate system](https://en.wikipedia.org/wiki/Spatial_reference_system) the data is represented. This is important for a few reasons:\n", "\n", "- The data may have to be [projected](https://en.wikipedia.org/wiki/Map_projection) to another coordinate system, and to do so you may have to provide the original coordinate system.\n", "- Projecting data can be a computationally expensive operation and you should be aware of when this operation happens, to perhaps optimize it.\n", "\n", "Here are two common coordinate systems you will encounter when dealing with geographic data and using hvPlot:\n", "\n", "- Data represented by latitude/longitude (lat/lon) coordinates are often tied to the WGS84 geographic coordinate system ([EPSG:4326](https://epsg.io/4326)), that's how GPS data are located on the Earth. For example, the coordinates of Paris in this system are `(lat=48.856697, lon=2.351462)`.\n", "- Tiled web maps are displayed in the Web Mercator projection (also known as Pseudo-Mercator or Google Mercator, [EPSG:3857](https://epsg.io/3857)). Unlike WGS84 which expresses coordinates in lat/long decimal degrees, Web Mercator expresses them in easting (X) /northing (Y) metric units. For example, the coordinates of Paris in this system are `(easting=261763.552, northing=6250580.761)`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Supported plot types\n", "\n", "Only certain hvPlot types support geographic coordinates, currently including: `points`, `polygons`, `paths`, `image`, `quadmesh`, `contour`, and `contourf`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tiled web map\n", "\n", "A [tiled web map](https://en.wikipedia.org/wiki/Tiled_web_map), or tile map, is an interactive map displayed on a web browser that is divided into small, pre-rendered image tiles, allowing for efficient loading and seamless navigation across various zoom levels and geographic areas. hvPlot allows to add a tile map as a basemap to a plot with the `tiles` parameter. Importantly, `tiles` is a parameter that can be used **without installing GeoViews**.\n", "\n", "We'll display this dataframe of all US airports (including military bases overseas), the points are expressed in latitude/longitude coordinates:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll first start by displaying the airports **without GeoViews** with tiles by setting `tiles=True`. \n", "\n", "Under the hood, hvPlot projects lat/lon to easting/northing ([EPSG:4326](https://epsg.io/4326) to [EPSG:3857](https://epsg.io/3857)) coordinates without additional package dependencies if it detects that the values falls within expected lat/lon ranges, **unless the data is lazily loaded (dask / ibis).**\n", "\n", "Note, **this feature is only available after `hvplot>=0.11.0`**; older versions, `hvplot<0.11.0`, require manual projection (see below)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.hvplot.points('Longitude', 'Latitude', tiles=True, color='red', alpha=0.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you do not want it to be auto-projected under the hood for any reason, set `projection=False`, **but note it will not be properly placed on the tiles** without manual intervention. In doing so, the data will be plotted around the [null island location](https://en.wikipedia.org/wiki/Null_Island), roughly 600km off the coast of West Africa. `padding` was added to demonstrate this." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.hvplot.points('Longitude', 'Latitude', tiles=True, projection=False, color='red', alpha=0.2, padding=10000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To manually project use the `lon_lat_to_easting_northing` function provided by HoloViews, that doesn't require installing any of the usual geo dependencies like `pyproj` or `cartopy`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from holoviews.util.transform import lon_lat_to_easting_northing\n", "\n", "airports['x'], airports['y'] = lon_lat_to_easting_northing(airports.Longitude, airports.Latitude)\n", "airports[['Latitude', 'Longitude', 'x', 'y']].head(3)\n", "airports.hvplot.points('x', 'y', tiles=True, color='red', alpha=0.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When **GeoViews** is installed, you can set `geo=True` to let hvPlot know that it can leverage it to display the data. You can directly plot the airports without having to project the data yourself, GeoViews will take care of projecting it to Web Mercator automatically." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.hvplot.points(\n", " 'Longitude', 'Latitude', geo=True, tiles=True, color='red', alpha=0.2\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The easiest ways to set `tiles` include:\n", "\n", "- to `True` to get the default layer which currently is from OpenStreetMap (prefer one of the explicit approaches below if you care about reproducibility)\n", "- with an `xyzservices.TileProvider` instance that can be created from the optional dependency [xyzservices](https://xyzservices.readthedocs.io/) which gives you access to hundreds of tiled web maps\n", "- with the name of one the layers shipped by HoloViews and GeoViews (when it's installed), like `'EsriTerrain'`.\n", "\n", "The tile map can be additionally configured by setting `tiles_opts` with a dictionary of options that will be applied to the basemap layer." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import xyzservices.providers as xyz" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.hvplot.points(\n", " 'x', 'y', tiles=xyz.Esri.WorldPhysical, tiles_opts={'alpha': 0.5}, color='red', alpha=0.2\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data and plot projections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cartopy.crs as ccrs " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we have seen in the previous section, you can set `geo=True` to let hvPlot know that your data is geographic. GeoViews **must be installed** when setting `geo=True`, or when any of the other geographic options is set, except `tiles`. In this mode hvPlot:\n", "\n", "1. Assumes the data is expressed in lat/lon coordinates. Internally, hvPlot assumes a [Plate Carrée](https://en.wikipedia.org/wiki/Equirectangular_projection) projection, also known as the *equirectangular projection*.\n", "2. Displays the data in the projection *Plate Carrée* assumed in step (1), or, if `tiles=True`, projects and displays the data in the Web Mercator projection.\n", "\n", "The *input* data projection assumed in step (1), and the *output* plot projection used in step (2) can both be overridden with the `crs` (for Coordinate Reference System) and `projection` parameters, respectively. Both parameters accept cartopy `CRS` objects (see the full list of values they accept in their definition). You can now see that that setting `geo=True, tiles=True` is strictly equivalent to setting `crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR`! Coordinate reference systems are described in more details in the [GeoViews documentation](https://geoviews.org/user_guide/Projections.html) and the full list of available CRSs is in the [cartopy documentation](https://scitools.org.uk/cartopy/docs/latest/reference/projections.html).\n", "\n", "For example, we can set `projection` to display the airports on an Orthographic projection centered over North America." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airports.hvplot.points(\n", " 'Longitude', 'Latitude', color='red', alpha=0.2,\n", " coastline=True, projection=ccrs.Orthographic(-90, 30)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Raster data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "air_ds = xr.tutorial.open_dataset('air_temperature').load()\n", "air_ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a `quadmesh` rather than an `image`. As you can see below, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "air_ds.isel(time=0).hvplot.quadmesh('lon', 'lat', 'air', projection=ccrs.LambertConformal())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.\n", "\n", "The first is using the `rasterize` or `datashade` options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser. One thing to note when using these operations is that it may be necessary to project the data **before** rasterizing it, e.g. to address wrapping issues. To do this provide `project=True`, which will project the data before it is rasterized (this also works for other types and even when not using these operations). Another reason why this is important when rasterizing the data is that if the CRS of the data does not match the displayed projection, all the data will be projected every time you zoom or pan, which can be very slow. Deciding whether to `project` is therefore a tradeoff between projecting the raw data ahead of time or accepting the overhead on dynamic zoom and pan actions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rasm = xr.tutorial.open_dataset('rasm').load().isel(time=0)\n", "rasm" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rasm.hvplot.quadmesh(\n", " 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,\n", " tiles=xyz.Esri.WorldGrayCanvas, project=True, rasterize=True,\n", " xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second option that's still relatively slow for larger data but avoids sending large data into your browser is to plot the data using `contour` and `contourf` visualizations, generating a line or filled contour with a discrete number of levels:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rasm.hvplot.contourf(\n", " 'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,\n", " tiles=xyz.Esri.WorldGrayCanvas, levels=10,\n", " xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Geopandas usage\n", "\n", "Since a GeoPandas `DataFrame` is just a Pandas DataFrames with additional geographic information, it inherits the `.hvplot` method. We can thus easily load geographic files (GeoJSON, Shapefiles, etc) and plot them on a map:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import geopandas as gpd\n", "import geodatasets" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = geodatasets.get_path(\"geoda.nyc_neighborhoods\")\n", "path" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nyc = gpd.read_file(path)\n", "nyc.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nyc.hvplot(geo=True, tiles=True, c=\"poptot\", alpha=0.8, tiles_opts={'alpha': 0.5})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The GeoPandas support allows plotting `GeoDataFrames` containing `'Point'`, `'Polygon'`, `'LineString'` and `'LineRing'` geometries, but not ones containing a mixture of different geometry types. Calling `.hvplot` will automatically figure out the geometry type to plot, but it also possible to call `.hvplot.points`, `.hvplot.polygons`, and `.hvplot.paths` explicitly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Spatialpandas usage\n", "\n", "Spatialpandas is another powerful library for working with geometries and is optimized for rendering with datashader, making it possible to plot millions of individual geometries very quickly:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import spatialpandas as spd\n", "\n", "spd_nyc = spd.GeoDataFrame(nyc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spd_nyc.hvplot(\n", " datashade=True, project=True, aggregator='count_cat',\n", " c='boroname', color_key='Category10'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, hvPlot makes it simple to work with geographic data visually. For more complex plot types and additional details, see the [GeoViews](https://geoviews.org) documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Options\n", "\n", "The API provides various geo-specific options:\n", "\n", "- `coastline` (default=False): Whether to display a coastline on top of the plot, setting `coastline='10m'/'50m'/'110m'` specifies a specific scale\n", "- `crs` (default=None): Coordinate reference system of the data (input projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.\n", "- `features` features (default=None): A list of features or a dictionary of features and the scale at which to render it. Available features include 'borders', 'coastline', 'lakes', 'land', 'ocean', 'rivers' and 'states'. Available scales include '10m'/'50m'/'110m'.\n", "- `geo` (default=False): Whether the plot should be treated as geographic (and assume PlateCarree, i.e. lat/lon coordinates)\n", "- `global_extent` (default=False): Whether to expand the plot extent to span the whole globe\n", "- `project` (default=False): Whether to project the data before plotting (adds initial overhead but avoids projecting data when plot is dynamically updated)\n", "- `projection` (default=None): Coordinate reference system of the plot (output projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.\n", "- `tiles` (default=False): Whether to overlay the plot on a tile source. If coordinate values fall within lat/lon bounds, auto-projects to EPSG:3857 unless `projection=False` or if the data is lazily loaded (dask / ibis). Accepts the following values:\n", " - `True`: OpenStreetMap layer\n", " - `xyzservices.TileProvider` instance (requires [`xyzservices`](https://xyzservices.readthedocs.io/) to be installed)\n", " - a map string name based on one of the default layers made available by [HoloViews](https://holoviews.org/reference/elements/bokeh/Tiles.html) ('CartoDark', 'CartoLight', 'EsriImagery', 'EsriNatGeo', 'EsriUSATopo', 'EsriTerrain', 'EsriStreet', 'EsriReference', 'OSM', 'OpenTopoMap') or [GeoViews](https://geoviews.org/user_guide/Working_with_Bokeh.html) ('CartoDark', 'CartoEco', 'CartoLight', 'CartoMidnight', 'EsriImagery', 'EsriNatGeo', 'EsriUSATopo', 'EsriTerrain', 'EsriReference', 'EsriOceanBase', 'EsriOceanReference', 'EsriWorldPhysical', 'EsriWorldShadedRelief', 'EsriWorldTopo', 'EsriWorldDarkGrayBase', 'EsriWorldDarkGrayReference', 'EsriWorldLightGrayBase', 'EsriWorldLightGrayReference', 'EsriWorldHillshadeDark', 'EsriWorldHillshade', 'EsriAntarcticImagery', 'EsriArcticImagery', 'EsriArcticOceanBase', 'EsriArcticOceanReference', 'EsriWorldBoundariesAndPlaces', 'EsriWorldBoundariesAndPlacesAlternate', 'EsriWorldTransportation', 'EsriDelormeWorldBaseMap', 'EsriWorldNavigationCharts', 'EsriWorldStreetMap', 'OSM', 'OpenTopoMap'). Note that Stamen tile sources require a Stadia account when not running locally; see [stadiamaps.com](https://stadiamaps.com/).\n", " - a `holoviews.Tiles` or `geoviews.WMTS` instance or class\n", "- `tiles_opts` (default=None): Dictionary of plotting options to customize the tiles layer created when `tiles` is set, e.g. `dict(alpha=0.5)`." ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 4 }