Hist

import hvplot.pandas  # noqa

# hvplot.extension("matplotlib")

hist is often a good first look at continuous data because it shows how the values are distributed. Similar methods include kde (also available as density).

from bokeh.sampledata.autompg import autompg_clean

autompg_clean.sample(n=5)

	mpg	cyl	displ	hp	weight	accel	yr	origin	name	mfr
205	20.0	4	130.0	102	3150	15.7	76	Europe	volvo 245	volvo
230	16.0	8	351.0	149	4335	14.5	77	North America	ford thunderbird	ford
283	17.0	8	305.0	130	3840	15.4	79	North America	chevrolet caprice classic	chevrolet
76	22.0	4	121.0	76	2511	18.0	72	Europe	volkswagen 411 (sw)	volkswagen
162	21.0	6	231.0	110	3039	15.0	75	North America	buick skyhawk	buick

autompg_clean.hvplot.hist("weight")
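
To make concrete what a histogram summarizes, here is a minimal sketch of the underlying binning using NumPy directly. The weights are made-up values, not the autompg data, and the choice of four bins is only for illustration; it is not meant to mirror how hvPlot computes its bins internally.

```python
import numpy as np

# Made-up vehicle weights for illustration (not the autompg data).
weights = np.array([1800, 2200, 2300, 3100, 3200, 3300, 4400, 5000])

# Split the data range into four equal-width bins and count values per bin.
counts, edges = np.histogram(weights, bins=4)

# Print each bin's range next to the number of values that fall in it.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

Each bar in a histogram corresponds to one of these counts, drawn over its bin's range.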

When using by, the plots are overlaid by default. To create subplots instead, use subplots=True.

autompg_clean.hvplot.hist("weight", by="origin", subplots=True, width=250)
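
Conceptually, grouping with by yields one histogram per category. The sketch below illustrates that idea with pandas and NumPy on a small made-up table. Sharing the same bin edges across groups is an assumption made here so the counts are comparable; it is not a statement about hvPlot's internals.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration (not the autompg data).
df = pd.DataFrame(
    {
        "weight": [2000, 2100, 3500, 3600, 4200, 2500],
        "origin": ["Europe", "Asia", "North America",
                   "North America", "North America", "Europe"],
    }
)

# Assumed shared bin edges so that every group is binned the same way.
edges = [2000, 3000, 4000, 5000]

# One array of bin counts per category in the grouping column.
counts_by_origin = {
    origin: np.histogram(group["weight"], bins=edges)[0].tolist()
    for origin, group in df.groupby("origin")
}

for origin, counts in counts_by_origin.items():
    print(origin, counts)
```

Overlaying draws these per-group histograms on shared axes, while subplots=True gives each group its own panel.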

You can also plot histograms of datetime data.

import pandas as pd
from bokeh.sampledata.commits import data as commits

commits = commits.reset_index().sort_values("datetime")
commits.head(3)

	datetime	day	time
4915	2012-12-29 11:57:50-06:00	Sat	11:57:50
4914	2013-01-02 17:46:43-06:00	Wed	17:46:43
4913	2013-01-03 16:28:49-06:00	Thu	16:28:49

commits.hvplot.hist(
    "datetime",
    bin_range=(pd.Timestamp("2012-11-30"), pd.Timestamp("2017-05-01")),
    bins=54,
)
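
The bin_range and bins values above divide the interval into 54 equal slices of roughly one month each. The sketch below works through that arithmetic with pandas alone. The sample timestamps are made up and timezone-naive, and the left-closed binning via floor division is an assumption for illustration rather than hvPlot's actual implementation.

```python
import pandas as pd

start = pd.Timestamp("2012-11-30")
end = pd.Timestamp("2017-05-01")
n_bins = 54

# 54 bins need 55 equally spaced edges between start and end.
edges = pd.date_range(start, end, periods=n_bins + 1)

# Width of a single bin as a Timedelta (a little under 30 days).
bin_width = (end - start) / n_bins

# Made-up, timezone-naive timestamps for illustration.
stamps = pd.Series(
    pd.to_datetime(
        [
            "2012-12-29 11:57:50",
            "2013-01-02 17:46:43",
            "2013-01-03 16:28:49",
            "2017-04-30 09:00:00",
        ]
    )
)

# Index of the bin each timestamp falls into (0 is the first bin).
bin_index = (stamps - start) // bin_width

print(bin_width)
print(bin_index.tolist())
```

Choosing bins so that the width lands near a natural unit such as a month tends to make a datetime histogram easier to read.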

To plot the distribution of a categorical column, compute the counts with the pandas value_counts method and plot the result using .hvplot.bar.

autompg_clean["mfr"].value_counts().hvplot.bar(invert=True, flip_yaxis=True, height=500)
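
For reference, value_counts returns a Series indexed by category and sorted from most to least frequent, and that Series is what gets passed to the bar plot. A minimal sketch with made-up manufacturer names:

```python
import pandas as pd

# Made-up manufacturer column for illustration (not the autompg data).
mfr = pd.Series(["ford", "chevrolet", "ford", "volvo", "ford", "chevrolet"])

# Count occurrences of each category, most frequent first.
counts = mfr.value_counts()

print(counts)
```

Because the result is already sorted by frequency, the bars appear in that order without extra sorting.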

This web page was generated from a Jupyter notebook and not all interactivity will work on this website.
