datascience.tables.Table.hist

Table.hist(*columns, overlay=True, bins=None, bin_column=None, unit=None, counts=None, group=None, rug=False, side_by_side=False, left_end=None, right_end=None, width=None, height=None, **vargs)

Plots one histogram for each column in columns. If no column is specified, plots all columns. If interactive plots are enabled via Table#interactive_plots, plotting is redirected to plotly with Table#ihist.

Kwargs:
overlay (bool): If True, plots one chart with all the histograms overlaid on top of each other (instead of one separate histogram for each column in the table). Also adds a legend that matches each bar color to its column. Note that if the histograms are not overlaid, they are not forced to the same scale.

bins (list or int): Either the lower bound of each bin in the histogram (list) or the number of bins (int). If None, bins will be chosen automatically.

bin_column (column name or index): A column of bin lower bounds. All other columns are treated as counts of these bins. If None, each value in each row is assigned a count of 1.

counts (column name or index): Deprecated name for bin_column.

unit (string): A name for the units of the plotted column (e.g. 'kg'), to be used in the plot.

group (column name or index): A column of categories. The rows are grouped by the values in this column, and a separate histogram is generated for each group. The histograms are overlaid or plotted separately depending on the overlay argument. If None, no such grouping is done.

side_by_side (bool): Whether histogram bins should be plotted side by side (instead of directly overlaid). This is meaningful only when plotting multiple histograms, either by passing several columns or by using the group option.

left_end (int or float) and right_end (int or float): (Not supported for overlaid histograms) The left and right edges of the shaded region of the histogram. If only one of these is None, it is treated as the extreme edge of the histogram. If both are None, no shading occurs.

density (bool): If True, plots a density distribution of the data. Otherwise, plots the counts.

shade_split (string, {"whole", "new", "split"}): If left_end or right_end is specified, shade_split determines how a bin is handled when the end falls between two bin endpoints. If shade_split = "whole", the entire bin will be shaded. If shade_split = "new", a new bin will be created and the data split appropriately. If shade_split = "split", the data will first be placed into the original bins, and the affected bin will then be separated into two bins with equal height.

show (bool): Whether to show the figure for interactive plots. If False, the figure is returned instead.

vargs: Additional arguments that get passed into plt.hist. See http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist for additional arguments that can be passed into vargs. These include: range, normed/density, cumulative, and orientation, to name a few.
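
The bins, density, and shade_split descriptions above can be made concrete with a small pure-Python sketch. It is an illustration only, not the library's implementation. It assumes the matplotlib convention in which a list of bins gives every edge, including the right edge of the last bin; whether Table.hist reads a list the same way is an assumption. The helper names (bin_counts, densities, split_bin_equal_height) and the boundary value 6 are hypothetical, and the two shade_split readings are one interpretation of the wording above. The plotted axis in the library may also use a different scale.

```python
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values per bin for sorted edges; the last edge closes the final bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v == edges[-1]:
            counts[-1] += 1  # the final bin includes its right edge
        elif edges[0] <= v < edges[-1]:
            counts[bisect_right(edges, v) - 1] += 1
    return counts


def densities(counts, edges):
    """Bar heights scaled so that the total bar area is 1."""
    total = sum(counts)
    return [c / (total * (hi - lo)) for c, lo, hi in zip(counts, edges, edges[1:])]


def split_bin_equal_height(counts, edges, end):
    """Cut the bin that strictly contains `end` into two bars of equal height.

    Assumed reading of shade_split="split": counts are divided in proportion
    to the widths of the two pieces, so both pieces keep the original height.
    """
    i = bisect_right(edges, end) - 1
    lo, hi = edges[i], edges[i + 1]
    left = counts[i] * (end - lo) / (hi - lo)
    new_counts = counts[:i] + [left, counts[i] - left] + counts[i + 1:]
    new_edges = edges[:i + 1] + [end] + edges[i + 1:]
    return new_counts, new_edges


values = [1, 2, 2, 10]
edges = [0, 2, 4, 12]

counts = bin_counts(values, edges)    # [1, 2, 1]
heights = densities(counts, edges)    # [0.125, 0.25, 0.03125]

# Hypothetical shading boundary that falls inside the bin [4, 12].
end = 6

# Assumed "new" reading: add the boundary as an edge and re-count the raw data.
new_edges = sorted(set(edges) | {end})
new_counts = bin_counts(values, new_edges)    # [1, 2, 0, 1]

# Assumed "split" reading: keep the original counts, then cut the bar at the boundary.
split_counts, split_edges = split_bin_equal_height(counts, edges, end)

print(counts, heights)
print(new_counts, densities(new_counts, new_edges))
print(split_counts, densities(split_counts, split_edges))
```

Under these assumptions, the "new" reading yields an empty bar over [4, 6) because no data point lies there, while the "split" reading keeps two bars of the same height over [4, 6) and [6, 12].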

>>> t = Table().with_columns(
...     'count',  make_array(9, 3, 3, 1),
...     'points', make_array(1, 2, 2, 10))
>>> t
count | points
9     | 1
3     | 2
3     | 2
1     | 10
>>> t.hist() 
<histogram of values in count>
<histogram of values in points>
>>> t = Table().with_columns(
...     'value',      make_array(101, 102, 103),
...     'proportion', make_array(0.25, 0.5, 0.25))
>>> t.hist(bin_column='value') 
<histogram of values weighted by corresponding proportions>
>>> t = Table().with_columns(
...     'value',    make_array(1,   2,   3,   2,   5  ),
...     'category', make_array('a', 'a', 'a', 'b', 'b'))
>>> t.hist('value', group='category') 
<two overlaid histograms of the data [1, 2, 3] and [2, 5]>
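
As a rough illustration of what the group and bin_column examples above describe, the following pure-Python sketch reproduces the data partitioning by hand. It does not call the library and does not plot anything; the variable names are hypothetical, and the way the library actually groups rows or applies weights internally is not shown here.

```python
from collections import defaultdict

# Rows from the group example above, as (value, category) pairs.
rows = [(1, 'a'), (2, 'a'), (3, 'a'), (2, 'b'), (5, 'b')]

# group='category': partition the 'value' column by category, so that
# each category contributes the data for its own histogram.
groups = defaultdict(list)
for value, category in rows:
    groups[category].append(value)

# bin_column='value': each row pairs a bin lower bound with a weight
# taken from the other column, instead of counting each row once.
bin_lower_bounds = [101, 102, 103]
proportions = [0.25, 0.5, 0.25]
weighted = dict(zip(bin_lower_bounds, proportions))

print(dict(groups))
print(weighted)
```

The two lists in groups match the data named in the last example's output, [1, 2, 3] and [2, 5].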