datascience.tables.Table.group
- Table.group(column_or_label, collect=None)
Group rows by unique values in a column; count or aggregate others.
- Args:
    column_or_label: values to group (column label or index, or array)

    collect: a function applied to values in other columns for each group

- Returns:
    A Table with each row corresponding to a unique value in column_or_label, where the first column contains the unique values from column_or_label, and the second contains counts for each of the unique values. If collect is provided, a Table is returned with all original columns, each containing values calculated by first grouping rows according to column_or_label, then applying collect to each set of grouped values in the other columns.

- Note:
    The grouped column will appear first in the result table. If collect does not accept arguments with one of the column types, that column will be empty in the resulting table.
>>> marbles = Table().with_columns(
...    "Color", make_array("Red", "Green", "Blue", "Red", "Green", "Green"),
...    "Shape", make_array("Round", "Rectangular", "Rectangular", "Round", "Rectangular", "Round"),
...    "Amount", make_array(4, 6, 12, 7, 9, 2),
...    "Price", make_array(1.30, 1.30, 2.00, 1.75, 1.40, 1.00))
>>> marbles
Color | Shape       | Amount | Price
Red   | Round       | 4      | 1.3
Green | Rectangular | 6      | 1.3
Blue  | Rectangular | 12     | 2
Red   | Round       | 7      | 1.75
Green | Rectangular | 9      | 1.4
Green | Round       | 2      | 1
>>> marbles.group("Color") # just gives counts
Color | count
Blue  | 1
Green | 3
Red   | 2
>>> marbles.group("Color", max) # takes the max of each grouping, in each column
Color | Shape max   | Amount max | Price max
Blue  | Rectangular | 12         | 2
Green | Round       | 9          | 1.4
Red   | Round       | 7          | 1.75
>>> marbles.group("Shape", sum) # sum doesn't make sense for strings
Shape       | Color sum | Amount sum | Price sum
Rectangular |           | 27         | 4.7
Round       |           | 13         | 4.05
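
To make the grouping rule concrete, the behavior described above can be approximated in plain Python. This is only a sketch and not the library's implementation: `group_sketch` is a hypothetical helper, a dict of lists stands in for a Table, the unique values are sorted because the example output appears sorted, and a `TypeError` raised by `collect` is assumed to be what leaves a cell empty.

```python
def group_sketch(columns, label, collect=None):
    """Hypothetical approximation of the grouping described above.

    columns: dict mapping column label -> list of values (equal lengths).
    label:   the column to group by.
    collect: optional function applied to each group's values.
    """
    group_values = columns[label]
    # Assumption: unique values are listed in sorted order.
    keys = sorted(set(group_values))

    if collect is None:
        # No collect function: report how many rows fall in each group.
        return {label: keys, "count": [group_values.count(k) for k in keys]}

    # The grouped column comes first in the result.
    result = {label: keys}
    for other, values in columns.items():
        if other == label:
            continue
        collected = []
        for k in keys:
            grouped = [v for g, v in zip(group_values, values) if g == k]
            try:
                collected.append(collect(grouped))
            except TypeError:
                # Assumption: an incompatible column type yields an empty cell.
                collected.append("")
        result["{} {}".format(other, collect.__name__)] = collected
    return result


# Same data as the marbles table, held as plain lists.
marbles = {
    "Color": ["Red", "Green", "Blue", "Red", "Green", "Green"],
    "Shape": ["Round", "Rectangular", "Rectangular", "Round", "Rectangular", "Round"],
    "Amount": [4, 6, 12, 7, 9, 2],
    "Price": [1.30, 1.30, 2.00, 1.75, 1.40, 1.00],
}

counts = group_sketch(marbles, "Color")
maxima = group_sketch(marbles, "Color", max)
sums = group_sketch(marbles, "Shape", sum)
```

With the marbles data, `counts`, `maxima`, and `sums` mirror the three grouped tables in the example, including the empty "Color sum" column, since the built-in `sum` cannot add strings to its integer start value.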