Aggregate Transforms
There are two ways to aggregate data within Altair: within the encoding itself, or using a top level aggregate transform.
The aggregate property of a field definition can be used to compute aggregate summary statistics (e.g., median, min, max) over groups of data.
If at least one field in the specified encoding channels contains an aggregate, the resulting visualization will show aggregate data. In this case, all fields without a specified aggregation function are treated as group-by fields in the aggregation process.
For example, the following bar chart shows the mean of Acceleration, grouped by the number of Cylinders.
import altair as alt
from vega_datasets import data
cars = data.cars.url
alt.Chart(cars).mark_bar().encode(
    y='Cylinders:O',
    x='mean(Acceleration):Q',
)
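The group-by rule described above can be sketched in plain Python. This is only an illustration of the rule, not Altair's implementation: `split_encoding` is a hypothetical helper, and the dictionary is the long-form equivalent of the two encoding channels used in the chart.

```python
# Hypothetical helper (not part of Altair): split encoding channels into
# group-by fields and (operation, field) pairs to aggregate.
def split_encoding(encoding):
    groupby = []
    aggregates = []
    for channel_def in encoding.values():
        if "aggregate" in channel_def:
            # Channel carries an aggregate, so its field is summarized.
            aggregates.append((channel_def["aggregate"], channel_def["field"]))
        else:
            # No aggregate specified, so the field becomes a group-by field.
            groupby.append(channel_def["field"])
    return groupby, aggregates


# Long-form equivalent of y='Cylinders:O', x='mean(Acceleration):Q'
encoding = {
    "y": {"field": "Cylinders", "type": "ordinal"},
    "x": {"field": "Acceleration", "aggregate": "mean", "type": "quantitative"},
}

groupby, aggregates = split_encoding(encoding)
print(groupby)     # ['Cylinders']
print(aggregates)  # [('mean', 'Acceleration')]
```

Because only the x channel carries an aggregate, Cylinders ends up as the single group-by field.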
The Altair shorthand string:
# ...
x='mean(Acceleration):Q',
# ...
is made available for convenience, and is equivalent to the longer form:
# ...
x=alt.X(field='Acceleration', aggregate='mean', type='quantitative'),
# ...
For more information on shorthand encoding specifications, see Binning and Aggregation.
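To make the correspondence between the shorthand string and the long form concrete, here is a minimal sketch of how such a string could be split into its parts. The function name `parse_shorthand_sketch` and the regular expression are assumptions made for illustration; they handle only the `field:T` and `aggregate(field):T` forms and are not how Altair itself parses shorthand.

```python
import re

# Single-letter type codes used in shorthand strings.
TYPE_CODES = {"Q": "quantitative", "O": "ordinal", "N": "nominal", "T": "temporal"}

# Matches "field:T" or "aggregate(field):T". A simplification: other
# shorthand forms are deliberately not handled here.
SHORTHAND = re.compile(
    r"^(?:(?P<aggregate>\w+)\((?P<agg_field>\w+)\)|(?P<field>\w+)):(?P<type>[QONT])$"
)


def parse_shorthand_sketch(shorthand):
    """Return the long-form keywords for a simple shorthand string."""
    match = SHORTHAND.match(shorthand)
    if match is None:
        raise ValueError(f"unsupported shorthand: {shorthand!r}")
    result = {
        "field": match["agg_field"] or match["field"],
        "type": TYPE_CODES[match["type"]],
    }
    if match["aggregate"]:
        result["aggregate"] = match["aggregate"]
    return result


print(parse_shorthand_sketch("mean(Acceleration):Q"))
# {'field': 'Acceleration', 'type': 'quantitative', 'aggregate': 'mean'}
print(parse_shorthand_sketch("Cylinders:O"))
# {'field': 'Cylinders', 'type': 'ordinal'}
```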
The same plot can be shown using an explicitly computed aggregation, using the transform_aggregate() method:
alt.Chart(cars).mark_bar().encode(
    y='Cylinders:O',
    x='mean_acc:Q'
).transform_aggregate(
    mean_acc='mean(Acceleration)',
    groupby=["Cylinders"]
)
For a list of available aggregates, see Binning and Aggregation.
Transform Options
The transform_aggregate() method is built on the AggregateTransform class, which has the following options:
Property | Type | Description |
---|---|---|
aggregate | array( | Array of objects that define fields to aggregate. |
groupby | array( | The data fields to group by. If not specified, a single group containing all data objects will be used. |
The AggregatedFieldDef objects have the following options:
Property | Type | Description |
---|---|---|
as | | The output field names to use for each aggregated field. |
field | | The data field for which to compute aggregate function. This is required for all aggregation operations except |
op | | The aggregation operation to apply to the fields (e.g., |
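The options in the two tables above can be read as a small recipe: group the rows by the `groupby` fields, then for each entry in `aggregate` apply `op` to `field` and store the result under `as`. The following is a rough plain-Python sketch of that recipe, not Altair's or Vega-Lite's actual implementation. The function `apply_aggregate`, the `OPS` table (limited to the operations named in this section), and the sample rows are all assumptions made for illustration.

```python
from statistics import mean, median

# Only the operations named in this section; the real list is longer.
OPS = {"mean": mean, "median": median, "min": min, "max": max}


def apply_aggregate(rows, aggregate, groupby=()):
    """Group rows by the groupby fields and apply each aggregate entry."""
    groups = {}
    for row in rows:
        # With no groupby fields the key is (), so all rows share one group.
        key = tuple(row[name] for name in groupby)
        groups.setdefault(key, []).append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(groupby, key))
        for entry in aggregate:
            values = [member[entry["field"]] for member in members]
            out[entry["as"]] = OPS[entry["op"]](values)
        result.append(out)
    return result


# Made-up rows for illustration only (not taken from the cars dataset).
rows = [
    {"Cylinders": 4, "Acceleration": 15.0},
    {"Cylinders": 4, "Acceleration": 17.0},
    {"Cylinders": 8, "Acceleration": 13.0},
]
aggregate = [{"op": "mean", "field": "Acceleration", "as": "mean_acc"}]

print(apply_aggregate(rows, aggregate, groupby=["Cylinders"]))
# [{'Cylinders': 4, 'mean_acc': 16.0}, {'Cylinders': 8, 'mean_acc': 13.0}]
print(apply_aggregate(rows, aggregate))
# [{'mean_acc': 15.0}]
```

The second call omits `groupby`, so every row falls into a single group, matching the behavior described for that option.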