Aggregate Transforms

There are two ways to aggregate data within Altair: within the encoding itself, or using a top-level aggregate transform.

The aggregate property of a field definition can be used to compute aggregate summary statistics (e.g., median, min, max) over groups of data.

If at least one field in the specified encoding channels contains an aggregate, the resulting visualization will show aggregate data. In this case, all fields without a specified aggregation function are treated as group-by fields in the aggregation process.

For example, the following bar chart shows the mean of Acceleration, grouped by the number of Cylinders.

import altair as alt
from vega_datasets import data

cars = data.cars.url

alt.Chart(cars).mark_bar().encode(
    y='Cylinders:O',
    x='mean(Acceleration):Q',
)

The Altair shorthand string:

# ...
x='mean(Acceleration):Q',
# ...

is made available for convenience, and is equivalent to the longer form:

# ...
x=alt.X(field='Acceleration', aggregate='mean', type='quantitative'),
# ...

For more information on shorthand encoding specifications, see Binning and Aggregation.

The same plot can be produced with an explicitly computed aggregation, using the transform_aggregate() method:

alt.Chart(cars).mark_bar().encode(
    y='Cylinders:O',
    x='mean_acc:Q'
).transform_aggregate(
    mean_acc='mean(Acceleration)',
    groupby=["Cylinders"]
)

For a list of available aggregates, see Binning and Aggregation.

Transform Options

The transform_aggregate() method is built on the AggregateTransform class, which has the following options:

Property    Type                         Description
aggregate   array(AggregatedFieldDef)    Array of objects that define fields to aggregate.
groupby     array(FieldName)             The data fields to group by. If not specified, a single group containing all data objects will be used.

The AggregatedFieldDef objects have the following options:

Property    Type           Description
as          FieldName      The output field names to use for each aggregated field.
field       FieldName      The data field for which to compute the aggregate function. This is required for all aggregation operations except "count".
op          AggregateOp    The aggregation operation to apply to the fields (e.g., "sum", "average", or "count"). See the full list of supported aggregation operations for more information.