Pivot Transform

The pivot transform is, in short, a way to convert long-form data to wide-form data directly without any preprocessing (see Long-form vs. Wide-form Data for more information). Pivot transforms are useful for creating matrix or cross-tabulation data, acting as an inverse to the Fold Transform.

Here is an example, using Olympic medals data:

import altair as alt
import pandas as pd

df = pd.DataFrame.from_records([
    {"country": "Norway", "type": "gold", "count": 14},
    {"country": "Norway", "type": "silver", "count": 14},
    {"country": "Norway", "type": "bronze", "count": 11},
    {"country": "Germany", "type": "gold", "count": 14},
    {"country": "Germany", "type": "silver", "count": 10},
    {"country": "Germany", "type": "bronze", "count": 7},
    {"country": "Canada", "type": "gold", "count": 11},
    {"country": "Canada", "type": "silver", "count": 8},
    {"country": "Canada", "type": "bronze", "count": 10}


The pivot transform, when combined with other elements of the Altair grammar, enables some very interesting chart types. For example, here we use pivot to create a single tooltip for values on multiple lines:

import altair as alt
from vega_datasets import data

source = data.stocks()
base = alt.Chart(source).encode(x='date:T')
columns = sorted(source.symbol.unique())
selection = alt.selection_single(
    fields=['date'], nearest=True, on='mouseover', empty='none', clear='mouseout'

lines = base.mark_line().encode(y='price:Q', color='symbol:N')
points = lines.mark_point().transform_filter(selection)

rule = base.transform_pivot(
    'symbol', value='price', groupby=['date']
    opacity=alt.condition(selection, alt.value(0.3), alt.value(0)),
    tooltip=[alt.Tooltip(c, type='quantitative') for c in columns]

lines + points + rule

Transform Options

The transform_pivot() method is built on the PivotTransform class, which has the following options:

Property Type Description
groupby array(FieldName) The optional data fields to group by. If not specified, a single group containing all data objects will be used.
limit number An optional parameter indicating the maximum number of pivoted fields to generate. The default (0) applies no limit. The pivoted pivot names are sorted in ascending order prior to enforcing the limit. Default value: 0
op string The aggregation operation to apply to grouped value field values. Default value: sum
pivot FieldName The data field to pivot on. The unique values of this field become new field names in the output stream.
value FieldName The data field to populate pivoted fields. The aggregate values of this field become the values of the new pivoted fields.