Filter Transform

The filter transform removes objects from a data stream based on a provided filter expression, selection, or other filter predicate. A filter can be added at the top level of a chart using the Chart.transform_filter() method. The argument to transform_filter can be one of a number of expressions and objects:

  1. A Vega expression expressed as a string or built using the expr module

  2. A Field predicate, such as FieldOneOfPredicate, FieldRangePredicate, FieldEqualPredicate, FieldLTPredicate, FieldGTPredicate, FieldLTEPredicate, or FieldGTEPredicate

  3. A Selection predicate or object created by selection()

  4. A Logical operand that combines any of the above

We’ll show a brief example of each of these in the following sections.

Filter Expression

A filter expression uses the Vega expression language, either specified directly as a string, or built using the expr module. This can be useful when, for example, selecting only a subset of data.

For example:

import altair as alt
from altair import datum

from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    (datum.year == 2000) & (datum.sex == 1)
)

Notice that data values are referenced via the name datum, which stands for the current data object being tested.

Field Predicates

Field predicates overlap somewhat in function with expression predicates, but have the advantage that their contents are validated by the schema.

Here is an example of a FieldEqualPredicate used to select just the values from year 2000 as in the above chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldEqualPredicate(field='year', equal=2000)
)

A FieldOneOfPredicate is similar, but allows selection of any number of specific values:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldOneOfPredicate(field='year', oneOf=[1900, 1950, 2000])
)

Finally, a FieldRangePredicate allows selecting values within a particular continuous range:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldRangePredicate(field='year', range=[1960, 2000])
)

Selection Predicates

Selection predicates can be used to filter data based on a selection. While these can be constructed directly using a SelectionPredicate class, in Altair it is often more convenient to construct them using the selection() function. For example, this chart uses a multi-selection that allows the user to click or shift-click on the bars in the bottom chart to select the data to be shown in the top chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

selection = alt.selection_multi(fields=['year'])

top = alt.Chart().mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).properties(
    width=600, height=200
).transform_filter(
    selection
)

bottom = alt.Chart().mark_bar().encode(
    x='year:O',
    y='sum(people):Q',
    color=alt.condition(selection, alt.value('steelblue'), alt.value('lightgray'))
).properties(
    width=600, height=100
).add_selection(
    selection
)

alt.vconcat(
    top, bottom,
    data=pop
)

Logical Operands

At times it is useful to combine several types of predicates into a single filter. This can be accomplished using the various logical operand classes.

These are not yet part of the Altair interface (see Issue 695) but can be constructed explicitly; for example, here we plot US population distributions for all data except the years 1950-1960, by applying a LogicalNotPredicate schema to a FieldRangePredicate:

import altair as alt
from vega_datasets import data

pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).properties(
    width=600, height=200
).transform_filter(
    {'not': alt.FieldRangePredicate(field='year', range=[1950, 1960])}
)

Transform Options

The transform_filter() method is built on the FilterTransform class, which has the following options:

Property: filter

Type: PredicateComposition

Description: The filter property must be a predicate definition, which can take one of the following forms:

  1. an expression string, where datum can be used to refer to the current data object. For example, {filter: "datum.b2 > 60"} would make the output data include only items that have values over 60 in the field b2.

  2. one of the field predicates: equal, lt, lte, gt, gte, range, oneOf, or valid.

  3. a selection predicate, which defines the name of a selection that the data point should belong to (or a logical composition of selections).

  4. a logical composition of (1), (2), or (3).