Filter#

The filter transform removes objects from a data stream based on a provided filter expression, selection, or other filter predicate. A filter can be added at the top level of a chart using the Chart.transform_filter() method. The argument to transform_filter can be one of a number of expressions and objects:

  1. A Vega expression expressed as a string or built using the expr module

  2. A Field predicate, such as FieldOneOfPredicate, FieldRangePredicate, FieldEqualPredicate, FieldLTPredicate, FieldGTPredicate, FieldLTEPredicate, or FieldGTEPredicate

  3. A Selection predicate or a selection parameter created by, for example, selection_point()

  4. A Logical operand that combines any of the above

We’ll show a brief example of each of these in the following sections.

Filter Expression#

A filter expression uses the Vega expression language, either specified directly as a string, or built using the expr module. This can be useful when, for example, selecting only a subset of data.

For example:

import altair as alt
from altair import datum

from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    (datum.year == 2000) & (datum.sex == 1)
)

Notice that, as in the Calculate transform, data values are referenced via the name datum.

Field Predicates#

Field predicates overlap somewhat in function with expression predicates, but have the advantage that their contents are validated by the schema. The examples below cover FieldEqualPredicate, FieldOneOfPredicate, and FieldRangePredicate.

Here is an example of a FieldEqualPredicate used to select just the values from year 2000 as in the above chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldEqualPredicate(field='year', equal=2000)
)

A FieldOneOfPredicate is similar, but allows selection of any number of specific values:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldOneOfPredicate(field='year', oneOf=[1900, 1950, 2000])
)

Finally, a FieldRangePredicate allows selecting values within a particular continuous range:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldRangePredicate(field='year', range=[1960, 2000])
)
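To make the semantics of these three predicates concrete, here is a minimal plain-Python sketch of what each one keeps. It does not use the Altair API: the helper names (field_equal, field_one_of, field_range, apply_filter) and the sample rows are hypothetical, and the range bounds are assumed to be inclusive, as in Vega-Lite's range predicate.

```python
# Plain-Python model of three field predicates; illustrative only, not Altair API.
rows = [
    {"year": 1900, "people": 10},
    {"year": 1950, "people": 20},
    {"year": 1960, "people": 30},
    {"year": 2000, "people": 40},
]


def field_equal(field, equal):
    """Keep rows whose field equals one value (models FieldEqualPredicate)."""
    return lambda row: row[field] == equal


def field_one_of(field, one_of):
    """Keep rows whose field is any of the listed values (models FieldOneOfPredicate)."""
    allowed = set(one_of)
    return lambda row: row[field] in allowed


def field_range(field, low, high):
    """Keep rows whose field lies in [low, high], both ends inclusive (models FieldRangePredicate)."""
    return lambda row: low <= row[field] <= high


def apply_filter(predicate, data):
    """Return only the rows for which the predicate is true."""
    return [row for row in data if predicate(row)]


years_equal = [r["year"] for r in apply_filter(field_equal("year", 2000), rows)]
years_one_of = [r["year"] for r in apply_filter(field_one_of("year", [1900, 1950, 2000]), rows)]
years_range = [r["year"] for r in apply_filter(field_range("year", 1960, 2000), rows)]

print(years_equal)   # [2000]
print(years_one_of)  # [1900, 1950, 2000]
print(years_range)   # [1960, 2000]
```

Note that the range example keeps both 1960 and 2000, because the endpoints themselves pass the filter.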

Selection Predicates#

Selection predicates can be used to filter data based on a selection. While these can be constructed directly using a SelectionPredicate class, in Altair it is often more convenient to construct them using a selection function such as selection_point(). For example, this chart uses a point selection that allows the user to click or shift-click on the bars in the bottom chart to select the data to be shown in the top chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

selection = alt.selection_point(fields=['year'])

top = alt.Chart(width=600, height=200).mark_line().encode(
    x="age:O",
    y="sum(people):Q",
    color="year:O"
).transform_filter(
    selection
)

color = alt.when(selection).then(alt.value("steelblue")).otherwise(alt.value("lightgray"))
bottom = alt.Chart(width=600, height=100).mark_bar().encode(
    x="year:O",
    y="sum(people):Q",
    color=color
).add_params(
    selection
)

alt.vconcat(top, bottom, data=pop)
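The filtering that the top chart performs can be modeled in a few lines of plain Python. This is a sketch under stated assumptions, not Altair code: passes_selection and the sample rows are hypothetical, and the sketch assumes the default behavior in which an empty selection lets every row through.

```python
# Plain-Python model of filtering by a point selection on the "year" field.
rows = [
    {"year": 1900, "people": 10},
    {"year": 1950, "people": 20},
    {"year": 2000, "people": 40},
]


def passes_selection(row, selected_years):
    """Return True if the row belongs to the selection.

    Assumption: with nothing selected, every row passes, so the
    top chart shows all years until the user clicks a bar.
    """
    if not selected_years:
        return True
    return row["year"] in selected_years


# No bars clicked yet: nothing is filtered out.
nothing_clicked = [r["year"] for r in rows if passes_selection(r, set())]

# The user clicks 1950, then shift-clicks 2000.
two_clicked = [r["year"] for r in rows if passes_selection(r, {1950, 2000})]

print(nothing_clicked)  # [1900, 1950, 2000]
print(two_clicked)      # [1950, 2000]
```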

Logical Operands#

At times it is useful to combine several types of predicates into a single filter. The operators &, | and ~ compose predicates with logical AND, OR and NOT, respectively.

For example, here we wish to plot US population distributions for all data except the years 1950-1960.

First, we use a FieldRangePredicate to select 1950-1960:

import altair as alt
from altair import datum  # used by the expression predicates below
from vega_datasets import data

source = data.population.url
chart = alt.Chart(source).mark_line().encode(
    x="age:O",
    y="sum(people):Q",
    color="year:O"
).properties(
    width=600, height=200
)

between_1950_60 = alt.FieldRangePredicate(field="year", range=[1950, 1960])

Then, we can invert this selection using ~:

# NOT between 1950-1960
chart.transform_filter(~between_1950_60)

We can further refine our filter by composing multiple predicates together. Here, the second predicate is an expression built with datum:

chart.transform_filter(~between_1950_60 & (datum.age <= 70))

When multiple predicates are passed as separate arguments, they are reduced with &:

chart.transform_filter(datum.year > 1980, datum.age != 90)
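The composition rules above can be sketched in plain Python to show which rows survive. None of this is Altair API: negate, both, filter_all, and the sample rows are hypothetical stand-ins for ~, &, and passing several predicates to transform_filter.

```python
from functools import reduce

# Hypothetical sample rows standing in for the population data.
rows = [
    {"year": 1940, "age": 30},
    {"year": 1955, "age": 30},
    {"year": 1990, "age": 80},
    {"year": 2000, "age": 90},
    {"year": 2000, "age": 40},
]


def negate(predicate):
    """Model of ~predicate."""
    return lambda row: not predicate(row)


def both(left, right):
    """Model of left & right."""
    return lambda row: left(row) and right(row)


def between_1950_60(row):
    return 1950 <= row["year"] <= 1960


def age_at_most_70(row):
    return row["age"] <= 70


def filter_all(data, *predicates):
    """Model of passing several predicates: they are reduced with AND."""
    merged = reduce(both, predicates)
    return [row for row in data if merged(row)]


# Model of: ~between_1950_60 & (datum.age <= 70)
combined = both(negate(between_1950_60), age_at_most_70)
kept_combined = [(r["year"], r["age"]) for r in rows if combined(r)]

# Model of: transform_filter(datum.year > 1980, datum.age != 90)
kept_multi = [
    (r["year"], r["age"])
    for r in filter_all(rows, lambda r: r["year"] > 1980, lambda r: r["age"] != 90)
]

print(kept_combined)  # [(1940, 30), (2000, 40)]
print(kept_multi)     # [(1990, 80), (2000, 40)]
```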

Keyword-argument constraints, which test each named field for equality, can simplify the first example from the Filter Expression section:

alt.Chart(source).mark_area().encode(
    x="age:O",
    y="people:Q",
).transform_filter(year=2000, sex=1)

Transform Options#

The transform_filter() method is built on the FilterTransform class, which has the following options:

Property: filter

Type: PredicateComposition

Description: The filter property must be a predicate definition, which can take one of the following forms:

  1. an expression (https://vega.github.io/vega-lite/docs/types.html#expression) string, where datum can be used to refer to the current data object. For example, {filter: "datum.b2 > 60"} would make the output data include only items that have values over 60 in the field b2.

  2. one of the field predicates (https://vega.github.io/vega-lite/docs/predicate.html#field-predicate): equal (https://vega.github.io/vega-lite/docs/predicate.html#field-equal-predicate), lt (https://vega.github.io/vega-lite/docs/predicate.html#lt-predicate), lte (https://vega.github.io/vega-lite/docs/predicate.html#lte-predicate), gt (https://vega.github.io/vega-lite/docs/predicate.html#gt-predicate), gte (https://vega.github.io/vega-lite/docs/predicate.html#gte-predicate), range (https://vega.github.io/vega-lite/docs/predicate.html#range-predicate), oneOf (https://vega.github.io/vega-lite/docs/predicate.html#one-of-predicate), or valid (https://vega.github.io/vega-lite/docs/predicate.html#valid-predicate).

  3. a selection predicate (https://vega.github.io/vega-lite/docs/predicate.html#selection-predicate), which defines the name of a selection that the data point should belong to (or a logical composition of selections).

  4. a logical composition (https://vega.github.io/vega-lite/docs/predicate.html#composition) of (1), (2), or (3).
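As a rough illustration of the four forms listed above, the following sketch writes each one as a Python dict shaped like the Vega-Lite JSON it corresponds to. The selection name "year_pick" is hypothetical, and the sketch assumes the Vega-Lite 5 convention of referencing a selection through the param key; the exact JSON that Altair emits may differ in detail.

```python
import json

# 1. Expression string: datum refers to the current data object.
expression_filter = {"filter": "datum.b2 > 60"}

# 2. Field predicate: here a range predicate on the "year" field.
field_filter = {"filter": {"field": "year", "range": [1950, 1960]}}

# 3. Selection predicate: "year_pick" is a hypothetical selection name,
#    and the "param" key assumes Vega-Lite 5.
selection_filter = {"filter": {"param": "year_pick"}}

# 4. Logical composition: NOT the range predicate, AND an expression string.
composed_filter = {
    "filter": {
        "and": [
            {"not": field_filter["filter"]},
            "datum.age <= 70",
        ]
    }
}

for spec in (expression_filter, field_filter, selection_filter, composed_filter):
    print(json.dumps(spec))
```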