Defining Data

Each top-level chart object, including Chart, LayeredChart, and FacetedChart, can take a dataset as its first argument. The dataset can be specified in one of three ways: as a pandas DataFrame, as a Data object, or as a URL string pointing to a data file.

For example, here we specify data via a dataframe:

from altair import Chart
import pandas as pd

data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'],
                     'y': [5, 3, 6, 7, 2]})
Chart(data).mark_bar().encode(
    x='x',
    y='y',
)

When data is specified as a DataFrame, the encoding is quite simple, as Altair uses the data type information provided by Pandas to automatically determine the data types required in the encoding.
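To make the idea of dtype-based inference concrete, here is a minimal sketch of how column dtypes could be mapped to encoding types. The helper name guess_encoding_type and its mapping rules are assumptions made for illustration; this is not Altair's actual implementation, only an approximation of the kind of lookup involved:

```python
import pandas as pd
from pandas.api import types as pdt


def guess_encoding_type(series):
    """Hypothetical helper: map a pandas Series dtype to an encoding type name."""
    # Check booleans first, because pandas also treats them as numeric.
    if pdt.is_bool_dtype(series):
        return 'nominal'
    if pdt.is_numeric_dtype(series):
        return 'quantitative'
    if pdt.is_datetime64_any_dtype(series):
        return 'temporal'
    # Strings and anything else fall back to an unordered category.
    return 'nominal'


data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'],
                     'y': [5, 3, 6, 7, 2]})
inferred = {name: guess_encoding_type(data[name]) for name in data.columns}
print(inferred)
```

In this sketch the string column x falls through to the categorical default, while the integer column y is treated as quantitative, which is why no type markup is needed in the encoding above.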

By comparison, here we create the same chart using a Data object, with the data specified as a JSON-style list of records:

from altair import Chart, Data

data = Data(values=[{'x': 'A', 'y': 5},
                    {'x': 'B', 'y': 3},
                    {'x': 'C', 'y': 6},
                    {'x': 'D', 'y': 7},
                    {'x': 'E', 'y': 2}])
Chart(data).mark_bar().encode(
    x='x:O',  # specify ordinal data
    y='y:Q',  # specify quantitative data
)

Notice the extra markup required in the encoding: because Altair cannot infer the types within a Data object, we must specify them manually. Here we use Encoding Shorthands to specify ordinal (O) for x and quantitative (Q) for y; see Data Types below.
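As a rough illustration of what a shorthand string such as 'x:O' conveys, the sketch below splits a shorthand into a field name and a full type name. The function split_shorthand and the TYPE_CODES table are hypothetical names introduced here, not part of Altair's API, and the sketch deliberately ignores other shorthand features such as aggregates:

```python
# Single-letter type codes used in encoding shorthands.
TYPE_CODES = {
    'Q': 'quantitative',
    'O': 'ordinal',
    'N': 'nominal',
    'T': 'temporal',
}


def split_shorthand(shorthand):
    """Hypothetical helper: split 'field:CODE' into a field name and type."""
    field, sep, code = shorthand.rpartition(':')
    if sep and code in TYPE_CODES:
        return {'field': field, 'type': TYPE_CODES[code]}
    # No recognized type code: treat the whole string as the field name.
    return {'field': shorthand}


print(split_shorthand('x:O'))
print(split_shorthand('y:Q'))
print(split_shorthand('x'))
```

A shorthand without a type code yields only a field name in this sketch, which mirrors the situation where the type has to come from somewhere else, such as the DataFrame's dtypes.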

We must likewise specify the data types when referencing data by URL, since Altair does not inspect the file's contents to infer them:

from altair import Chart

url = 'https://vega.github.io/vega-datasets/data/cars.json'

Chart(url).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q'
)

We will further discuss encodings and associated types in Encodings, next.