tonysepia

Reputation: 3500

Altair chart - Custom axis formatter function

The following expression is available to me in Matplotlib:

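import datetime
import matplotlib.pyplot as plt

def format_func(value, tick_number):
    # Treat the tick value as a day offset from 1 January and return the month name
    d = datetime.date(1998,1,1) + datetime.timedelta(value)
    return d.strftime("%B")

fig, ax = plt.subplots()
ax.xaxis.set_major_formatter(plt.FuncFormatter(format_func))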

When the x-axis value is a day of the year (1-365), it is converted into the name of the corresponding month.

Can I achieve the same in Altair?

EDIT:

Thanks to @joelostblom. Adding a piece of code to specify the exact dataframe that I am using. The current problem with the code below is that in my example I have the

import datetime
import numpy as np
import pandas as pd

# pre-existing dataframe
num_days = 365 * 4 # four years
df = pd.DataFrame(
    {
    'Timestamp': [
        (datetime.datetime.now() - datetime.timedelta(num_days) ) + datetime.timedelta(days=x) 
        for x in range(num_days)
    ],
    'value': pd.Series(np.random.randn(num_days))
    }
)
df = df.set_index('Timestamp')

# extra columns that may be needed by altair
df['Month'] = df.index.month_name()
df['Year'] = df.index.year

So far I have tried using Altair in two ways:

alt.Chart(df.reset_index()).mark_line().encode(
    x='Month',
    y='value',
    color=alt.Color('Year:O', scale=alt.Scale(scheme='category10')),
)

or like you suggested:

alt.Chart(df.reset_index()).mark_line().encode(
    x=alt.X('Timestamp', axis=alt.Axis(format='%b')),
    y='value',
    color=alt.Color('Year:O', scale=alt.Scale(scheme='category10')),
)

Could you please help me understand what I am doing wrong?

Upvotes: 1

Views: 1158

Answers (1)

joelostblom

Reputation: 48889

You could convert the day of year to a datestamp using pandas.to_datetime:

import altair as alt
import numpy as np
import pandas as pd


# Setup data
x = np.arange(365)
source = pd.DataFrame({
  'x': x,
  'f(x)': np.sin(x / 50)
})

# Convert to date
source['date'] = pd.to_datetime(source['x'], unit='D', origin='2020')

alt.Chart(source).mark_line().encode(
    x='date',
    y='f(x)'
)
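
For intuition, the conversion above amounts to adding a day offset to 1 January of the origin year. Here is a minimal plain-Python sketch of the same idea. The helper name `day_offset_to_date` is hypothetical and only for illustration. The offsets are zero-based, so a 1-based day of year (1-365) would need 1 subtracted first.

```python
import datetime


def day_offset_to_date(offset, origin_year=2020):
    """Return the date `offset` days after 1 January of `origin_year`.

    Hypothetical helper for illustration; not part of pandas or Altair.
    """
    return datetime.date(origin_year, 1, 1) + datetime.timedelta(days=offset)


# Offset 0 is 1 January; offset 31 is the first day of February.
first = day_offset_to_date(0)
feb = day_offset_to_date(31)

# Abbreviated month names, the same text that the '%b' format produces.
month_names = [day_offset_to_date(x).strftime("%b") for x in (0, 31, 364)]
```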

If you want all the month names on the axis, you can change the axis formatting:

alt.Chart(source).mark_line().encode(
    x=alt.X('date', axis=alt.Axis(format='%b')),
    y='f(x)'
)

For your updated example, where there are multiple years, you can use the time unit aggregations in Altair/Vega-Lite. For example, 'monthdate(Timestamp)' disregards the year and aggregates by the month and day of the month (note that not all of the aggregations in the Vega-Lite docs are available in Altair yet).

import pandas as pd
import altair as alt 
import datetime
import numpy as np


# pre-existing dataframe
num_days = 365 * 4 # four years
df = pd.DataFrame(
    {
    'Timestamp': [
        (datetime.datetime.now() - datetime.timedelta(num_days) ) + datetime.timedelta(days=x) 
        for x in range(num_days)
    ],
    'value': pd.Series(np.random.randn(num_days))
    }
)

alt.Chart(df).mark_line().encode(
    x=alt.X('monthdate(Timestamp)', axis=alt.Axis(format='%b')),
    y='value',
    color=alt.Color('year(Timestamp):O', scale=alt.Scale(scheme='category10')),
)
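
As a rough illustration of what 'monthdate' does conceptually, the plain-Python sketch below keeps the month and day of each timestamp and discards the year, so that values from different years land on the same x position. The helper `month_day_key` and the constant `REFERENCE_YEAR` are hypothetical and chosen for this sketch; they are not a description of how Altair or Vega-Lite implements the time unit.

```python
import datetime

# Arbitrary leap year picked for this sketch so that 29 February stays valid.
REFERENCE_YEAR = 2012


def month_day_key(ts):
    """Drop the year from `ts`, keeping only month and day (hypothetical helper)."""
    return datetime.date(REFERENCE_YEAR, ts.month, ts.day)


samples = [
    datetime.datetime(2019, 3, 15, 12, 0),
    datetime.datetime(2021, 3, 15, 8, 30),
    datetime.datetime(2020, 7, 1),
]
keys = [month_day_key(ts) for ts in samples]

# The two 15 March timestamps from different years now share one key.
same_position = keys[0] == keys[1]
```

Colouring by 'year(Timestamp):O', as in the chart code above, then draws one line per year over that shared axis.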

Upvotes: 1
