ehudk

Reputation: 585

Direct labeling a line plot with Altair

I'm plotting a line graph in Altair (4.1.0) and would like to use direct labeling (annotations) instead of a regular legend.
Each line (say, a time series) should get a single text mark, placed at its right-most point on the x-axis, as opposed to this scatter plot example, which labels every data point.
While I'm able to use pandas to manipulate the data to get the desired results, I think it would be more elegant to have a pure-Altair implementation, but I can't seem to get it right.

For example, given the following data:

import numpy as np
import pandas as pd
import altair as alt

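np.random.seed(10)
n = 5  # number of time points; n was not defined in the original post, so this value is an assumption
time = pd.date_range(start="10/21/2020", end="10/22/2020", periods=n)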
data = pd.concat([
    pd.DataFrame({
        "time": time,
        "group": "One",
        "value": np.random.normal(10, 2, n)}),
    pd.DataFrame({
        "time": time,
        "group": "Two",
        "value": np.random.normal(5, 2, n)}).iloc[:-1]
], ignore_index=True)

I can generate a satisfactory result using pandas to create a subset that includes the last time-point for each group:

lines = alt.Chart(data).mark_line(
    point=True
).encode(
    x="time:T",
    y="value:Q",
    color=alt.Color("group:N", legend=None),  # Remove legend
)

text_data = data.loc[data.groupby('group')['time'].idxmax()]  # Subset the data for text positions
labels = alt.Chart(text_data).mark_text(
    # some adjustments
).encode(
    x="time:T",
    y="value:Q",
    color="group:N",
    text="group:N"
)

chart = lines + labels


However, when I use the full data with Altair aggregations, I get either too many labels or none: x=max(time) puts a text annotation on every point, while an explicit transform_aggregate() produces no annotations at all.

Is there a better way to obtain the above result?

Upvotes: 4

Views: 2048

Answers (1)

jakevdp

Reputation: 86320

You can do this using an argmax aggregate in the y encoding. For example, your labels layer might look like this:

labels = alt.Chart(data).mark_text(
    align='left', dx=5
).encode(
    x='max(time):T',
    y=alt.Y('value:Q', aggregate={'argmax': 'time'}),
    text='group:N',
    color='group:N',
)
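To see why this yields exactly one label per line, it may help to spell out what the aggregates compute. As far as I understand Vega-Lite's rules, once any channel in an encoding is aggregated, the un-aggregated fields (here group, used for both text and color) act as group-by keys. So max(time) gives the latest time within each group, and the argmax aggregate gives the value from the row where time is largest in that group. Below is a plain-Python sketch of that selection. The toy rows and the label_positions helper are made up for illustration and are not part of Altair.

```python
from datetime import datetime

# Hypothetical toy rows standing in for the question's DataFrame.
# Group "Two" ends one time step earlier than group "One".
rows = [
    {"time": datetime(2020, 10, 21, 0), "group": "One", "value": 9.0},
    {"time": datetime(2020, 10, 21, 12), "group": "One", "value": 11.5},
    {"time": datetime(2020, 10, 22, 0), "group": "One", "value": 10.2},
    {"time": datetime(2020, 10, 21, 0), "group": "Two", "value": 4.1},
    {"time": datetime(2020, 10, 21, 12), "group": "Two", "value": 6.3},
]


def label_positions(rows):
    """Return one row per group: the row whose time is the largest."""
    latest = {}
    for row in rows:
        best = latest.get(row["group"])
        # Keep the row with the greatest time seen so far for this group.
        if best is None or row["time"] > best["time"]:
            latest[row["group"]] = row
    return latest


positions = label_positions(rows)
for group, row in positions.items():
    # x corresponds to max(time), y to the value at argmax of time.
    print(group, row["time"].isoformat(), row["value"])
```

Each group ends up with a single (time, value) pair at its own last point, which is also what the pandas idxmax subset in the question selects. It works even when the groups end at different times.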


Upvotes: 4
