Reputation: 585
I'm plotting a line graph in Altair (4.1.0) and would like to use direct labeling (annotations) instead of a regular legend.
As such, the text mark for each line (say, time series) should appear only once and at the right-most point of the x-axis (as opposed to this scatter plot example labeling every data point).
While I'm able to use pandas to manipulate the data to get the desired results, I think it would be more elegant to have a pure-Altair implementation, but I can't seem to get it right.
For example, given the following data:
import numpy as np
import pandas as pd
import altair as alt
np.random.seed(10)
n = 10  # number of time points per group (assumed value; n was not defined in the original snippet)
time = pd.date_range(start="10/21/2020", end="10/22/2020", periods=n)
data = pd.concat([
    pd.DataFrame({
        "time": time,
        "group": "One",
        "value": np.random.normal(10, 2, n)}),
    pd.DataFrame({
        "time": time,
        "group": "Two",
        "value": np.random.normal(5, 2, n)}).iloc[:-1]  # group "Two" ends one time-point earlier
], ignore_index=True)
I can generate a satisfactory result using pandas to create a subset that includes the last time-point for each group:
lines = alt.Chart(data).mark_line(
    point=True
).encode(
    x="time:T",
    y="value:Q",
    color=alt.Color("group:N", legend=None),  # Remove legend
)
text_data = data.loc[data.groupby('group')['time'].idxmax()]  # Subset the data for text positions
labels = alt.Chart(text_data).mark_text(
    # some adjustments
).encode(
    x="time:T",
    y="value:Q",
    color="group:N",
    text="group:N"
)
chart = lines + labels
However, if I try to use the main data and add Altair aggregations, for example using x="max(time)" or an explicit transform_aggregate(), I either get text annotations on all points or none at all (respectively).
Is there a better way to obtain the above result?
Upvotes: 4
Views: 2048
Reputation: 86320
You can do this using an argmax aggregate in the y encoding. For example, your labels layer might look like this:
labels = alt.Chart(data).mark_text(
    align='left', dx=5
).encode(
    x='max(time):T',
    y=alt.Y('value:Q', aggregate={'argmax': 'time'}),
    text='group:N',
    color='group:N',
)
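If it helps to see what the two aggregates compute, here is a rough plain-Python sketch of the same logic. It only illustrates the semantics and is not how Altair or Vega-Lite implements them; the `records` list and the `label_positions` helper are hypothetical names with made-up values. Because `group` is the only non-aggregated field in the encoding (used for both text and color), the marks are grouped by `group`, and each group collapses to a single label placed at its latest time and at the value recorded at that time.

```python
from datetime import datetime

# Toy records standing in for the question's DataFrame (values are made up).
records = [
    {"time": datetime(2020, 10, 21, 0), "group": "One", "value": 9.5},
    {"time": datetime(2020, 10, 21, 12), "group": "One", "value": 11.2},
    {"time": datetime(2020, 10, 22, 0), "group": "One", "value": 10.1},
    {"time": datetime(2020, 10, 21, 0), "group": "Two", "value": 4.8},
    {"time": datetime(2020, 10, 21, 12), "group": "Two", "value": 5.6},
]


def label_positions(rows):
    """Hypothetical helper: per group, return (max time, value on the row with max time)."""
    latest = {}
    for row in rows:
        best = latest.get(row["group"])
        # Keep the row with the largest time seen so far for this group.
        if best is None or row["time"] > best["time"]:
            latest[row["group"]] = row
    # x corresponds to max(time); y corresponds to value taken at argmax of time.
    return {group: (row["time"], row["value"]) for group, row in latest.items()}


positions = label_positions(records)
for group, (t, v) in positions.items():
    print(group, t.isoformat(), v)
```

Note that the y position is the value on the last row of each group, not the largest value in the group, which is why argmax on time is used rather than max on value.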
Upvotes: 4