Reputation: 1970
The code below creates a regression line; however, the legend defaults to labeling the line as "undefined." How can this regression line be labeled in the legend as "reg-line"?
import altair as alt
from vega_datasets import data
import pandas as pd
source = data.anscombe().copy()
source['line-label'] = 'x=y'
source = pd.concat([source,source.groupby('Series').agg(x_diff=('X','diff'), y_diff=('Y','diff'))],axis=1)
source['rate'] = source.y_diff/source.x_diff
source['rate-label'] = 'line y=x'
scatter = alt.Chart(source).mark_circle(size=60, opacity=0.60).encode(
    x='X:Q',
    y='Y:Q',
    color='Series:N',
    tooltip=['X', 'Y', 'rate']
)
scatter = scatter + scatter.transform_regression('X', 'Y').mark_line(opacity=0.50, shape='mark')
chart = scatter.facet(
    columns=2,
    facet=alt.Facet('Series', header=alt.Header(labelFontSize=25))
).resolve_scale(
    x='independent',
    y='independent'
)
chart.display()
Upvotes: 1
Views: 1240
Reputation: 328
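Chain .transform_fold(["reg-line"], as_=["Regression", "y"]).encode(alt.Color("Regression:N")) onto the regression layer, directly after the mark_line(...) call. The color encoding on the new "Regression" field is what gives the line its own legend entry.
The full code should look like this: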
import altair as alt
from vega_datasets import data
import pandas as pd
source = data.anscombe().copy()
source['line-label'] = 'x=y'
source = pd.concat([source,source.groupby('Series').agg(x_diff=('X','diff'), y_diff=('Y','diff'))],axis=1)
source['rate'] = source.y_diff/source.x_diff
source['rate-label'] = 'line y=x'
scatter = alt.Chart(source).mark_circle(size=60, opacity=0.60).encode(
    x='X:Q',
    y='Y:Q',
    color='Series:N',
    tooltip=['X', 'Y', 'rate']
)
scatter = scatter + scatter.transform_regression('X', 'Y').mark_line(
    opacity=0.50,
    shape='mark'
).transform_fold(
    ["reg-line"],
    as_=["Regression", "y"]
).encode(alt.Color("Regression:N"))
chart = scatter.facet(
    columns=2,
    facet=alt.Facet('Series', header=alt.Header(labelFontSize=25))
).resolve_scale(
    x='independent',
    y='independent'
)
chart.display()
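Why this works may be easier to see in a rough plain-Python imitation of the fold step. This is only a sketch of the idea, not Vega-Lite's actual implementation: the `fold` helper and the sample rows are hypothetical stand-ins for the output of transform_regression. Folding a field name that does not exist in the data ("reg-line") still produces a key column holding that name, and that constant column is what the color encoding picks up for the legend.

```python
def fold(rows, fields, as_=("key", "value")):
    """Imitate a fold transform: emit one output row per (input row, field)."""
    key_name, value_name = as_
    out = []
    for row in rows:
        for field in fields:
            new_row = dict(row)                   # keep the original columns
            new_row[key_name] = field             # the folded field's name
            new_row[value_name] = row.get(field)  # None when the field is absent
            out.append(new_row)
    return out


# Hypothetical rows standing in for the regression output (columns X and Y).
regression_rows = [{"X": 4.0, "Y": 5.0}, {"X": 14.0, "Y": 10.0}]

folded = fold(regression_rows, ["reg-line"], as_=("Regression", "y"))
for row in folded:
    print(row)
```

Each row keeps its original X and Y values and gains a "Regression" column whose value is always "reg-line". The value column is named lowercase "y" so that it does not collide with the "Y" field the line is drawn from.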
Upvotes: 5