Reputation: 198
I am trying to achieve a custom sort order for a series of categories in a grouped bar chart.
An issue on GitHub from a few years ago suggests how to do this: by supplying a list with the custom sort order to alt.Column. Supplying a list as the sort order is also mentioned in Joel Ostblom's answer to this related Q.
Omitting the sort parameter from alt.Column() results in an alphabetical sorting:
Supplying a custom sort order appears to have no effect (equivalent to sort=None, I think), since the columns appear in the same order as they appear in the data:
sort=["easy to use", "easy to understand", "complete", "representative",
      "visually appealing", "b&l useful", "rating"]
Where am I going wrong in applying sort order?
My data is shaped as follows:
     timestamp            value  attribute       system
0    23/03/2024 16:28:33  4.0    representative  A
1    23/03/2024 16:28:33  5.0    representative  B
2    23/03/2024 16:28:33  4.0    representative  C
3    23/03/2024 16:28:33  3.0    easy to use     A
4    23/03/2024 16:28:33  5.0    easy to use     B
...  ...                  ...    ...             ...
99   23/03/2024 19:24:41  5.0    b&l useful      B
100  23/03/2024 19:24:41  5.0    b&l useful      C
101  23/03/2024 19:24:41  3.0    rating          A
102  23/03/2024 19:24:41  5.0    rating          B
103  23/03/2024 19:24:41  5.0    rating          C
The attributes I am grouping by are:
d1['attribute'].unique()
array(['representative', 'easy to use', 'easy to understand', 'complete',
       'visually appealing', 'b&l useful', 'rating'], dtype=object)
and I create the bar plot via:
alt.Chart(d1).mark_bar().encode(
    alt.Y('mean(value):Q', scale=alt.Scale(domain=[1, 5])),
    alt.X('system:N', axis=alt.Axis(labelAngle=-0, title=None)),
    alt.Column(
        'attribute:N',
        sort=["easy to use", "easy to understand", "complete", "representative",
              "visually appealing", "b&l useful", "rating"],
        header=alt.Header(title=None, labelOrient='bottom'),
    ),
    color='system:N',
    tooltip=['mean(value)', 'attribute', 'system'],
)
Upvotes: 1
Views: 312
Reputation: 942
Doing the aggregation in a transform does work. Example using different data:
import altair as alt
from vega_datasets import data

source = data.barley()

# Shared encodings: facet by site with a custom column order
base = alt.Chart(source).mark_bar().encode(
    alt.Column('site:N',
               sort=["Waseca", "Morris", "University Farm",
                     "Grand Rapids", "Crookston", "Duluth"]
    ),
    x='year:O',
    color='year:N',
)

# chart1 aggregates inside the encoding; chart2 aggregates in a transform first
chart1 = base.encode(y=alt.Y('sum(yield):Q'))
chart2 = base.transform_aggregate(
    y2="sum(yield)", groupby=['site', 'year']
).encode(y=alt.Y('y2:Q'))

chart1 | chart2
Upvotes: 1
Reputation: 198
xOffset for a visually-similar result
Instead of using a Column grouping, it is possible to use xOffset (one of several channels available in Altair charts) to achieve a similar outcome.
alt.Chart(d1).mark_bar().encode(
    alt.Y('mean(value):Q', scale=alt.Scale(domain=[1, 5])),
    alt.X('attribute:N',
          axis=alt.Axis(labelAngle=-0, title=None),
          sort=["easy to use", "easy to understand", "complete", "representative",
                "visually appealing", "b&l useful", "rating"],
    ),
    alt.XOffset('system:N'),
    color='system:N',
    tooltip=['mean(value)', 'attribute', 'system', 'count()'],
)
The sort parameter is moved to the x-axis, where it produces the correct result: a custom sort order based on the supplied list.
It is also possible to reorder the data itself and let the sort be based on source data order.
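A minimal sketch of reordering the data itself, assuming a pandas DataFrame with the question's column names (the small d1 frame below is a made-up stand-in, not the real data):

```python
import pandas as pd

# Hypothetical stand-in for d1, using the column names from the question
d1 = pd.DataFrame({
    "attribute": ["representative", "representative", "easy to use",
                  "easy to use", "rating", "complete"],
    "system": ["A", "B", "A", "B", "A", "A"],
    "value": [4.0, 5.0, 3.0, 5.0, 3.0, 4.0],
})

order = ["easy to use", "easy to understand", "complete", "representative",
         "visually appealing", "b&l useful", "rating"]

# Map each attribute to its position in the desired order
rank = {name: i for i, name in enumerate(order)}

# Stable sort by that position, so rows within an attribute keep their order
d1_sorted = d1.sort_values(
    "attribute", key=lambda s: s.map(rank), kind="mergesort"
).reset_index(drop=True)
```

The reordered frame can then be passed to alt.Chart with sort=None on the relevant encoding, which tells Altair to keep the order in which values appear in the data.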
As discussed on the related issue, there is a bug in Vega-Lite, which Altair is a Python wrapper for. Quoting Joel Ostblom from there:
After some exploring, it seems to be an issue with aggregating and sorting facets at the same time. I had no idea this was an existing bug, but it has been reported in VL already here vega/vega-lite#5366; you might be able to ask for workaround there and thumbs up that issue.
The existing bug report notes that Vega-Lite does not produce the correct ordering when supplied a custom order.
Upvotes: 1