Manoj Kumar G

Reputation: 502

How to select a portion of the data by a condition in an Altair chart

I would like to filter a portion of data I have on a condition. Is it possible with Altair?

I am using the below code to plot a chart.

alt.Chart(deliveries).mark_bar().encode(
    alt.X('batsman', sort=alt.EncodingSortField(field='sum(batsman_runs)', op='count', order='descending')),
    alt.Y('sum(batsman_runs)'),
    tooltip=['batsman', 'sum(batsman_runs)']
).properties(height=600, width=3000).interactive()

But since there is a lot of data, the chart has many bars. I would like to restrict the bars by a condition, for example showing only those batsmen who scored more than 4000 runs in total.

I tried using transform_filter(), but it does not work with aggregate functions (I am using 'sum' here).

alt.Chart(deliveries).mark_bar().encode(
    alt.X('batsman', sort=alt.EncodingSortField(field='sum(batsman_runs)', op='count', order='descending')),
    alt.Y('sum(batsman_runs)'),
    tooltip=['batsman', 'sum(batsman_runs)']
).properties(height=600, width=3000).interactive().transform_filter(datum.sum(batsman_runs) > 4000)

Is there a way to achieve this functionality of filtering required data by giving a condition?

Upvotes: 0

Views: 3108

Answers (1)

jakevdp

Reputation: 86433

In order to reference an aggregate within a filter transform, it needs to be computed within an aggregate transform rather than in the encoding shorthand.

Something like this should work:

import altair as alt

# Compute the per-batsman total first, so the filter can refer to it by name.
alt.Chart(deliveries).transform_aggregate(
    total_runs='sum(batsman_runs)',
    groupby=['batsman']
).transform_filter(
    "datum.total_runs > 4000"
).mark_bar().encode(
    # batsman is a name, so it is nominal (N); sort the bars by total runs.
    alt.X('batsman:N', sort=alt.EncodingSortField(field='total_runs', op='sum', order='descending')),
    alt.Y('total_runs:Q'),
    tooltip=['batsman:N', 'total_runs:Q']
).properties(height=600, width=3000).interactive()

Upvotes: 2
