Marlieke

Reputation: 1

Problem plotting data in Altair: data is swapped in pyramid graph

I am trying to plot data from a pandas DataFrame that contains data about men and women at different function levels. However, when I plot the pyramid chart, the data is swapped: PhD and Assistant Professor are swapped, and so are Associate Professor and Postdoc. I can't find the problem or mistake.

import altair as alt
from vega_datasets import data
import pandas as pd

df_natuur_vrouw = df_natuur[df_natuur['geslacht'] == 'V']
df_natuur_man = df_natuur[df_natuur['geslacht'] == 'M']

df_techniek_vrouw = df_techniek[df_techniek['geslacht'] == 'V']
df_techniek_man = df_techniek[df_techniek['geslacht'] == 'M']

slider = alt.binding_range(min=2011, max=2020, step=1)
select_year = alt.selection_single(name='year', fields=['year'],
                                   bind=slider, init={'year': 2020})

base_vrouw = alt.Chart(df_natuur_vrouw).add_selection(
    select_year
).transform_filter(
    select_year
).properties(
    width=250
)

base_man = alt.Chart(df_natuur_man).add_selection(
    select_year
).transform_filter(
    select_year
).properties(
    width=250
)

color_scale = alt.Scale(domain=['M', 'V'],
                        range=['#003865', '#ee7203'])

left = base_vrouw.encode(
    y=alt.Y('functieniveau:O', axis=None),
    x=alt.X('percentage_afgerond:Q',
            title='percentage',
            scale=alt.Scale(domain=[0, 100], reverse=True)),
    color=alt.Color('geslacht:N', scale=color_scale, legend=None)
).mark_bar().properties(title='Female')

middle = base_vrouw.encode(
    y=alt.Y('functieniveau:O', axis=None, sort=['Professor', 'Associate Professor', 'Assistant Professor', 'Postdoc', 'PhD']),
    text=alt.Text('functieniveau:O'),
).mark_text().properties(width=110)

right = base_man.encode(
    y=alt.Y('functieniveau:O', axis=None),
    x=alt.X('percentage_afgerond:Q', title='percentage', scale=alt.Scale(domain=[0, 100])),
    color=alt.Color('geslacht:N', scale=color_scale, legend=None)
).mark_bar().properties(title='Male')

alt.concat(left, middle, right, spacing=5, title='Percentage male and female employees per academic level in nature sector 2011-2020')

The chart shows the data I want, but the values for PhD and Assistant Professor are swapped, and so are the values for Associate Professor and Postdoc.

Upvotes: 0

Views: 41

Answers (1)

joelostblom

Reputation: 49064

It is a little hard to tell without a sample of the data to run the code, but the problem is likely that you are sorting the middle plot but not the left and right plots. Without an explicit sort, the bar charts fall back to the default ordering of the ordinal values, so their rows no longer line up with the text labels. Try applying the same Y sort order to the bar plots as you are using for the text and see if that works.

Upvotes: 1
