programmernoob
programmernoob

Reputation: 47

Bar plot from pivot table with grand total and percentage per group aggregation

Here's the challenge: make a dataframe from the shipwreck.csv file. From this dataframe, build a pivot table that shows the average fares for males/females in each class, and the number of surviving males/females in each class. The row index should be the class values. Use margins to include averages for all males, females, and all passengers in each class. Print the entire frame.Then Create a bar plot that shows the survival percentage of males and females and all passengers on a per class basis. Use data from the pivot table in the previous problem. The width of the bars should be .25.

My issue is I have the dataframe built with only those specified columns but I don't understand how to get the dataframe pivot table and finding the average fare for males/females to be able to set up the graph.

Here's my code so far:

%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
matplotlib.rcParams['figure.figsize'] = (10.0, 4.0)

df =  pd.read_csv("shipwreck.csv",usecols=     
['survived','sex','fare','class'])
df.set_index('survived')
print(df)
#pivot table to get average fares for male/female then plot it
#use bar graph w/ width of.25 for bars

heres what the .csv shows from dataframe:

             survived     sex      fare   class
        0           0    male    7.2500   Third
        1           1  female   71.2833   First
        2           1  female    7.9250   Third
        3           1  female   53.1000   First
        4           0    male    8.0500   Third
        5           0    male    8.4583   Third
        6           0    male   51.8625   First
        7           0    male   21.0750   Third
        8           1  female   11.1333   Third
        9           1  female   30.0708  Second
        10          1  female   16.7000   Third
        11          1  female   26.5500   First
        12          0    male    8.0500   Third
        13          0    male   31.2750   Third
        14          0  female    7.8542   Third
        15          1  female   16.0000  Second
        16          0    male   29.1250   Third
        17          1    male   13.0000  Second
        18          0  female   18.0000   Third
        19          1  female    7.2250   Third
        20          0    male   26.0000  Second
        21          1    male   13.0000  Second
        22          1  female    8.0292   Third
        23          1    male   35.5000   First
        24          0  female   21.0750   Third
        25          1  female   31.3875   Third
        26          0    male    7.2250   Third
        27          0    male  263.0000   First
        28          1  female    7.8792   Third
        29          0    male    7.8958   Third
        ..        ...     ...       ...     ...
        861         0    male   11.5000  Second
        862         1  female   25.9292   First
        863         0  female   69.5500   Third
        864         0    male   13.0000  Second
        865         1  female   13.0000  Second
        866         1  female   13.8583  Second
        867         0    male   50.4958   First
        868         0    male    9.5000   Third
        869         1    male   11.1333   Third
        870         0    male    7.8958   Third
        871         1  female   52.5542   First
        872         0    male    5.0000   First
        873         0    male    9.0000   Third
        874         1  female   24.0000  Second
        875         1  female    7.2250   Third
        876         0    male    9.8458   Third
        877         0    male    7.8958   Third
        878         0    male    7.8958   Third
        879         1  female   83.1583   First
        880         1  female   26.0000  Second
        881         0    male    7.8958   Third
        882         0  female   10.5167   Third
        883         0    male   10.5000  Second
        884         0    male    7.0500   Third
        885         0  female   29.1250   Third
        886         0    male   13.0000  Second
        887         1  female   30.0000   First
        888         0  female   23.4500   Third
        889         1    male   30.0000   First
        890         0    male    7.7500   Third

        [891 rows x 4 columns]

heres what the bar graph should look like:

enter image description here

Upvotes: 2

Views: 1146

Answers (1)

ayorgo
ayorgo

Reputation: 3902

Here's what you can do:

df = pd.read_csv('shipwreck.csv', usecols=['survived', 'sex', 'class'])
df_piv = pd.pivot_table(df,
                        index='class',
                        columns='sex',
                        aggfunc=lambda x: 100*x.sum()/x.count(), # % per group
                        margins=True,
                        margins_name='Combined')
df_piv.columns = df_piv.columns.droplevel()
df_piv.plot.bar(rot='horizontal');

enter image description here

Upvotes: 3

Related Questions