Reputation: 47
Here's the challenge: make a dataframe from the shipwreck.csv file. From this dataframe, build a pivot table that shows the average fares for males/females in each class, and the number of surviving males/females in each class. The row index should be the class values. Use margins to include averages for all males, females, and all passengers in each class. Print the entire frame.Then Create a bar plot that shows the survival percentage of males and females and all passengers on a per class basis. Use data from the pivot table in the previous problem. The width of the bars should be .25.
My issue is I have the dataframe built with only those specified columns but I don't understand how to get the dataframe pivot table and finding the average fare for males/females to be able to set up the graph.
Here's my code so far:
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
matplotlib.rcParams['figure.figsize'] = (10.0, 4.0)
df = pd.read_csv("shipwreck.csv",usecols=
['survived','sex','fare','class'])
df.set_index('survived')
print(df)
#pivot table to get average fares for male/female then plot it
#use bar graph w/ width of.25 for bars
heres what the .csv shows from dataframe:
survived sex fare class
0 0 male 7.2500 Third
1 1 female 71.2833 First
2 1 female 7.9250 Third
3 1 female 53.1000 First
4 0 male 8.0500 Third
5 0 male 8.4583 Third
6 0 male 51.8625 First
7 0 male 21.0750 Third
8 1 female 11.1333 Third
9 1 female 30.0708 Second
10 1 female 16.7000 Third
11 1 female 26.5500 First
12 0 male 8.0500 Third
13 0 male 31.2750 Third
14 0 female 7.8542 Third
15 1 female 16.0000 Second
16 0 male 29.1250 Third
17 1 male 13.0000 Second
18 0 female 18.0000 Third
19 1 female 7.2250 Third
20 0 male 26.0000 Second
21 1 male 13.0000 Second
22 1 female 8.0292 Third
23 1 male 35.5000 First
24 0 female 21.0750 Third
25 1 female 31.3875 Third
26 0 male 7.2250 Third
27 0 male 263.0000 First
28 1 female 7.8792 Third
29 0 male 7.8958 Third
.. ... ... ... ...
861 0 male 11.5000 Second
862 1 female 25.9292 First
863 0 female 69.5500 Third
864 0 male 13.0000 Second
865 1 female 13.0000 Second
866 1 female 13.8583 Second
867 0 male 50.4958 First
868 0 male 9.5000 Third
869 1 male 11.1333 Third
870 0 male 7.8958 Third
871 1 female 52.5542 First
872 0 male 5.0000 First
873 0 male 9.0000 Third
874 1 female 24.0000 Second
875 1 female 7.2250 Third
876 0 male 9.8458 Third
877 0 male 7.8958 Third
878 0 male 7.8958 Third
879 1 female 83.1583 First
880 1 female 26.0000 Second
881 0 male 7.8958 Third
882 0 female 10.5167 Third
883 0 male 10.5000 Second
884 0 male 7.0500 Third
885 0 female 29.1250 Third
886 0 male 13.0000 Second
887 1 female 30.0000 First
888 0 female 23.4500 Third
889 1 male 30.0000 First
890 0 male 7.7500 Third
[891 rows x 4 columns]
heres what the bar graph should look like:
Upvotes: 2
Views: 1146
Reputation: 3902
Here's what you can do:
df = pd.read_csv('shipwreck.csv', usecols=['survived', 'sex', 'class'])
df_piv = pd.pivot_table(df,
index='class',
columns='sex',
aggfunc=lambda x: 100*x.sum()/x.count(), # % per group
margins=True,
margins_name='Combined')
df_piv.columns = df_piv.columns.droplevel()
df_piv.plot.bar(rot='horizontal');
Upvotes: 3