MatmataHi

Reputation: 309

Filtering data on Titanic dataset

I am using the Titanic dataset to practice filtering data. I need to find the youngest passengers who didn't survive. This is what I have so far:

df_kids = df[(df["Survived"] == 0)][["Age","Name","Sex"]].sort_values("Age").head(10)
df_kids

Now I need to count how many of them are male and how many are female. I tried a loop, but it gives me zero for both lists every time. I don't know what I am doing wrong:

list_m = list()
list_f = list()

for i in df_kids:
    if [["Sex"] == "male"]:
        list_m.append(i)
    else:
        list_f.append(i)
len(list_m)
len(list_f)

Could you help me, please?

Thanks a lot!

Upvotes: 0

Views: 351

Answers (1)

ThePyGuy

Reputation: 18426

You can create a boolean mask. For example:

male_mask = df_kids['Sex'] == 'male'

Then use it to filter the rows:

male = df_kids[male_mask]
female = df_kids[~male_mask]  # Assuming Sex is either male or female

You can now use the shape attribute if you are only interested in the counts.

print(male.shape[0])
print(female.shape[0])
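
As for why the loop in the question does not work: iterating over a DataFrame with `for i in df_kids` yields the column labels, not the rows, and `[["Sex"] == "male"]` is a non-empty list, so the `if` branch is always taken. Here is a minimal sketch of the mask approach end to end, together with `value_counts()` as a possible shortcut. The small `df_kids` below is a made-up stand-in for the real filtered frame, not actual Titanic data.

```python
import pandas as pd

# Hypothetical stand-in for df_kids; names and ages are invented for illustration.
df_kids = pd.DataFrame({
    "Age": [1.0, 2.0, 2.0, 3.0, 4.0],
    "Name": ["A", "B", "C", "D", "E"],
    "Sex": ["male", "female", "female", "male", "female"],
})

# Comparing a column to a value gives a boolean Series (the mask).
male_mask = df_kids["Sex"] == "male"

# True counts as 1, so summing the mask counts the matching rows.
male_count = int(male_mask.sum())
female_count = int((~male_mask).sum())  # assumes Sex is only male or female

# Alternative: count every distinct value in the column at once.
sex_counts = df_kids["Sex"].value_counts()

print(male_count, female_count)
print(sex_counts)
```

`value_counts()` does not rely on the assumption that there are only two categories, so it may be the safer choice if the column can contain other values.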

Upvotes: 1
