Reputation: 5126
I have a task to create DataFrames based on conditions within other DataFrames.
I've been doing it the same way for about a week now, but I was curious whether there is a better way. I stumbled across this example. I know that example creates a separate column based on conditions, but it made me wonder whether my own code could be improved.
Here is a shortened version of the code from the link, for ease of use:
import pandas as pd
import numpy as np
raw_data = {'student_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
            'test_score': [76, 88, 84, 67, 53, 96, 64, 91, 77, 73, 52, np.nan]}
df = pd.DataFrame(raw_data, columns=['student_name', 'test_score'])
print(df)
# Label each score; NaN > 59 is False, so a missing score falls through to 'fail'
grades = []
for row in df['test_score']:
    if row > 59:
        grades.append('Pass')
    else:
        grades.append('fail')
df['grades'] = grades
print(df)
   student_name  test_score grades
0        Miller        76.0   Pass
1      Jacobson        88.0   Pass
2           Ali        84.0   Pass
3        Milner        67.0   Pass
4         Cooze        53.0   fail
5         Jacon        96.0   Pass
6        Ryaner        64.0   Pass
7          Sone        91.0   Pass
8         Sloan        77.0   Pass
9         Piger        73.0   Pass
10        Riani        52.0   fail
11          Ali         NaN   fail
Going along with the above example, if I did not want to create a "grades" column but instead wanted a DataFrame of everyone who passed, I would do this:
pass_df = df[df['test_score'] > 59]
print(pass_df)
Is there a better way of doing this?
Upvotes: 2
Views: 3707
Reputation: 29690
The new column can be assigned more concisely using np.where.
df['grades'] = np.where(df.test_score > 59, 'Pass', 'fail')
As for selecting the rows where the test score is greater than 59, your approach is standard. However, if you intend to treat the result as its own DataFrame, you will want to call .copy() on it, so that later modifications do not trigger a SettingWithCopyWarning.
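A minimal sketch of the filter-then-copy pattern, using a trimmed version of the question's data. The `curved_score` column is hypothetical and is only there to show a later modification of the filtered frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'student_name': ['Miller', 'Cooze', 'Jacon', 'Ali'],
    'test_score': [76, 53, 96, np.nan],
})

# Boolean indexing keeps only the rows where the condition is True.
# NaN > 59 evaluates to False, so the row with a missing score is dropped.
pass_df = df[df['test_score'] > 59].copy()

# pass_df is an explicit copy, so adding a column leaves df untouched
# and does not raise SettingWithCopyWarning.
pass_df['curved_score'] = pass_df['test_score'] + 2  # hypothetical column

print(pass_df)
```

The original index labels are preserved in `pass_df`; calling `.reset_index(drop=True)` on it would renumber the rows if that is preferred.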
Demo
>>> df['grades'] = np.where(df.test_score > 59, 'Pass', 'fail')
>>> df
   student_name  test_score grades
0        Miller        76.0   Pass
1      Jacobson        88.0   Pass
2           Ali        84.0   Pass
3        Milner        67.0   Pass
4         Cooze        53.0   fail
5         Jacon        96.0   Pass
6        Ryaner        64.0   Pass
7          Sone        91.0   Pass
8         Sloan        77.0   Pass
9         Piger        73.0   Pass
10        Riani        52.0   fail
11          Ali         NaN   fail
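Note that the student with a missing score ends up labelled 'fail', because NaN > 59 is False. If missing scores should be kept distinct, one possible sketch uses np.select. The 'missing' label is an assumption made for illustration and is not part of the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'student_name': ['Miller', 'Cooze', 'Ali'],
    'test_score': [76, 53, np.nan],
})

# Conditions are checked in order; the first one that matches wins.
conditions = [df['test_score'].isna(), df['test_score'] > 59]
choices = ['missing', 'Pass']  # 'missing' is a hypothetical label

# Rows matching none of the conditions receive the default value.
df['grades'] = np.select(conditions, choices, default='fail')

print(df)
```

With only two outcomes np.where stays the simpler choice; np.select starts to pay off once there are three or more labels.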
Upvotes: 3