Reputation: 449
I have a dataframe where I want to count all the days a student was present. The dataframe headers are the days of the month, and I want to count the frequency of the character 'P'
row-wise over all the columns and store the result in a new column. What I have done until now is define a function that should accept each row and count the frequency of 'P':
def count_P(list):
    frequency = 0
    for item in list:
        if item == 'P':
            frequency += 1
    return frequency
And then I am trying to apply this function which is what I am confused about:
df['Attendance'] = df.apply(lambda x: count_P(x) for x in , axis = 1)
In the above line I need to pass x every time as a row of the dataframe, so do I write
for x in range(df.iloc[0],df.iloc[df.shape[0]])
? But that gives me a SyntaxError. And do I need axis here? Or does it need to be done in some other way?
Edit: The error message I am getting:
df['Attendance'] = df.apply(lambda x: count_P(x) for x in range(df.iloc[0],df.iloc[df.shape[0]]),axis=1)
^
SyntaxError: Generator expression must be parenthesized
Upvotes: 0
Views: 687
Reputation: 7510
Assuming your dataframe looks like this:
df = pd.DataFrame({'2021-03-01': ['P','P'], '2021-03-02': ['P','X']})
You can do:
df["p_count"] = (df == 'P').sum(axis=1)
This yields:
  2021-03-01 2021-03-02  p_count
0          P          P        2
1          P          X        1
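If you would rather keep the count_P function from the question, a minimal sketch of the row-wise version might look like the following. With axis=1, apply already passes each row to the function, so no explicit for loop or range is needed. The parameter name row and the helper variable day_cols are my own choices for illustration, and the sample data is the same assumed dataframe as above.

```python
import pandas as pd


def count_P(row):
    """Count how many cells in a single row equal 'P'."""
    frequency = 0
    for item in row:
        if item == 'P':
            frequency += 1
    return frequency


# Same assumed sample data as in the answer above.
df = pd.DataFrame({'2021-03-01': ['P', 'P'], '2021-03-02': ['P', 'X']})

# Capture the day columns before adding any new column (hypothetical helper name).
day_cols = list(df.columns)

# axis=1 makes apply call count_P once per row, passing that row as a Series.
df['Attendance'] = df[day_cols].apply(count_P, axis=1)

# Vectorized equivalent: compare every cell to 'P', then sum the True values per row.
vectorized = (df[day_cols] == 'P').sum(axis=1)
```

Both approaches should give the same counts. The vectorized comparison is usually preferable on larger dataframes because it avoids calling a Python function for every row.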
Upvotes: 4