Reputation: 4787
I am using the following code to fill the NaN values and then add a column to the DataFrame containing the number of values in each row that are greater than 0. Here's the code:
df.fillna(0, inplace=True)
dfMin10 = df
dfMin10['Sum'] = (dfMin10.iloc[1:len(dfMin10.columns)] > 0).sum(1)
dfMin10
When I look at the Sum column, I still see some NaN values. Why would this be? I'm assuming my DataFrame (df) still has some NaN values even after replacing them.
Any pointers would be highly appreciated.
Upvotes: 2
Views: 635
Reputation: 761
Are you seeing NaN in the first Sum entry? This line:
branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[1:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)
Should this be:
branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[0:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)
Note the indexing starting from 0.
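To spell out why the start index matters: a single slice passed to iloc selects rows by position, not columns, so starting at 1 drops the first row. When the shorter result is assigned back as a column, pandas aligns it on the index, and the row that was dropped gets NaN. Here is a minimal sketch with a made-up three-row frame (the data and the name partial are only for illustration):

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df = pd.DataFrame({"a": [1, 0, 2], "b": [0, 0, 3]}, index=["x", "y", "z"])

# A lone slice in iloc is positional over rows: this drops row "x".
partial = (df.iloc[1:] > 0).sum(axis=1)
print(partial.index.tolist())

# Assignment aligns on the index, so the missing label "x" becomes NaN.
df["Sum"] = partial
print(df)
```

So the NaN in Sum does not mean fillna missed anything; it comes from the row that was never counted.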
Example:
import pandas

df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
df.fillna(0, inplace=True)
branchConceptsWithScoresMin10 = df
# Your original code
branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[1:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)
# This should return
#    a  b  c  d  Sum
# x  0  0  0  0  NaN
# y  0  0  0  0  0.0
# z  0  0  0  0  0.0
branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[0:] > 0).sum(1)
# There should not be any NaNs here.
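One caveat worth checking on your real data: because the slice is over rows, bounding it with len(columns) only covers every row when the frame has no more rows than columns. The sketch below uses a made-up frame with five rows and two columns to show the difference. The last variant assumes you actually wanted to skip the first column rather than the first row, which is a guess about your intent.

```python
import pandas as pd

# Hypothetical frame with more rows (5) than columns (2).
scores = pd.DataFrame({"a": [1, 0, 2, 0, 4], "b": [0, 0, 3, 1, 1]})

# Row slice bounded by the column count: only rows 0 and 1 are counted,
# so the remaining rows end up as NaN after index alignment.
truncated = scores.copy()
truncated["Sum"] = (truncated.iloc[0:len(truncated.columns)] > 0).sum(axis=1)

# Counting over the whole frame needs no slice at all.
full = scores.copy()
full["Sum"] = (full > 0).sum(axis=1)

# Assumed intent: skip the first column rather than the first row.
skip_first_col = scores.copy()
skip_first_col["Sum"] = (skip_first_col.iloc[:, 1:] > 0).sum(axis=1)

print(truncated)
print(full)
print(skip_first_col)
```

Using .copy() here also keeps each variant separate; a plain assignment such as dfMin10 = df only creates a second name for the same DataFrame.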
Upvotes: 3