KMW

Reputation: 47

Drop NaN's from row that totals one column

I have looked high and low and tried many different code snippets from this site to solve my problem. Maybe someone can make a suggestion?

I have a dataframe that looks like this: image of my dataframe

I hope that table came out right. I'm a newbie to Stack Overflow, so sorry if it didn't. I have struggled with this for hours. I finally managed to show my Total row at the bottom, but I don't want NaN to show in the one column that holds strings. Can someone tell me what on EARTH it takes to simply remove the NaN from ONE CELL in this dataframe? I'm at my wits' end.

Upvotes: 0

Views: 210

Answers (2)

Lukas

Reputation: 2312

One possible solution (including creation of the dataframe):

import pandas as pd
import numpy as np

# create base of the dataframe
df = pd.DataFrame({'gender':['male', 'female', 'others'], 'total':[484, 81, 11]})
# calculate percentage column
df['percentage'] = round(df['total']/df['total'].sum(), 2)
# create SUM row
df.loc['TOTAL'] = df.select_dtypes(np.number).sum()
# replace the NaN in the 'gender' cell of the TOTAL row with an empty string
df.loc['TOTAL', 'gender'] = ''

Result:

        gender  total   percentage
0       male    484.0   0.84
1       female  81.0    0.14
2       others  11.0    0.02
TOTAL           576.0   1.00
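The total column in the result above prints as floats (484.0, 576.0). If whole numbers are preferred, a minimal sketch, assuming the same df as in this answer, is to cast that column back to integers once the TOTAL row is in place. The cast is an addition for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same starting data as in the answer above
df = pd.DataFrame({'gender': ['male', 'female', 'others'], 'total': [484, 81, 11]})
df['percentage'] = (df['total'] / df['total'].sum()).round(2)

# Add the TOTAL row from the numeric columns only; 'gender' becomes NaN here
df.loc['TOTAL'] = df.select_dtypes(np.number).sum()

# Blank out the single NaN cell in the string column
df.loc['TOTAL', 'gender'] = ''

# Assumption: the counts are whole numbers, so casting back to int is safe
df['total'] = df['total'].astype(int)

print(df)
```

The cast only works because no NaN is left in the total column; a column that still contains NaN cannot be converted with astype(int).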

Upvotes: 1

PieCot

Reputation: 3639

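You can use fillna to replace NaNs with another value, e.g., an empty string. Assign the result back to the column; calling fillna with inplace=True on a selected column triggers a warning in recent pandas versions and may leave df unchanged:

df['Gender'] = df['Gender'].fillna('')

Or, if you prefer, with 'Other/Not Disclosed':

df['Gender'] = df['Gender'].fillna('Other/Not Disclosed')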

In both cases, NaN will no longer appear in that column when you print the DataFrame.
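As a rough, self-contained illustration of the fillna approach: the asker's actual data is only available as an image, so the frame below is a hypothetical stand-in. The 'Count' column and the 'Total' index label are assumptions; only 'Gender' comes from this answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: the Total row has no
# meaningful value in the string column, so it holds NaN there.
df = pd.DataFrame(
    {'Gender': ['male', 'female', 'others', np.nan],
     'Count': [484, 81, 11, 576]},
    index=[0, 1, 2, 'Total'],
)

# Fill NaN in the one string column only and assign the result back;
# the numeric column is not touched.
df['Gender'] = df['Gender'].fillna('')

print(df)
```

Because only the 'Gender' column is filled, any NaN in a numeric column would be left as it is.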

There are other ways to handle NaN or missing values; you can take a look here for more information.

Upvotes: 1
