pandas question: Remove missing values by column

Question

I have a dataframe called teams. Each column is an NFL team, and each row is how much a given fan would pay to attend that team's game. It looks like:

team1	team2	team3
40	NaN	50
NaN	NaN	80
75	30	NaN

I want to compare the standard deviations of each column, so obviously I need to remove the NaNs. I want to do this column-wise though, so that I don't just remove all rows where one value is NaN because I'll lose a lot of data. What's the best way to do this? I have a lot of columns, otherwise I would just make a numpy array representing each column.

mozway · Accepted Answer

Your assumption is incorrect.

"I want to compare the standard deviations of each column, so obviously I need to remove the NaNs"

By default, std skips NaN values in each column (skipna=True), so you don't need to remove anything first. Just use:

df.std()

Output:

team1    24.748737
team2          NaN
team3    21.213203
dtype: float64
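
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example frame from the question. It assumes the frame is named teams as in the question (the answer calls it df), and the names col_std, n_valid and values are only illustrative. It also shows one way to get the non-missing values per column, in case they are needed for something other than std:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
teams = pd.DataFrame({
    "team1": [40, np.nan, 75],
    "team2": [np.nan, np.nan, 30],
    "team3": [50, 80, np.nan],
})

# std() works column by column, skips NaN by default (skipna=True)
# and uses the sample standard deviation (ddof=1).
col_std = teams.std()
print(col_std)

# count() reports how many non-NaN values each column's result is based on.
n_valid = teams.count()
print(n_valid)

# If the raw non-missing values are needed anyway, drop NaN per column.
# The arrays end up with different lengths, so keep them in a dict.
values = {name: col.dropna().to_numpy() for name, col in teams.items()}
print(values)
```

team2 comes out as NaN because it has only one non-missing value, and the sample standard deviation (ddof=1) is undefined for a single observation. Checking count() alongside std() makes that easy to spot.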
