Allan

Reputation: 27

Pandas mean() of column ignoring nan

I have a pandas DataFrame in which each row holds a NumPy array. It looks something like this:

 |   Column 1     |
 |----------------|
 |   [nan, 4, 5]  |
 |   [3, 2, 6]    |
 |   [3, 3, 4]    |

I'm trying to get the average of those arrays, using:

Avg = df['Column 1'].mean()

Even though ".mean()" skips NaN by default, that does not happen here: the row itself isn't empty, only one value inside its array is missing. I get the following result:

print(Avg)
> [nan, 3, 5]

How can I ignore the missing value from the first row? Ideally, this is what I am trying to achieve:

print(Avg)
> [3, 3, 5]

*Note that the first average should be (3+3)/2, not (3+3)/3. So filling the arrays with zeros is not an option.
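To make the arithmetic in that note concrete, here is a minimal sketch using only the first elements of the three arrays. The variable names are illustrative, and it assumes NumPy 1.17 or later for the `nan` argument of `np.nan_to_num`.

```python
import numpy as np

# First element of each row's array, taken from the table above.
first_elements = np.array([np.nan, 3, 3])

# Skipping the missing value divides by the 2 values that are present.
skip_nan = np.nanmean(first_elements)  # (3 + 3) / 2

# Filling with zero still divides by 3, which drags the average down.
zero_fill = np.nan_to_num(first_elements, nan=0.0).mean()  # (0 + 3 + 3) / 3

print(skip_nan)   # 3.0
print(zero_fill)  # 2.0
```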

Upvotes: 1

Views: 2207

Answers (1)

BENY

Reputation: 323306

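Try expanding the arrays into their own DataFrame columns first, so that mean() can skip NaN column by column: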

pd.DataFrame(df['Column 1'].tolist()).mean().tolist()
Out[127]: [3.0, 3.0, 5.0]
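For anyone who wants to run this end to end, here is a self-contained sketch. The DataFrame is a hypothetical reconstruction of the one in the question, and the `np.nanmean` variant is an alternative that is not part of the original answer; it assumes all arrays have the same length.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the frame from the question:
# one object column whose cells are NumPy arrays.
df = pd.DataFrame({
    "Column 1": [
        np.array([np.nan, 4, 5]),
        np.array([3, 2, 6]),
        np.array([3, 3, 4]),
    ]
})

# Expand each array into its own columns, then take the column-wise mean.
# DataFrame.mean() skips NaN by default (skipna=True).
expanded = pd.DataFrame(df["Column 1"].tolist())
avg_pandas = expanded.mean().tolist()

# Alternative: stack the arrays into a 2-D array and let NumPy ignore NaN.
stacked = np.vstack(df["Column 1"].tolist())
avg_numpy = np.nanmean(stacked, axis=0).tolist()

print(avg_pandas)  # [3.0, 3.0, 5.0]
print(avg_numpy)   # [3.0, 3.0, 5.0]
```

Both versions work because the NaN handling is applied per position across rows, rather than to whole array objects as in the original df['Column 1'].mean() call.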

Upvotes: 1
