Mike

Reputation: 2751

How do I calculate mean on filtered rows of a pandas dataframe and append means to all columns of original dataframe?

How can I calculate the mean of every column using ONLY the rows where bar is not equal to zero, and append the averages as a new row at the bottom, ideally in one line of code? It doesn't have to be one line, but I'm wondering why this doesn't work.

The code below seems to ignore the (df.bar != 0) filter:

df = df.append(df[(df.bar != 0)].mean(numeric_only=True), ignore_index=True)

Example df:

    foo     bar     total
0   foo1    bar1    293.09
1   foo2    0       0
2   foo3    bar3    342.3

Current Result:

0   foo     bar     total
1   foo1    bar1    293.09
2   foo2    0       0
3   foo3    bar3    342.3
4                   211.796

Desired Result:

0   foo     bar     total
1   foo1    bar1    293.09
2   foo2    0       0
3   foo3    bar3    342.3
4                   317.695

Upvotes: 1

Views: 975

Answers (1)

jezrael

Reputation: 863301

As John Galt commented, you need to compare with '0', because the 0 in column bar is a string:

df = df.append(df[(df.bar != '0')].mean(numeric_only=True), ignore_index=True)
print (df)
    foo   bar    total
0  foo1  bar1  293.090
1  foo2     0    0.000
2  foo3  bar3  342.300
3   NaN   NaN  317.695
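
To see why the original filter had no effect, here is a minimal sketch. It assumes the example frame is rebuilt by hand with bar stored as strings (the object dtype is set explicitly for illustration), and the names mask_int and mask_str are made up for this example:

```python
import pandas as pd

# Rebuild the example frame; bar holds strings, so its zero is '0', not 0.
df = pd.DataFrame({
    "foo": ["foo1", "foo2", "foo3"],
    "bar": pd.Series(["bar1", "0", "bar3"], dtype=object),
    "total": [293.09, 0.0, 342.3],
})

# A string never equals the integer 0, so this mask keeps every row.
mask_int = df["bar"] != 0

# Comparing with the string '0' drops the second row as intended.
mask_str = df["bar"] != "0"

print(mask_int.tolist())
print(mask_str.tolist())

# Mean of total over the rows that survive the string comparison.
filtered_mean = df.loc[mask_str, "total"].mean()
print(filtered_mean)
```

Checking df.dtypes (or df.bar.map(type)) first is a quick way to spot this kind of mismatch before filtering.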

If you need to remove the NaNs in the last row only, use reindex with the fill_value parameter:

s = df[(df.bar != '0')].mean(numeric_only=True).reindex(df.columns, fill_value='')
df = df.append(s, ignore_index=True)
print (df)
    foo   bar    total
0  foo1  bar1  293.090
1  foo2     0    0.000
2  foo3  bar3  342.300
3              317.695

Another solution is setting with enlargement, reusing the Series s from above:

df.loc[len(df.index)] = s
print (df)
    foo   bar    total
0  foo1  bar1  293.090
1  foo2     0    0.000
2  foo3  bar3  342.300
3              317.695
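
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions the append calls above raise an AttributeError. A possible equivalent with pd.concat is sketched below; it assumes the same hand-built example frame, and the names means, row and result are only for illustration:

```python
import pandas as pd

# Same example frame as in the question, with bar stored as strings.
df = pd.DataFrame({
    "foo": ["foo1", "foo2", "foo3"],
    "bar": pd.Series(["bar1", "0", "bar3"], dtype=object),
    "total": [293.09, 0.0, 342.3],
})

# Means of the numeric columns over rows where bar is not the string '0'.
means = df[df["bar"] != "0"].mean(numeric_only=True)

# Align to all columns, using empty strings where no mean exists.
row = means.reindex(df.columns, fill_value="")

# Turn the Series into a one-row frame and stack it under the original.
result = pd.concat([df, row.to_frame().T], ignore_index=True)
print(result)
```

Because the new row mixes empty strings with a float, the total column in result ends up with object dtype. If that matters, leaving the missing cells as NaN (skipping fill_value) may be the better choice.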

Upvotes: 4
