Reputation: 105
I want to divide every element by the sum of its row, in place. The code below always goes wrong.
I'm a pandas newbie, thanks!
df = pd.DataFrame(np.random.rand(12).reshape(3,4),columns=list('abcd'))
df_row_sum = df.apply(lambda x: x.mean(),axis=1)
df / df_row_sum
Upvotes: 9
Views: 9430
Reputation: 294248
Using @jezrael's setup
import numpy as np
import pandas as pd
np.random.seed(123)
df = pd.DataFrame(np.random.randint(10, size=12).reshape(3,4),columns=list('abcd'))
print(df)
   a  b  c  d
0  2  2  6  1
1  3  9  6  1
2  0  1  9  0
Use numpy and reconstruct a new dataframe:
v = df.values
pd.DataFrame(
v / v.sum(1, keepdims=True),
df.index, df.columns
)
          a         b         c         d
0  0.181818  0.181818  0.545455  0.090909
1  0.157895  0.473684  0.315789  0.052632
2  0.000000  0.100000  0.900000  0.000000
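As a rough sketch of why `keepdims=True` matters here, assuming the same 3x4 frame as above (hard-coded instead of seeded; the names `row_sums`, `flat_sums` and `normalized` are only illustrative):

```python
import numpy as np
import pandas as pd

# Same values as the seeded setup above, written out literally.
df = pd.DataFrame([[2, 2, 6, 1], [3, 9, 6, 1], [0, 1, 9, 0]],
                  columns=list('abcd'))

v = df.to_numpy()

# keepdims=True keeps the summed axis as length 1, giving shape (3, 1),
# which broadcasts across the 4 columns of each row.
row_sums = v.sum(axis=1, keepdims=True)

# Without keepdims the result has shape (3,), which does not broadcast
# against a (3, 4) array, so v / flat_sums would raise a ValueError.
flat_sums = v.sum(axis=1)

# Rebuild the frame with the original index and column labels.
normalized = pd.DataFrame(v / row_sums, index=df.index, columns=df.columns)
print(normalized)
```

Each row of `normalized` should then sum to 1, as long as no row sum is zero.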
Upvotes: 1
Reputation: 862591
I think you need sum or maybe mean per row (axis=1), with division by DataFrame.div:
np.random.seed(123)
df = pd.DataFrame(np.random.randint(10, size=12).reshape(3,4),columns=list('abcd'))
print(df)
   a  b  c  d
0  2  2  6  1
1  3  9  6  1
2  0  1  9  0
print(df.sum(axis=1))
0    11
1    19
2    10
dtype: int64
print(df.div(df.sum(axis=1), axis=0))
          a         b         c         d
0  0.181818  0.181818  0.545455  0.090909
1  0.157895  0.473684  0.315789  0.052632
2  0.000000  0.100000  0.900000  0.000000
print(df.mean(axis=1))
0    2.75
1    4.75
2    2.50
dtype: float64
print(df.div(df.mean(axis=1), axis=0))
          a         b         c         d
0  0.727273  0.727273  2.181818  0.363636
1  0.631579  1.894737  1.263158  0.210526
2  0.000000  0.400000  3.600000  0.000000
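For what it's worth, here is a small sketch of why the plain `df / df_row_sum` from the question probably goes wrong, and one way to get the "in place" effect. It assumes the same 3x4 frame, hard-coded; `row_sums` and `misaligned` are just illustrative names. As far as I know `DataFrame.div` has no `inplace` option, so the sketch simply rebinds `df` to the result.

```python
import numpy as np
import pandas as pd

# Same values as the seeded setup above, written out literally.
df = pd.DataFrame([[2, 2, 6, 1], [3, 9, 6, 1], [0, 1, 9, 0]],
                  columns=list('abcd'))

row_sums = df.sum(axis=1)

# Plain `/` matches the Series index (0, 1, 2) against the column
# labels (a, b, c, d). No labels match, so every cell comes out NaN.
misaligned = df / row_sums
print(misaligned.isna().all().all())

# axis=0 matches the Series index against the row index instead.
# Rebinding the name keeps the normalized frame under `df`.
df = df.div(row_sums, axis=0)
print(df)
```

If other variables still point at the original frame, they keep the old values, since rebinding creates a new object rather than modifying the old one.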
Upvotes: 15