Pandas - dividing each cell by column sum - But it returns the same value

Question

I encountered a very strange (and frustrating) issue with Pandas. I want to divide each cell in the dataframe by the sum of its column. I have already googled and used the suggested answer, but it doesn't work: every cell in a given row comes out with the SAME VALUE.

import numpy as np
import pandas as pd

dfs = pd.DataFrame(np.random.randint(0, 10, size=(3, 3)), columns=['A', 'B', 'C'])
# Now here is the copied solution from google
dfs = dfs.div(dfs.sum(axis=0), axis=1)

For easy examples like the one above it works very well. But as soon as I tried it on my own dataframe, which has 1080 columns, every value within a row came out the same.

I have made sure to drop all NaN, inf, and any other non-numeric values, and the dtype of every column is float64. I am not sure why this is happening; could anyone give me some ideas about what is wrong? I suspect it is because of the size of the dataframe, but surely 1080 columns and 8 rows shouldn't be too much for Pandas to handle?

Thanks in advance

Edit: Yes, run this code to get the first 2 columns of my dataframe.

dfs = pd.DataFrame({
    '7006091': [2.219749271, 2.15577658, 1.857604216, 1.588101736,
                0.925926932, 1.413871811, 1.528702513, 1.313778722],
    '7007772': [2.21238513, 2.148624672, 1.851441511, 1.582833121,
                0.922855119, 1.409181214, 1.523630958, 1.309420189],
})

I just tried dfs.update as suggested and it didn't work either. This is what was returned with:

dfs.update(dfs.div(dfs.sum(axis=0),axis=1))

BENY · Accepted Answer

You get the same output because your columns have the same distribution: each column is just a constant multiple of the other. Check the ratio between them:

dfs['7006091']/dfs['7007772']
0    1.003329
1    1.003329
2    1.003329
3    1.003329
4    1.003329
5    1.003329
6    1.003329
7    1.003329
dtype: float64

So the columns end up with the same values once they are standardized by the column sum: the constant factor between them cancels out in the division.
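To see the cancellation in isolation, here is a minimal sketch using made-up data rather than the asker's full dataframe. The column names 'A' and 'B', the values in `a`, and the scale factor are all assumptions chosen for illustration. If B = c * A, then B / sum(B) = c * A / (c * sum(A)) = A / sum(A), so both columns should normalize to the same numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'B' is column 'A' multiplied by a constant.
a = np.array([2.0, 1.5, 1.0, 0.5])
df = pd.DataFrame({'A': a, 'B': a * 1.003329})

# Divide each cell by its column sum (the same operation as in the question).
normalized = df.div(df.sum(axis=0), axis=1)

# The constant factor cancels, so the two normalized columns agree
# up to floating-point rounding.
same_across_columns = np.allclose(normalized['A'], normalized['B'])

print(normalized)
print(same_across_columns)
```

If the columns of the real dataframe are not expected to be proportional to each other, checking the ratio between a few pairs of the raw columns, as above, is a quick way to find out whether the input data itself is the cause.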
