Reputation: 95
I have a data frame with one column per CSV file (13 files). Each row represents a cluster (4 in total), and each cell holds the frequency with which that file appears in that cluster. My question is: how do I convert each cell into a percentage? This is what my data frame looks like right now:
This is what I would like it to look like:
row_0 001.csv 002.csv 003.csv 004.csv 005.csv
0 0% 0.35%
1 86.08% 0%
2 0.07% 0%
3 0.06% 1.24%
Each value in each cell is out of 10,000.
Upvotes: 1
Views: 311
Reputation: 150745
Since your total is 10000, you can just divide by 100 and format:
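import numpy as np
import pandas as pd

# random data
np.random.seed(1)
df = pd.DataFrame(np.random.choice([2345, 123, 6789], size=(5, 5)))

# each value is out of 10000, so dividing by 100 gives the percentage
df.div(100).astype('str').add('%')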
Output:
0 1 2 3 4
0 1.23% 23.45% 23.45% 1.23% 1.23%
1 23.45% 23.45% 1.23% 23.45% 1.23%
2 23.45% 67.89% 1.23% 67.89% 23.45%
3 67.89% 1.23% 67.89% 23.45% 23.45%
4 67.89% 23.45% 1.23% 67.89% 67.89%
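If you want a fixed number of decimals regardless of how the floats print, a variation on the same idea is to format each cell explicitly. This is only a sketch: the `counts` frame and its values are made up to resemble the two columns shown in the question, and `TOTAL` is assumed to be the 10,000 stated there.

```python
import pandas as pd

# Hypothetical counts: rows are clusters, columns are files (illustrative values only).
counts = pd.DataFrame(
    {"001.csv": [0, 8608, 7, 6], "002.csv": [35, 0, 0, 124]},
    index=[0, 1, 2, 3],
)

TOTAL = 10_000  # assumption: every cell is a count out of 10,000

# Convert counts to percentages, then format each cell with two decimals.
pct = counts.div(TOTAL).mul(100)
formatted = pct.apply(lambda col: col.map("{:.2f}%".format))
print(formatted)
```

Note that this prints zero as `0.00%` rather than `0%`, and the result holds strings, so keep the numeric `pct` frame around if you still need to do arithmetic on the values.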
Upvotes: 2