Reputation: 420
I just got started with numpy.
I have the np table below and would like to calculate each cell's percentage of its column total.
import numpy as np
from tabulate import tabulate
data = np.array([[7, 16, 17], [12, 11, 3]])
headers = ["Grundskola", "Gymn", "Akademisk"]
# tabulate data
table = tabulate(data, headers, tablefmt="github")
# output
print(table)
| Grundskola | Gymn | Akademisk |
|--------------|--------|-------------|
| 7 | 16 | 17 |
| 12 | 11 | 3 |
to:
| Grundskola | Gymn | Akademisk |
|--------------|--------|-------------|
| 37% | 59% | 85% |
| 63% | 41% | 15% |
I know that np.sum(data, axis=0) or np.sum(data, axis=1) will give me the totals, but how do I use them to calculate the percentages for the whole array?
The array can vary in size...
Upvotes: 6
Views: 10988
Reputation: 20669
You can try this. Use numpy.sum over axis=0 to get the column totals, then divide the array data by those totals.
data = np.array([[7, 16, 17],
[12, 11, 3]])
percentages = data/data.sum(axis=0) * 100
percentages
# array([[36.84210526, 59.25925926, 85. ],
# [63.15789474, 40.74074074, 15. ]])
Now, pass these percentages to the tabulate function.
You can use the format specification mini-language to turn them into percent strings, as below.
perc = data / data.sum(axis=0)
# array([[0.36842105, 0.59259259, 0.85 ],
# [0.63157895, 0.40740741, 0.15 ]])
print(np.array([[f"{i:.2%}" for i in val] for val in perc]))
# [['36.84%' '59.26%' '85.00%']
# ['63.16%' '40.74%' '15.00%']]
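Putting the pieces together, here is a minimal sketch of going from the raw counts to the table asked for in the question. The helper name column_percentages is made up for this example, and it assumes no column sums to zero. The cells are formatted as strings first so that tabulate prints them unchanged; if tabulate is not installed, the sketch falls back to printing the nested list.

```python
import numpy as np


def column_percentages(data):
    """Return each cell as a fraction of its column total (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    totals = data.sum(axis=0)  # one total per column, shape (n_cols,)
    return data / totals       # broadcasting divides every row by the totals


data = np.array([[7, 16, 17],
                 [12, 11, 3]])
headers = ["Grundskola", "Gymn", "Akademisk"]

perc = column_percentages(data)

# Whole-number percent strings; works for any number of rows and columns.
cells = [[f"{value:.0%}" for value in row] for row in perc]

try:
    from tabulate import tabulate  # third-party package used in the question
    print(tabulate(cells, headers, tablefmt="github"))
except ImportError:
    print(cells)
```

For row percentages instead, the totals need to keep their second dimension so they still broadcast: data / data.sum(axis=1, keepdims=True).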
Upvotes: 6