Line in Linus

Reputation: 420

Numpy 2D array get percentage of total

I just got started with NumPy.

I have the table below and would like to calculate each cell's percentage of its column total.

import numpy as np
from tabulate import tabulate

data = np.array([[7, 16, 17], [12, 11, 3]])
headers = ["Grundskola", "Gymn", "Akademisk"]

# tabulate data
table = tabulate(data, headers, tablefmt="github")

# output
print(table)

|   Grundskola |   Gymn |   Akademisk |
|--------------|--------|-------------|
|            7 |     16 |          17 |
|           12 |     11 |           3 |

I want to turn it into:

|   Grundskola |   Gymn |   Akademisk |
|--------------|--------|-------------|
|          37% |    59% |         85% |
|          63% |    41% |         15% |

I know that np.sum(data, axis=0) (or axis=1) will give me the totals, but how do I use them to calculate the percentages for the whole array?

The array can vary in size...

Upvotes: 6

Views: 10988

Answers (1)

Ch3steR

Reputation: 20669

You can try this. Use numpy.sum over axis=0 to get the column totals, then divide data by those totals. Broadcasting divides every cell by the total of its own column.

data = np.array([[7, 16, 17],
                 [12, 11, 3]])

percentages = data/data.sum(axis=0) * 100

percentages
# array([[36.84210526, 59.25925926, 85.        ],
#        [63.15789474, 40.74074074, 15.        ]])

Now pass percentages to the tabulate function.
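Since the question mentions that the array can vary in size and asks about both axis=0 and axis=1, here is a minimal sketch that wraps the same division in a helper. The name percent_of_total is hypothetical (it is not a NumPy function), and the sketch assumes a 2D array with no all-zero row or column totals.

```python
import numpy as np

def percent_of_total(data, axis=0):
    """Return each cell as a percentage of its column (axis=0) or row (axis=1) total.

    Hypothetical helper for illustration; assumes no total is zero.
    """
    data = np.asarray(data, dtype=float)
    # keepdims=True keeps the summed axis with length 1, so the totals
    # broadcast against data for either axis and for any 2D shape.
    totals = data.sum(axis=axis, keepdims=True)
    return data / totals * 100

data = np.array([[7, 16, 17],
                 [12, 11, 3]])

col_pct = percent_of_total(data, axis=0)  # percentage of column total
row_pct = percent_of_total(data, axis=1)  # percentage of row total

print(col_pct.round(2))
print(col_pct.sum(axis=0))  # sanity check: every column adds up to 100
print(row_pct.sum(axis=1))  # sanity check: every row adds up to 100
```

For axis=0 the plain data / data.sum(axis=0) from the answer already broadcasts correctly. keepdims=True only matters for axis=1, where the row totals would otherwise have shape (rows,) and fail to line up with the columns.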


You can also use the format specification mini-language to format the values as percentage strings, as below.

perc = data / data.sum(axis=0)
# array([[0.36842105, 0.59259259, 0.85      ],
#        [0.63157895, 0.40740741, 0.15      ]])

print(np.array([[f"{i:.2%}" for i in val] for val in perc]))
# [['36.84%' '59.26%' '85.00%']
#  ['63.16%' '40.74%' '15.00%']]
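To get whole-number percentages like those in the question's target table, the same idea works with the ".0%" format. The sketch below builds a plain list of string rows. The variable name cells is only illustrative.

```python
import numpy as np

data = np.array([[7, 16, 17],
                 [12, 11, 3]])
headers = ["Grundskola", "Gymn", "Akademisk"]

# Fractions of each column total (values between 0 and 1).
perc = data / data.sum(axis=0)

# One list of strings per row; ".0%" multiplies by 100, rounds to a
# whole number and appends the percent sign.
cells = [[f"{value:.0%}" for value in row] for row in perc]

print(cells)
# [['37%', '59%', '85%'], ['63%', '41%', '15%']]
```

Passing the result to tabulate(cells, headers, tablefmt="github", stralign="right") should then print the table with right-aligned percentage strings, assuming the tabulate package is installed.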

Upvotes: 6
