maheshakya

Reputation: 2238

Adding up values in a 2D array in Python

I have a 2D NumPy array as follows:

gona = array([['a1', 3], ['a2', 5], ['a3', 1], ['a3', 2], ['a3', 1], ['a1', 7]])

This array has 2 columns.

What I want to do is create a new array with 2 columns. Column 1 should hold the distinct values 'a1', 'a2', 'a3' in its rows, and column 2 should hold the sum of the values corresponding to each of them.

new_gona = array([['a1', 10], ['a2', 5], ['a3', 4]])

Here, corresponding values are taken as follows.

'a1' : 3 + 7 = 10
'a2' : 5 
'a3' : 1 + 2 + 1 = 4

What would be an easy method to achieve this?

Upvotes: 0

Views: 701

Answers (5)

ssm

Reputation: 5373

A list comprehension does this quite easily:

def fst(x): return x[0]
[(a, sum([int(m[1]) for m in gona if a == m[0]])) for a in set(map(fst, gona)) ]    

This is basic Python with no libraries involved. The helper function fst is defined only to avoid a lambda expression in the map at the end. Note that set does not preserve order, so the rows of the result can come out in any order. Both the Pandas and the NumPy solutions already mentioned seem pretty interesting though. +1 for both!
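
For reference, here is a minimal sketch of the same idea with a predictable row order. It assumes the rows look like what gona.tolist() would return, so the name rows is a hypothetical stand-in and the numbers are strings, as NumPy stores them:

```python
# Hypothetical stand-in for gona.tolist(): NumPy keeps the numbers as strings
rows = [['a1', '3'], ['a2', '5'], ['a3', '1'],
        ['a3', '2'], ['a3', '1'], ['a1', '7']]

# Distinct labels, sorted so the result order does not depend on set ordering
labels = sorted(set(row[0] for row in rows))

# For each label, rescan the rows and sum the matching values as integers
new_gona = [(label, sum(int(row[1]) for row in rows if row[0] == label))
            for label in labels]

print(new_gona)  # [('a1', 10), ('a2', 5), ('a3', 4)]
```

This rescans all rows once per distinct label, which is fine for small inputs but slower than a single pass for large ones.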

Upvotes: 0

Jaime

Reputation: 67427

A NumPy-only solution:

>>> labels, indices = np.unique(gona[:, 0], return_inverse=True)
>>> sums = np.bincount(indices, weights=gona[:, 1].astype(float))
>>> new_gona = np.column_stack((labels, sums))
>>> new_gona
array([['a1', '10'],
       ['a2', '5.'],
       ['a3', '4.']], 
      dtype='|S2')
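
To make the two steps easier to follow, here is a small sketch that prints the intermediate arrays. It assumes Python 3 and a recent NumPy, and it keeps the labels and sums in separate arrays, because stacking them into one array casts the sums back to strings (which is why '10.' is cut to '10' in the output above):

```python
import numpy as np

gona = np.array([['a1', 3], ['a2', 5], ['a3', 1],
                 ['a3', 2], ['a3', 1], ['a1', 7]])

# labels: sorted distinct keys; indices: position of each row's key in labels
labels, indices = np.unique(gona[:, 0], return_inverse=True)

# bincount adds up the weights of all rows that share the same index
sums = np.bincount(indices, weights=gona[:, 1].astype(float))

print(labels.tolist())   # ['a1', 'a2', 'a3']
print(indices.tolist())  # [0, 1, 2, 2, 2, 0]
print(sums.tolist())     # [10.0, 5.0, 4.0]
```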

Upvotes: 2

eric chiang

Reputation: 2755

Use pandas and its indexing magic:

import pandas as pd
import numpy as np

gona = np.array([['a1', 3], ['a2', 5], ['a3', 1], 
              ['a3', 2], ['a3', 1], ['a1', 7]])

# Create series where second items are data and first items are index
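series = pd.Series(gona[:, 1].astype(float), index=gona[:, 0])

# Compute sums across index (groupby replaces the removed sum(level=0))
sums = series.groupby(level=0).sum()

# Construct new array in the format you want
# (list() is needed on Python 3, where zip returns an iterator)
new_gona = np.array(list(zip(sums.index, sums.values)))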

new_gona
# out[]:
# array([['a1', '10.0'],
#        ['a2', '5.0'],
#        ['a3', '4.0']], 
#       dtype='|S4')

It's also worth noting that a NumPy array can only hold one datatype. Because you mix strings and numbers, every element of gona is stored as a string, so the numeric column has to be converted back to a numeric dtype such as float before summing. You can use int instead if you prefer.
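
A quick way to see this for yourself, as a small sketch assuming Python 3 (the name mixed is just for illustration):

```python
import numpy as np

# Mixing strings and integers forces NumPy to pick one common dtype
mixed = np.array([['a1', 3], ['a2', 5]])

print(mixed.dtype.kind)    # 'U': a Unicode string dtype
print(repr(mixed[0, 1]))   # the 3 is now a string, not an integer

# Convert the numeric column back before doing arithmetic on it
values = mixed[:, 1].astype(int)
print(values.sum())        # 8
```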

Upvotes: 3

Jakob Bowyer

Reputation: 34698

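from collections import defaultdict
from operator import itemgetter

sums = defaultdict(int)
for key, value in gona:
    # The array stores the numbers as strings, so convert before adding
    sums[key] += int(value)

# items() works on Python 2 and 3 (iteritems() is Python 2 only)
new_gona = sorted(sums.items(), key=itemgetter(0))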

Cheat?

Upvotes: 1

lordkain

Reputation: 3109

You have to loop over gona and store each label (such as 'a1') as a key in a dictionary. For every row, add its value to the running total kept under that key.
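
A minimal sketch of that loop, using a plain dict. The name rows is a hypothetical stand-in for gona.tolist(), with the numbers as strings because that is how the array stores them:

```python
# Hypothetical stand-in for gona.tolist()
rows = [['a1', '3'], ['a2', '5'], ['a3', '1'],
        ['a3', '2'], ['a3', '1'], ['a1', '7']]

totals = {}
for key, value in rows:
    # Start from 0 the first time a key is seen, then keep adding
    totals[key] = totals.get(key, 0) + int(value)

# Sort by label to get a stable two-column result
new_gona = sorted(totals.items())

print(new_gona)  # [('a1', 10), ('a2', 5), ('a3', 4)]
```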

Upvotes: -2
