eternity1

Reputation: 681

Summing elements in a list of "list", each with different index

From a loop I get a list A (called aaaa in the code below):

import numpy as np
import pandas as pd

aa = pd.Series(np.random.randn(5))
aaaa = []
aaaa.append(aa.loc[[1]])
aaaa.append(aa.loc[[4]])
aaaa

[1    0.07856
 dtype: float64, 4    0.94552
 dtype: float64]

Now I would like to sum (or apply some other calculation to) the elements in A. I tried the built-in sum function, but it does not work. For instance,

B = sum(aaaa)

gives me

1   NaN
4   NaN
dtype: float64

I found the question below and its solutions, but they do not work for my problem: the asker there has a single list, not several pd.Series appended to a list, each with a different index.

Summing elements in a list

edit4: As I have to run this multiple times, I timed both answers:

%timeit sum([i.values for i in aaaa])
3.78 µs ± 5.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit pd.concat(aaaa).sum()
560 µs ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Surprisingly, the "loop" inside sum is much faster than pd.concat(aaaa).sum().

edit5: In case someone else has the same problem: if it is not known whether the input is a pd.Series or a list of pd.Series, one could do the following:

res = sum(aa) if isinstance(aa, pd.Series) else sum([i.values for i in aa])

Upvotes: 2

Views: 292

Answers (2)

jpp

Reputation: 164843

You are passing a list to pd.Series.loc (aa.loc[[1]]), which returns a pd.Series rather than a scalar, so your list elements are pd.Series instead of scalars.

Try using pd.Series.iloc for integer indexing:

s = pd.Series(np.random.randn(5))

A = []
A.append(s.iloc[1])
A.append(s.iloc[4])

res = sum(A)

Note you could perform this calculation directly via pd.Series.sum:

res = s.iloc[[1, 4]].sum()

If you have a list of pd.Series, you can use:

res = pd.concat(A).sum()
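As a minimal sketch of the list-of-Series case, assuming fixed example values instead of random ones (the names pieces, combined and total are only illustrative):

```python
import pandas as pd

# Fixed values so the result is reproducible (the question uses random data).
s = pd.Series([0.5, 1.5, 2.5, 3.5, 4.5])

# A list selector keeps each piece as a one-element Series with its original label.
pieces = [s.loc[[1]], s.loc[[4]]]

# concat stacks the pieces into a single Series (labels 1 and 4);
# sum then reduces that Series to one number.
combined = pd.concat(pieces)
total = combined.sum()
print(total)  # 6.0
```

Because the pieces are stacked rather than added to each other, their differing index labels do not matter here.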

Upvotes: 2

ahed87

Reputation: 1360

There are many ways to get out of your predicament, and only you will know which one is best for you.

When you do aa.loc[[1]] you end up with a pd.Series; if you do aa.loc[1] you get a scalar. The same holds for .iloc.

So just dropping the second pair of square brackets in aa.loc[[1]] will make your code work.
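To illustrate the difference, here is a small sketch with fixed example values (assumed here, not taken from the question). It also shows where the NaN result likely comes from: adding two pd.Series aligns them on their index labels, and labels 1 and 4 never match.

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

as_series = s.loc[[1]]  # list selector: one-element Series with label 1
as_scalar = s.loc[1]    # single label: a scalar (numpy.float64)

# sum() starts at 0 and applies +. Series addition aligns on labels,
# so the result has labels 1 and 4, and both entries are NaN.
aligned = sum([s.loc[[1]], s.loc[[4]]])

# With scalars there are no labels to align, so this is ordinary addition.
total = sum([s.loc[1], s.loc[4]])
print(total)  # 70.0
```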

sum needs an iterable of numbers to work. So if you want to keep the second pair of square brackets, the following line will work as well, although you will now get a NumPy array instead of a float as the answer.

sum([i.values for i in aaaa])
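If a plain float is needed afterwards, one option is to unwrap the one-element array. This is a sketch with assumed example values; it relies on every piece having exactly one element.

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])
pieces = [s.loc[[1]], s.loc[[4]]]

# .values drops the index, so the arrays are added element-wise without alignment.
arr = sum(p.values for p in pieces)  # NumPy array of shape (1,)

# .item() converts a one-element array to a Python float.
value = arr.item()
print(value)  # 70.0
```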

Upvotes: 2
