Reputation: 31
I need to combine two lists and then count all the values corresponding to a certain value.
The two lists are:
inControl = ["False", "False", "True", "True","False", "True", "False", "True", "True", "False", "False", "False", "False", "False", "False", "True", "False", "True", "False", "False"]
rts = [379, 396, 480, 443, 365, 280, 487, 446, 350, 367, 405, 391, 484, 359, 367, 305, 359, 479, 436, 333]
I need to sum all of the rts corresponding to all of the False values, and then the same for True values (they are all in order).
I've got as far as combining the two lists using the zip function, but am completely lost as to what to do next. Any help would be appreciated.
Many thanks:)
Upvotes: 0
Views: 69
Reputation: 104092
You can produce tuples that have a single number in either the False position or the True position, using a list comprehension or generator expression:
>>> [(e,0) if c=='False' else (0,e) for e, c in zip(rts, inControl)]
[(379, 0), (396, 0), (0, 480), (0, 443), (365, 0), (0, 280), (487, 0), (0, 446), (0, 350), (367, 0), (405, 0), (391, 0), (484, 0), (359, 0), (367, 0), (0, 305), (359, 0), (0, 479), (436, 0), (333, 0)]
You can then sum the series of tuples with reduce (on Python 3, import it first with from functools import reduce):
>>> reduce(lambda x, y: (x[0]+y[0], x[1]+y[1]), (((e,0) if c=='False' else (0,e) for e, c in zip(rts, inControl))))
(5128, 2783)
You can actually use the False and True booleans to index the resulting tuple, since False == 0 and True == 1:
>>> t=reduce(lambda x, y: (x[0]+y[0], x[1]+y[1]), ((e if c=='False' else 0, e if c=='True' else 0) for e, c in zip(rts, inControl)))
>>> t[False]
5128
>>> t[True]
2783
Or, you can use map if that makes more sense to you:
>>> list(map(sum, zip(*((e,0) if c=='False' else (0,e) for e, c in zip(rts, inControl)))))
[5128, 2783]
Or, you can create a dict with the sums:
>>> dict(zip([False, True], map(sum, zip(*[(e,0) if c=='False' else (0,e) for e, c in zip(rts, inControl)]))))
{False: 5128, True: 2783}
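The sessions above read like Python 2. On Python 3, reduce lives in functools and map returns a lazy iterator, so a small adjustment is needed. Here is a minimal sketch of the same tuple idea, assuming the two lists from the question; the helper name split_sums is made up for illustration:

```python
from functools import reduce  # reduce is not a builtin on Python 3

inControl = ["False", "False", "True", "True", "False", "True", "False",
             "True", "True", "False", "False", "False", "False", "False",
             "False", "True", "False", "True", "False", "False"]
rts = [379, 396, 480, 443, 365, 280, 487, 446, 350, 367,
       405, 391, 484, 359, 367, 305, 359, 479, 436, 333]


def split_sums(values, flags):
    """Return (false_sum, true_sum) for parallel lists (hypothetical helper)."""
    # Each value goes into slot 0 (False) or slot 1 (True) of a 2-tuple.
    pairs = ((v, 0) if f == 'False' else (0, v) for v, f in zip(values, flags))
    # The (0, 0) initial value makes the call safe for empty input.
    return reduce(lambda x, y: (x[0] + y[0], x[1] + y[1]), pairs, (0, 0))


false_sum, true_sum = split_sums(rts, inControl)
print(false_sum, true_sum)  # 5128 2783
```

Passing an initial value to reduce is a design choice here: without it, reduce raises TypeError on an empty sequence.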
If you have Pandas, a great way to do this is with .groupby() and sum:
>>> import pandas as pd
>>> df=pd.DataFrame({'rts':rts, 'inControl':inControl})
>>> df
inControl rts
0 False 379
1 False 396
2 True 480
3 True 443
4 False 365
5 True 280
6 False 487
7 True 446
8 True 350
9 False 367
10 False 405
11 False 391
12 False 484
13 False 359
14 False 367
15 True 305
16 False 359
17 True 479
18 False 436
19 False 333
>>> df.groupby('inControl').sum()
            rts
inControl
False      5128
True       2783
Upvotes: 0
Reputation: 40773
Given:
inControl = ["False", "False", "True", "True","False", "True", "False", "True", "True", "False", "False", "False", "False", "False", "False", "True", "False", "True", "False", "False"]
rts = [379, 396, 480, 443, 365, 280, 487, 446, 350, 367, 405, 391, 484, 359, 367, 305, 359, 479, 436, 333]
I assume that inControl is a list of strings, not a list of booleans. For my solution to work, I am going to convert inControl to a list of booleans:
inControl = [element == 'True' for element in inControl] # ==> [False, False, ...]
Use itertools.compress to calculate the sum of all True elements:
import itertools
true_sum = sum(itertools.compress(rts, inControl)) # 2783
Now we can calculate false_sum:
grand_sum = sum(rts) # 7911
false_sum = grand_sum - true_sum # 5128
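The steps above can be wrapped up in one function that leaves the original list of strings untouched. This is only a sketch: the name sums_by_flag is made up, and it assumes every flag is either the string 'True' or the string 'False' (anything else would be counted as False).

```python
from itertools import compress


def sums_by_flag(values, flags):
    """Return (false_sum, true_sum); flags are 'True'/'False' strings."""
    # Build a boolean mask instead of overwriting the caller's list.
    mask = [flag == 'True' for flag in flags]
    # compress() yields only the values whose mask entry is truthy.
    true_sum = sum(compress(values, mask))
    # Everything that is not True must be False, so subtract from the total.
    false_sum = sum(values) - true_sum
    return false_sum, true_sum


# First five pairs from the question, as a small check.
print(sums_by_flag([379, 396, 480, 443, 365],
                   ["False", "False", "True", "True", "False"]))  # (1140, 923)
```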
Upvotes: 0
Reputation: 20765
Jean-François's solution will work just fine and is pretty readable. It does make two passes over the data, though. If the list is small this isn't a big deal, but if it's big you can roughly halve the time of operation by taking a single pass.
One general approach is this:
totals = {}
for flag, value in zip(inControl, rts):
    totals[flag] = totals.setdefault(flag, 0) + value
This code does not assume that inControl has only False and True. It can in fact have any number of unique values.
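For what it's worth, the same single-pass idea can be written with collections.defaultdict, which removes the need for setdefault. A minimal sketch; the function name totals_by_key and the sample keys are made up for illustration:

```python
from collections import defaultdict


def totals_by_key(keys, values):
    """Sum values per key in one pass over the paired lists."""
    totals = defaultdict(int)  # missing keys start at 0
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)  # plain dict, so later lookups can't add keys


# Works with any number of distinct keys, not just 'True' and 'False'.
print(totals_by_key(["a", "b", "a", "c"], [1, 2, 3, 4]))
# {'a': 4, 'b': 2, 'c': 4}
```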
A cuter way is to use the Counter class from the collections module. A Counter is a dictionary intended to keep track of counts. Adding two counters does the obvious thing: the values of identical keys are summed. We can create a Counter instance for each pair of elements and add up all of the counters. Note that creating a Counter for each element is probably overkill -- the above solution is more efficient. But for educational purposes, this solution looks like:
from collections import Counter
counters = (Counter({k: v}) for k, v in zip(inControl, rts))
sum(counters, Counter())
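One caveat worth knowing: adding two Counter objects with + keeps only keys whose total is positive, so zero or negative sums silently disappear. That doesn't matter for the data in the question, but if it could matter for yours, updating a single Counter in place avoids it. A small sketch using the first five pairs from the question (the names flags and values are just for this example):

```python
from collections import Counter

flags = ["False", "False", "True", "True", "False"]
values = [379, 396, 480, 443, 365]

totals = Counter()
for flag, value in zip(flags, values):
    # update() with a mapping adds to the existing counts in place.
    totals.update({flag: value})

print(totals)  # Counter({'False': 1140, 'True': 923})

# Counter addition drops non-positive totals; update() keeps them.
dropped = sum((Counter({k: v}) for k, v in [("a", -5)]), Counter())
kept = Counter()
kept.update({"a": -5})
print(dropped, kept)  # Counter() Counter({'a': -5})
```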
Upvotes: 2
Reputation: 140286
Seems weird that the booleans are strings, but...
Use zip, then sum the matching elements.
(In Python 3, re-create the zip for the second sum, because zip returns an iterator that is exhausted after one pass.)
inControl = ["False", "False", "True", "True","False", "True", "False", "True", "True", "False", "False", "False", "False", "False", "False", "True", "False", "True", "False", "False"]
rts = [379, 396, 480, 443, 365, 280, 487, 446, 350, 367, 405, 391, 484, 359, 367, 305, 359, 479, 436, 333]
z=zip(rts,inControl)
sf=sum(x[0] for x in z if x[1]=='False')
z=zip(rts,inControl)
st=sum(x[0] for x in z if x[1]=='True')
print(sf,st)
result:
5128 2783
Alternatively, st could be computed with fewer string comparisons by subtracting from the total: st=sum(rts)-sf (more additions, fewer string comparisons).
Variant: a small loop over True and False
s=dict()
for c in ['False','True']:
    z=zip(rts,inControl)
    s[c]=sum(x[0] for x in z if x[1]==c)
Upvotes: 2