Reputation: 1339
I have two 2-dimensional lists. Each list
item contains a list
with a string ID and an integer. I want to subtract the integers from each other where the string ID matches.
List 1:
list1 = [['ID_001',1000],['ID_002',2000],['ID_003',3000]]
List 2:
list2 = [['ID_001',500],['ID_003',1000],['ID_002',1000]]
I want to end up with
difference = [['ID_001',500],['ID_002',1000],['ID_003',2000]]
Notice that the elements aren't necessarily in the same order in both lists. Both lists will be the same length and there is an integer corresponding to each ID in both lists.
I would also like this to be done efficiently as both lists will have thousands of records.
Upvotes: 1
Views: 90
Reputation: 160377
You could achieve this by using a list comprehension:
diff = [(i[0], abs(i[1] - j[1])) for i,j in zip(sorted(list1), sorted(list2))]
This first sorts the lists with sorted
in order for the order to be similar (not with list.sort()
which sorts in place) and then, it creates tuples containing each entry in the lists ['ID_001', 1000], ['ID_001', 500]
by feeding the sorted lists to zip
.
Finally:
(i[0], abs(i[1] - j[1]))
returns i[0]
indicating the ID
for each entry and abs(i[1] - j[1])
computes their absolute difference. There are added as a tuple in the final list result (note the parentheses surrounding them).
In general, sorted
might slow you down if you have a large amount of data, but that depends on how disorganized the data is from what I'm aware.
Other than that, zip
creates an iterator so memory wise it doesn't affect you. Speed wise, list comps tend to be quite efficient and in most cases are your best options.
Upvotes: 2
Reputation: 812
from collections import defaultdict
diffs = defaultdict(int)
list1 = [['ID_001',1000],['ID_002',2000],['ID_003',3000]]
list2 = [['ID_001',500],['ID_003',1000],['ID_002',1000]]
for pair in list1:
diffs[pair[0]] = pair[1]
for pair in list2:
diffs[pair[0]] -= pair[1]
differences = [[k,abs(v)] for k,v in diffs.items()]
print(differences)
I was curious so I ran a few timeits comparing my answer to Jim's. They seem to run in about the same time. You can cut the runtime of mine in half if you're willing to accept the output as a dictionary, however.
His is, of course, more Pythonic, if that's important to you.
Upvotes: 2