Cellydy

Reputation: 1339

2d-list calculations

I have two 2-dimensional lists. Each list item contains a list with a string ID and an integer. I want to subtract the integers from each other where the string ID matches.

List 1:

list1 = [['ID_001',1000],['ID_002',2000],['ID_003',3000]]

List 2:

list2 = [['ID_001',500],['ID_003',1000],['ID_002',1000]]

I want to end up with

difference = [['ID_001',500],['ID_002',1000],['ID_003',2000]]

Notice that the elements aren't necessarily in the same order in both lists. Both lists will be the same length and there is an integer corresponding to each ID in both lists.

I would also like this to be done efficiently as both lists will have thousands of records.

Upvotes: 1

Views: 90

Answers (2)

Dimitris Fasarakis Hilliard

Reputation: 160377

You could achieve this by using a list comprehension:

diff = [(i[0], abs(i[1] - j[1])) for i,j in zip(sorted(list1), sorted(list2))]

This first sorts both lists with sorted so that matching IDs end up at the same index. sorted returns new lists and leaves the originals untouched, unlike list.sort(), which sorts in place. The sorted lists are then fed to zip, which pairs up corresponding entries, for example ['ID_001', 1000] with ['ID_001', 500].

Finally:

(i[0], abs(i[1] - j[1]))

takes i[0] as the ID for each entry and computes the absolute difference of the two values with abs(i[1] - j[1]). These are added as a tuple to the final result list (note the parentheses surrounding them); use square brackets instead if you need inner lists, as in your expected output.
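To make the steps concrete, here is a minimal sketch using the lists from the question. The intermediate name pairs is only there for illustration, and the inner lists use square brackets so the result matches the format asked for.

```python
list1 = [['ID_001', 1000], ['ID_002', 2000], ['ID_003', 3000]]
list2 = [['ID_001', 500], ['ID_003', 1000], ['ID_002', 1000]]

# sorted() returns new lists ordered by ID and leaves the inputs untouched.
# zip() then pairs entries that sit at the same index in both sorted lists.
pairs = list(zip(sorted(list1), sorted(list2)))
# pairs[0] is (['ID_001', 1000], ['ID_001', 500])

# Build [ID, absolute difference] for every pair.
diff = [[a[0], abs(a[1] - b[1])] for a, b in pairs]
print(diff)  # [['ID_001', 500], ['ID_002', 1000], ['ID_003', 2000]]
```

This relies on both lists containing exactly the same set of IDs, which the question guarantees; if an ID were missing from one list, zip would silently pair up mismatched entries.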


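In general, sorted might slow you down if you have a large amount of data, since sorting takes O(n log n) time. As far as I'm aware, how much it costs in practice depends on how disorganized the data is: Python's sort runs faster on input that is already mostly ordered.

Other than that, zip creates an iterator (in Python 3), so it adds no real memory overhead of its own, although sorted does build a new copy of each list. Speed-wise, list comprehensions tend to be quite efficient and are the best option in most cases.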
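If you want to check how much the sort matters for your own data, a rough measurement along these lines may help. The data here is made up (5,000 generated IDs with random values), the names ordered, shuffled and diff_sorted are only illustrative, and the timings will vary by machine, so treat it as a sketch rather than a benchmark.

```python
import random
import timeit

# Hypothetical test data: 5,000 IDs with random integer values.
random.seed(0)
ids = ['ID_%05d' % n for n in range(5000)]
list1 = [[i, random.randint(0, 10000)] for i in ids]
ordered = [[i, random.randint(0, 10000)] for i in ids]

# Same records as `ordered`, but in random order.
shuffled = ordered[:]
random.shuffle(shuffled)

def diff_sorted(a, b):
    """Sort both lists by ID, pair them up, and take absolute differences."""
    return [(x[0], abs(x[1] - y[1])) for x, y in zip(sorted(a), sorted(b))]

# Time the same computation on already-ordered and on shuffled input.
t_ordered = timeit.timeit(lambda: diff_sorted(list1, ordered), number=20)
t_shuffled = timeit.timeit(lambda: diff_sorted(list1, shuffled), number=20)

print('already ordered: %.4f s' % t_ordered)
print('shuffled:        %.4f s' % t_shuffled)
```

Both calls return the same result; only the time spent inside sorted should differ.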

Upvotes: 2

dashiell

Reputation: 812

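from collections import defaultdict

# Maps each ID to a running value; missing keys start at 0.
diffs = defaultdict(int)
list1 = [['ID_001',1000],['ID_002',2000],['ID_003',3000]]
list2 = [['ID_001',500],['ID_003',1000],['ID_002',1000]]
# First pass: store the value from list1 under its ID.
for pair in list1:
    diffs[pair[0]] = pair[1]
# Second pass: subtract the matching value from list2 (order does not matter).
for pair in list2:
    diffs[pair[0]] -= pair[1]

# Convert back to a list of [ID, difference] pairs.
# On Python 3.7+ the pairs come out in list1's order.
differences = [[k,abs(v)] for k,v in diffs.items()]
print(differences)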

I was curious so I ran a few timeits comparing my answer to Jim's. They seem to run in about the same time. You can cut the runtime of mine in half if you're willing to accept the output as a dictionary, however.
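For reference, a dictionary-output variant could look something like the sketch below. It is one possible way to write it rather than the exact code that was timed, and it assumes every ID in list2 also appears in list1, as the question states (otherwise it raises a KeyError).

```python
list1 = [['ID_001', 1000], ['ID_002', 2000], ['ID_003', 3000]]
list2 = [['ID_001', 500], ['ID_003', 1000], ['ID_002', 1000]]

# dict() accepts a sequence of two-item lists: each becomes ID -> value.
diffs = dict(list1)

# Replace each stored value with its absolute difference from list2's value.
for key, value in list2:
    diffs[key] = abs(diffs[key] - value)

print(diffs)  # {'ID_001': 500, 'ID_002': 1000, 'ID_003': 2000}
```

Skipping the final conversion back to a list of lists is where the saving comes from, and lookups by ID are then constant time.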

His is, of course, more Pythonic, if that's important to you.

Upvotes: 2
