amaatouq

Reputation: 2337

Efficient way to search a list of dictionaries in Python

I have a list of dictionaries that looks like this:

  x =[{u'id': 813, u'overlap': 1.0},
      {u'id': 811, u'overlap': 0.002175817439439302},
      {u'id': 812, u'overlap': 0.012271742728263339},
      {u'id': 814, u'overlap': 0.9182077233294997},
      {u'id': 815, u'overlap': 0.8866809411969082},
      {u'id': 117, u'overlap': 0.9173796235219325},
      {u'id': 816, u'overlap': 0.9460961805523018},
      {u'id': 116, u'overlap': 0.2038311249297872},
      {u'id': 817, u'overlap': 0.7302323133830623},
      {u'id': 818, u'overlap': 0.81532953091762},
      {u'id': 819, u'overlap': 0.2817392052504116},
      {u'id': 820, u'overlap': 0.7905202140586483},
      {u'id': 821, u'overlap': 0.8497466449368322},
      {u'id': 822, u'overlap': 0.8503886863531487},
      {u'id': 823, u'overlap': 1.0}]

I want to find, for example, the overlap for id == 820, which is 0.7905202140586483.

How can I do this efficiently and elegantly in very few lines of Python code? (I will loop over millions of such lists.)

Upvotes: 3

Views: 1698

Answers (4)

knbk

Reputation: 53699

Efficiency depends on the situation. It's worth noting that converting the list to a dict doesn't come without a cost. If you use almost all the items, convert it to a dict as suggested. If you only ever use a few items in the list, this will be more efficient:

d = {v['id']: v['overlap'] for v in x if v['id'] in (820, 811, 117)}

A small test (with the list in your question) shows that this gives a ~33% decrease in time usage if you're looking for just 2/15 items. At more than 5-6 of the 15 items it was no longer faster.

You'll have to test yourself how this scales to larger lists (you can use timeit.timeit for that). If you are able to create a dict instead of a list, go for it. Otherwise, if this is a performance-critical part of your application, do some tests and see what works best for your situation.
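A rough sketch of how such a comparison could be set up with timeit.timeit. The wanted tuple and the function names are made up for illustration, the u'' prefixes are dropped, and the timings will differ by machine and Python version, so treat the numbers as relative only.

```python
import timeit

# The list from the question, with plain string keys.
x = [
    {'id': 813, 'overlap': 1.0},
    {'id': 811, 'overlap': 0.002175817439439302},
    {'id': 812, 'overlap': 0.012271742728263339},
    {'id': 814, 'overlap': 0.9182077233294997},
    {'id': 815, 'overlap': 0.8866809411969082},
    {'id': 117, 'overlap': 0.9173796235219325},
    {'id': 816, 'overlap': 0.9460961805523018},
    {'id': 116, 'overlap': 0.2038311249297872},
    {'id': 817, 'overlap': 0.7302323133830623},
    {'id': 818, 'overlap': 0.81532953091762},
    {'id': 819, 'overlap': 0.2817392052504116},
    {'id': 820, 'overlap': 0.7905202140586483},
    {'id': 821, 'overlap': 0.8497466449368322},
    {'id': 822, 'overlap': 0.8503886863531487},
    {'id': 823, 'overlap': 1.0},
]

# Hypothetical: the only ids this pass over the list cares about.
wanted = (820, 811)


def full_dict():
    # Index every entry, then look up the wanted ids.
    d = {v['id']: v['overlap'] for v in x}
    return [d[i] for i in wanted]


def filtered_dict():
    # Index only the wanted entries while scanning the list once.
    d = {v['id']: v['overlap'] for v in x if v['id'] in wanted}
    return [d[i] for i in wanted]


def time_both(number=100000):
    # Run each candidate `number` times and report total seconds per candidate.
    return {fn.__name__: timeit.timeit(fn, number=number)
            for fn in (full_dict, filtered_dict)}


for name, seconds in sorted(time_both().items()):
    print(name, seconds)
```

Changing wanted (more ids, fewer ids) and the length of x is the quickest way to see where the break-even point falls for your data.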

Upvotes: 3

Zachary Cross

Reputation: 2318

Because each dictionary only has two values (an 'id' and an 'overlap'), I would suggest that you try converting the whole thing into one large dictionary, and then go from there. Something like:

x_dict = {entry['id']: entry['overlap'] for entry in x}

Then you can get the value you want with a call to .get():

x_dict.get(820)
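One thing to keep in mind with .get() is what happens when the id is missing. A small sketch, using a shortened copy of the list and a made-up id 999 that is not in it:

```python
# Shortened copy of the list from the question.
x = [
    {'id': 813, 'overlap': 1.0},
    {'id': 820, 'overlap': 0.7905202140586483},
]

x_dict = {entry['id']: entry['overlap'] for entry in x}

print(x_dict.get(820))       # 0.7905202140586483
print(x_dict.get(999))       # None, because 999 is not a key
print(x_dict.get(999, 0.0))  # 0.0, the default passed as the second argument
```

If a missing id should be treated as an error instead, x_dict[999] raises KeyError rather than returning None.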

Upvotes: 4

Marcin

Reputation: 238507

You can do this with a dictionary comprehension:

a_dict = {v['id']: v['overlap'] for v in x}

This results in:

for id,overlap in a_dict.items():
    print(id, overlap)

# output

811 0.002175817439439302
812 0.012271742728263339
813 1.0
814 0.9182077233294997
815 0.8866809411969082
816 0.9460961805523018
817 0.7302323133830623
818 0.81532953091762
819 0.2817392052504116
116 0.2038311249297872
117 0.9173796235219325
822 0.8503886863531487
823 1.0
820 0.7905202140586483
821 0.8497466449368322
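Note that the order in the output above is simply the order in which that dict happened to iterate. Python 3.7 and later iterate dicts in insertion order, while older versions make no such promise. If a predictable order matters, sorting the keys is one option. A small sketch on a shortened copy of the list:

```python
# Shortened copy of the list from the question.
x = [
    {'id': 813, 'overlap': 1.0},
    {'id': 811, 'overlap': 0.002175817439439302},
    {'id': 820, 'overlap': 0.7905202140586483},
    {'id': 116, 'overlap': 0.2038311249297872},
]

a_dict = {v['id']: v['overlap'] for v in x}

# Iterating over sorted(a_dict) visits the ids in ascending order,
# regardless of how the dict itself is ordered.
for entry_id in sorted(a_dict):
    print(entry_id, a_dict[entry_id])
```

For plain lookups by id the iteration order makes no difference.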

Upvotes: 0

Joran Beasley

Reputation: 114038

x2 = {d["id"]: d["overlap"] for d in x}
print(x2[820])

As mentioned in the comments, use a dict.

Alternatively, query Mongo directly for id=820 (I'm not sure offhand how; I've only used Mongo a handful of times).

Upvotes: 3
