Reputation: 464
Let's say I have a function that returns 1000 records from a postgres database as a list of dicts that looks like this (but much bigger):
[ {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
{"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}]
I have a process that requires around 600 individual searches on this list for the right dict based on a given unique thing_id. Rather than iterating through the entire list each time, wouldn't it be more efficient to create a dict of dicts, making the thing_id for each dict a key, like this:
{245 : {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
459 : {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}}
If so, is there a preferred way of doing this? Obviously I could build the dict by iterating through the list, but I was wondering whether there are any built-in methods for this. If not, what is the preferred way of going about it? Also, if there is a better way of repeatedly retrieving data from the same large set of records than what I am proposing here, please let me know.
UPDATE: Ended up going with dict comprehension:
data = {row["thing_id"]: row for row in rows}
where rows is the result from my db query with a psycopg2.extras.DictCursor. Building the dict is fast enough and the lookups are very fast.
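A minimal sketch of the approach in the update, with a plain list of dicts standing in for the psycopg2 rows (the sample rows and the wanted_ids list are made up for illustration). It builds the index once and then uses dict.get, so a thing_id that is not present yields None instead of raising KeyError:

```python
# Stand-in for the query result; real rows would come from the DictCursor.
rows = [
    {"thing_id": 245, "thing_title": "Thing title", "thing_url": "thing-url"},
    {"thing_id": 459, "thing_title": "Thing title II", "thing_url": "thing-url/2"},
]

# Build the index once: O(n). Assumes thing_id is unique; with duplicates,
# the last row for a given id wins.
data = {row["thing_id"]: row for row in rows}

# Hypothetical ids to look up; each lookup is O(1) on average,
# versus an O(n) scan of the list per search.
wanted_ids = [459, 245, 999]
found = [data.get(thing_id) for thing_id in wanted_ids]

print(found[0]["thing_title"])  # Thing title II
print(found[2])                 # None (999 is not in the data)
```

With roughly 1000 rows and 600 lookups, the one-off cost of building the dict should be small compared with scanning the list for every search.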
Upvotes: 0
Views: 94
Reputation: 640
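a = [ {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"}, {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}]
# In Python 3, dict views are not subscriptable, so convert to a list first.
# This picks the second value of each dict (thing_title, given insertion order).
c = [list(b.values())[1] for b in a]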
Upvotes: 0
Reputation: 46636
You can use a pandas DataFrame indexed by thing_id, which lets you look up values by row label and column name:
>>> import pandas as pd
>>> result = [
...     {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
...     {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}
... ]
>>> df = pd.DataFrame(result)
>>> df.set_index('thing_id', inplace=True)
>>> df.sort_index(inplace=True)
>>> df
             thing_title    thing_url
thing_id
245          Thing title    thing-url
459       Thing title II  thing-url/2
>>> df.loc[459, 'thing_title']
'Thing title II'
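If a whole record is needed rather than a single field, something along these lines should work. This is only a sketch built on the same sample data; the names row and by_id are illustrative and not part of the answer above:

```python
import pandas as pd

result = [
    {"thing_id": 245, "thing_title": "Thing title", "thing_url": "thing-url"},
    {"thing_id": 459, "thing_title": "Thing title II", "thing_url": "thing-url/2"},
]
df = pd.DataFrame(result).set_index("thing_id")

# Fetch one full record by its index label and turn it into a plain dict.
row = df.loc[459].to_dict()
print(row)  # {'thing_title': 'Thing title II', 'thing_url': 'thing-url/2'}

# Convert the whole frame into a dict of dicts keyed by thing_id.
by_id = df.to_dict(orient="index")
print(by_id[245]["thing_url"])  # thing-url
```

Note that once thing_id becomes the index, it is no longer one of the columns, so the resulting dicts do not contain a thing_id key.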
Upvotes: 1