Anthony O

Reputation: 653

What offers better performance for large datasets? Nested dictionaries or a dictionary of objects?

I find myself repeating this pattern when I am fetching from multiple database tables:

records = {r["p_key"]: {"record": r, "A": [], "B": [], "C": []} for r in db_records}

I often have to group data this way because I cannot do joins across databases, or because multiple queries are sometimes faster than multiple joins.

But performance-wise I am not sure if there is a lot of overhead to nesting dictionaries like this, and if I would be better served by creating an object with these attributes that becomes the value in the records dictionary. By performance I mean the overall cost in space and time when using a large set of nested dictionaries vs a dictionary of objects.

Upvotes: 1

Views: 1503

Answers (1)

monomonedula

Reputation: 644

There's basically no difference in performance between dictionaries and regular class objects, because regular objects internally use a dictionary (__dict__) to store their attributes.
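As a quick illustration of that point, here is a minimal sketch. The `Record` class is hypothetical and only mirrors the shape of the nested dictionary from the question; it is not taken from the asker's code.

```python
class Record:
    """Hypothetical container with the same fields as the question's inner dict."""

    def __init__(self, record):
        self.record = record
        self.A = []
        self.B = []
        self.C = []


rec = Record({"id": 1})  # {"id": 1} stands in for a database row

# vars() returns the instance's __dict__, the dictionary that holds its attributes.
attrs = vars(rec)
print(type(attrs).__name__)
print(sorted(attrs))
```

So a dictionary of `Record` objects is still, in effect, a dictionary of dictionaries, with a small extra layer for the object itself.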

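However, you should consider using classes with __slots__, which replace the per-instance dictionary with fixed attribute storage and so reduce memory use per object. Here is a detailed explanation of what it is and how it performs.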
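A rough sketch of the difference, assuming CPython. The class names `PlainRecord` and `SlottedRecord` are made up for illustration, and `sys.getsizeof` gives only a shallow, interpreter-specific estimate, so treat the printed numbers as indicative rather than as a benchmark.

```python
import sys


class PlainRecord:
    """Regular class: attributes are stored in a per-instance __dict__."""

    def __init__(self, record):
        self.record = record
        self.A = []
        self.B = []
        self.C = []


class SlottedRecord:
    """Same fields, but declared in __slots__, so no per-instance __dict__."""

    __slots__ = ("record", "A", "B", "C")

    def __init__(self, record):
        self.record = record
        self.A = []
        self.B = []
        self.C = []


plain = PlainRecord({"id": 1})
slotted = SlottedRecord({"id": 1})

# Shallow sizes only: the object itself, plus its attribute dict where one exists.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print("plain object + __dict__:", plain_size, "bytes")
print("slotted object:", slotted_size, "bytes")

# The trade-off: a slotted instance rejects attributes not listed in __slots__.
try:
    slotted.D = []
except AttributeError:
    print("cannot add attribute D to a slotted instance")
```

The saving is per object, so it only starts to matter when the records dictionary holds a very large number of entries; the lists stored in A, B, and C cost the same either way.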

Another option is to use the pandas library to work with large datasets.

Upvotes: 2
