plam

Reputation: 1375

Memory usage/efficiency for pandas dataframe versus lists versus tuples, etc.

I'm trying to create a class in Python that ends up storing some text documents along with some metadata for each of the documents. Think of a structure like this:

ID    Text                        Date       Followers
1     "This is a tweet"           10/21/14   57
2     "This is another tweet"     10/22/14   100
3     "Yet another"               10/23/14   3899 
4     "Another one"               10/25/14   234

What's the best and most memory-efficient way to store stuff like this? Is it as four different lists (for example)? Or maybe a dictionary and/or tuples? Or as a pandas DataFrame?

Are there significant differences between each one? I would like to store them as a pandas DataFrame just for ease of working with the data, but I also want to be mindful of memory usage and speed for larger datasets.
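One way to get a rough feel for the differences yourself is to measure the layouts with the standard library. The sketch below (using only `sys.getsizeof`, with a small helper to count a container's elements as well) compares a column-oriented layout of parallel lists against a row-oriented list of dicts; the exact byte counts will vary by Python version and platform:

```python
import sys

def deep_size(obj):
    """Rough recursive size: the container itself plus its immediate contents."""
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k) + deep_size(v) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        size += sum(deep_size(x) for x in obj)
    return size

texts = ["This is a tweet", "This is another tweet", "Yet another", "Another one"]
dates = ["10/21/14", "10/22/14", "10/23/14", "10/25/14"]
followers = [57, 100, 3899, 234]

# Column-oriented: parallel lists, one per field
col_size = sum(deep_size(c) for c in (texts, dates, followers))

# Row-oriented: one dict per record (the key strings are counted per row)
rows = [
    {"Text": t, "Date": d, "Followers": f}
    for t, d, f in zip(texts, dates, followers)
]
row_size = deep_size(rows)

print("columns:", col_size, "bytes; rows:", row_size, "bytes")
```

The row-oriented layout pays per-record dict overhead, so parallel lists come out noticeably smaller even at four rows; the gap grows with the number of records.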

Upvotes: 3

Views: 2953

Answers (1)

JD Long

Reputation: 60756

Your question is really too broad to answer simply. However, I can share a few thoughts.

I tend to think of my data in only three buckets as it relates to size:

  1. Fits in memory on one machine
  2. Fits on disk on one machine but not in memory
  3. Too big for one machine

We can spend forever talking about which framework or data structure we should use for each of these three buckets. However, I've found that for 90% of my analytical work the choice is simple:

  1. NumPy array or pandas DataFrame
  2. PyTables
  3. Hadoop or a distributed database
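For the first bucket, pandas can tell you directly how much RAM a frame occupies, which is usually enough to decide whether you're still in bucket 1. A minimal sketch, assuming pandas is installed; `deep=True` measures the actual Python string objects behind `object` columns rather than just the 8-byte pointers to them:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Text": ["This is a tweet", "This is another tweet",
             "Yet another", "Another one"],
    "Date": pd.to_datetime(["10/21/14", "10/22/14", "10/23/14", "10/25/14"]),
    "Followers": [57, 100, 3899, 234],
})

# Per-column memory in bytes; deep=True inspects the string payloads too
usage = df.memory_usage(deep=True)
print(usage)
print("total bytes:", usage.sum())
```

Running this on your real data (or a representative sample, then scaling up) gives a much better answer than reasoning about the structures in the abstract.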

I only look for a data structure other than the above if I have a compelling reason.

I hope that helps a bit.

Upvotes: 5
