Reputation: 123
I have a file of size 500MB. If I store each line of that file in a dictionary set up like this:
file = "my_file.csv"
with open(file) as f:
for l in f:
delimiter = ','
line = l.split(delimiter)
hash_key = delimiter.join(line[:4])
store_line = delimiter.join(line[4:])
store_dict[hash_key] = store_line
To check my memory usage, I watched htop while running my program, first with the code above, then with the last line switched to
print(hash_key + ":" + store_line)
and that version took < 100MB of memory. With the dictionary, my store_dict ends up at approximately 1.5GB in memory. I have checked for memory leaks and can't find any. Removing the line store_dict[hash_key] = store_line results in the program taking < 100MB of memory. Why does the dictionary take up so much memory? Is there any way to store the lines in a dictionary without it taking up so much space?
Upvotes: 0
Views: 519
Reputation: 13100
Even if the store_line strs each took up the same amount of memory as the corresponding piece of text in the file on disk (which they probably don't, especially if you are using Python 3, where strs default to Unicode), the dict necessarily takes up way more space than your file. The dict does not contain only the bare text, but a lot of Python objects.
Each dict key and value is a str, and each of those carries not just the text but also its own length, hash, and reference count used for garbage collection. The dict itself also needs to store metadata about its items, such as the hash of each key and a pointer to each value.
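For a rough sense of that per-object overhead, here is a minimal sketch using sys.getsizeof on strings shaped like the ones your loop builds (the sample values are made up, and the exact byte counts vary by Python version and platform):

import sys

# Hypothetical sample data resembling one parsed CSV line.
hash_key = "a,b,c,d"
store_line = "e,f,g,h,i,j\n"

# Every str object carries a header (reference count, type pointer, length,
# cached hash, flags) on top of the character data, so even tiny strings
# cost tens of bytes each.
print(sys.getsizeof(hash_key))    # e.g. ~56 bytes on 64-bit CPython 3.x
print(sys.getsizeof(store_line))  # e.g. ~61 bytes

# The dict adds its own hash-table storage (key hashes plus key and value
# pointers) on top of the objects it references; getsizeof reports only
# the table, not the strings it points to.
d = {hash_key: store_line}
print(sys.getsizeof(d))           # e.g. ~180-230 bytes for a one-entry dict

Multiply that fixed per-string and per-slot cost by millions of short lines and the 500MB of raw text easily grows into the 1.5GB you are seeing.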
If you had a few very long lines in the file, then you should expect the Python representation to have comparable memory consumption. That is, if you are sure that the file uses the same encoding as Python...
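To illustrate why line length matters, here is a small sketch (again with made-up data) comparing the same 10 million characters stored as one long str versus a million short ones; the numbers are approximate and implementation-dependent:

import sys

# One long 10 MB str: a single fixed-size header, negligible next to the payload.
one_big = "x" * 10_000_000
print(sys.getsizeof(one_big))                     # ~10,000,049 bytes

# The same amount of text as 1,000,000 distinct 10-character strings:
# each piece pays its own ~49-byte header, nearly sextupling the total.
many_small = [str(i).zfill(10) for i in range(1_000_000)]
print(sum(sys.getsizeof(s) for s in many_small))  # ~59,000,000 bytes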
Upvotes: 2