C. Refsgaard

Reputation: 227

Writing dictionary of dataframes to file

I have a dictionary, and for each key in my dictionary, I have one pandas dataframe. The dataframes from key to key are of unequal length.

It takes some time to get to the dataframes that are connected to each key, and therefore I wish to save my dictionary of dataframes to a file, so I can just read the file into Python instead of running my script every time I open Python.

My question is: how would you suggest writing the dictionary of dataframes to a file, and reading it back in again? I have tried the following, where dictex is the dictionary:

w = csv.writer(open("output.csv", "w"))
for key, val in dictex.items():
    w.writerow([key, val])

But I am not really sure if I get what I want, as I struggle to read the file into Python again.

Thank you for your time.

Upvotes: 9

Views: 13087

Answers (3)

Achraf BELLA

Reputation: 21

I added a few parameters to the saver and loader functions from the other answer: an output folder to write into, a type_ label in the file names so that you can save more than one set of dataframes to the same folder, and index=False so that no unwanted index column is written when each dataframe is saved.

import ast

import pandas as pd

def saver(dictex, type_, output):
    for key, val in dictex.items():
        # index=False keeps the row index out of the CSV file
        val.to_csv("{}/data_{}_{}.csv".format(output, str(key), str(type_)), index=False)

    with open(f"{output}/keys_{type_}.txt", "w") as f: #saving keys to file
        f.write(str(list(dictex.keys())))

def loader(type_, output='your_file'):
    """Reading data from keys"""
    with open(f"{output}/keys_{type_}.txt", "r") as f:
        # literal_eval parses the saved list without executing arbitrary code
        keys = ast.literal_eval(f.read())

    dictex = {}
    for key in keys:
        dictex[key] = pd.read_csv("{}/data_{}_{}.csv".format(output, str(key), str(type_)))

    return dictex

Upvotes: 0

TheProletariat

Reputation: 1056

You can use pickle to save a dictionary of dataframes in Python.

import pickle

import pandas as pd

df1 = pd.DataFrame(data={'a':[1,2,3], 'b':[4,5,6]})
df2 = pd.DataFrame(data={'a':[5,5,5,5,5], 'b':[5,5,5,5,5]})

d = {}
d['df1'] = df1
d['df2'] = df2

with open('dict_of_dfs.pickle', 'wb') as f:
    pickle.dump(d, f)
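
The question also asks how to read the file in again. With pickle, that is a matter of opening the file in binary read mode and calling pickle.load. Below is a minimal round-trip sketch; the names frames and restored and the temporary directory are only illustrative, not part of the original answer. Only unpickle files you trust, since loading a pickle can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd

# Small stand-in for the dictionary of dataframes (unequal lengths are fine).
frames = {
    "df1": pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}),
    "df2": pd.DataFrame({"a": [5] * 5, "b": [5] * 5}),
}

# A temporary directory keeps the sketch self-contained; use a real path in practice.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dict_of_dfs.pickle"

    # Save the whole dictionary in one file.
    with open(path, "wb") as f:
        pickle.dump(frames, f)

    # Read it back: keys, dtypes and indexes come back as they were saved.
    with open(path, "rb") as f:
        restored = pickle.load(f)
```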

Upvotes: 6

artona

Reputation: 1272

If you want to save each dataframe separately and avoid SQL (or another database format), the following code could work:

import ast

import pandas as pd

def saver(dictex):
    for key, val in dictex.items():
        val.to_csv("data_{}.csv".format(str(key)))

    with open("keys.txt", "w") as f: #saving keys to file
        f.write(str(list(dictex.keys())))

def loader():
    """Reading data from keys"""
    with open("keys.txt", "r") as f:
        # literal_eval parses the saved list without executing arbitrary code
        keys = ast.literal_eval(f.read())

    dictex = {}
    for key in keys:
        # index_col=0 restores the saved index instead of adding an "Unnamed: 0" column
        dictex[key] = pd.read_csv("data_{}.csv".format(str(key)), index_col=0)

    return dictex

(...)

dictex = loader()
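
A possible variation on the one-CSV-per-key idea, sketched under the assumption that the dictionary keys are strings: write everything into one directory and record the key-to-file mapping in a JSON manifest, so the keys never have to be parsed back from Python source text. The names save_frames, load_frames and keys.json are hypothetical and chosen for illustration only. The row index is not stored here, so this fits dataframes with a default index.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_frames(frames, directory):
    """Write each dataframe to its own CSV and record the keys in a JSON manifest."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for i, (key, df) in enumerate(frames.items()):
        # Numbered file names avoid problems with keys that are not valid file names.
        filename = f"data_{i}.csv"
        df.to_csv(directory / filename, index=False)
        manifest[key] = filename
    with open(directory / "keys.json", "w") as f:
        json.dump(manifest, f)


def load_frames(directory):
    """Rebuild the dictionary from the manifest written by save_frames."""
    directory = Path(directory)
    with open(directory / "keys.json") as f:
        manifest = json.load(f)
    return {key: pd.read_csv(directory / name) for key, name in manifest.items()}


# Round trip through a temporary directory; use a real folder in practice.
frames = {
    "df1": pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}),
    "df2": pd.DataFrame({"a": [5] * 5, "b": [5] * 5}),
}
with tempfile.TemporaryDirectory() as tmp:
    save_frames(frames, tmp)
    restored = load_frames(tmp)
```

Unlike pickle, the CSV files stay readable in other tools, at the cost of losing details such as custom indexes.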

Upvotes: 2
