chengtah

Reputation: 11

Writing dictionary of dataframes to a single file

I am trying to capture multiple years of daily updated 2-D tables. I can download them into a dictionary of dataframes, and I want to write that dictionary to a CSV file so I do not have to download the data every time.

import csv
import pandas as pd 

def saver(dictex):
    for key, val in dictex.items():
        val.to_csv("data_{}.csv".format(str(key)))

    with open("keys.txt", "w") as f: #saving keys to file
        f.write(str(list(dictex.keys())))

def loader():
    """Reading data from keys"""
    with open("keys.txt", "r") as f:
        keys = eval(f.read())
    dictex = {}    
    for key in keys:
        dictex[key] = pd.read_csv("data_{}.csv".format(str(key)))

    return dictex

dictex = loader()

This saves each key and its dataframe to a separate file. My next step is to put all the data in one file.

I tried the following method, but it seems to work only with a one-dimensional dictionary. Reading the file back fails with the following error message.

"ValueError: dictionary update sequence element #1 has length 0; 2 is required"

with open('datadict.csv', 'w', encoding='utf-8-sig') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in data.items():
        writer.writerow([key, value])
with open('datadict.csv', encoding='utf-8-sig') as csv_file:
    reader = csv.reader(csv_file)
    mydict = dict(reader)

Here is a hand-made data set similar to what I am working with. I would like to write dictdf to a CSV file and read it back with the same structure.

import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2000', periods=8)
df1 = pd.DataFrame(np.random.randn(8, 4),
                   index=dates, columns=['A', 'B', 'C', 'D'])

dates2 = pd.date_range('1/1/2000', periods=8)
df2 = pd.DataFrame(np.random.randn(8, 4),
                   index=dates2, columns=['A', 'B', 'C', 'D'])

dictdf={}
dictdf['xxset']=df1
dictdf['yyset']=df2

Thanks for your attention.

Upvotes: 0

Views: 1498

Answers (1)

Niels Henkens

Reputation: 2696

I don't know the exact structure of your keys.txt or your CSV files, but based on your code, I'd suggest something like this to join all the CSV files into one DataFrame.

import pandas as pd

# Reading data from keys
with open("keys.txt", "r") as f:
    keys = eval(f.read())
list_of_dfs = []

# Read in all csv files and append to list
for key in keys:
    list_of_dfs.append(pd.read_csv("data_{}.csv".format(str(key)))) # based on your example

# Join all dataframes into 1 big one
big_df = pd.concat(list_of_dfs)
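
If the goal is a single CSV that can be split back into the original dictionary, one possible approach is to let pd.concat use the dictionary keys as an outer index level and then split on that level after reading the file back. The following is only a minimal sketch: it assumes every dataframe has the same columns (as in the question's dictdf example), the names save_dict_to_csv and load_dict_from_csv and the file name datadict.csv are made up for illustration, and parse_dates=[1] assumes the inner index holds dates.

```python
import numpy as np
import pandas as pd


def save_dict_to_csv(dictdf, path):
    """Write a dict of dataframes to one CSV (hypothetical helper)."""
    # Passing a dict to pd.concat stacks the frames and adds the dict keys
    # as the outer level of a MultiIndex.
    combined = pd.concat(dictdf, names=["key"])
    combined.to_csv(path)


def load_dict_from_csv(path):
    """Read the CSV back and rebuild the dict (hypothetical helper)."""
    # Columns 0 and 1 hold the dict key and the original index.
    # parse_dates=[1] assumes the original index contained dates.
    combined = pd.read_csv(path, index_col=[0, 1], parse_dates=[1])
    # Split on the outer level, then drop it to restore each frame's index.
    return {key: frame.droplevel(0)
            for key, frame in combined.groupby(level=0)}


# Sample data shaped like the example in the question.
dates = pd.date_range('1/1/2000', periods=8)
dictdf = {
    'xxset': pd.DataFrame(np.random.randn(8, 4),
                          index=dates, columns=['A', 'B', 'C', 'D']),
    'yyset': pd.DataFrame(np.random.randn(8, 4),
                          index=dates, columns=['A', 'B', 'C', 'D']),
}

save_dict_to_csv(dictdf, "datadict.csv")
restored = load_dict_from_csv("datadict.csv")
```

Note that a CSV round trip does not keep everything: the index frequency is lost, and dtypes are inferred again when the file is read.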

EDIT

If you want to save the dictionary of dataframes to one file, saving it as a pickle file might be a better option. See this question.
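
As a rough sketch of the pickle route, using the standard library pickle module and the sample data from the question (the file name dictdf.pkl is just an example):

```python
import pickle

import numpy as np
import pandas as pd

# Sample data shaped like the example in the question.
dates = pd.date_range('1/1/2000', periods=8)
dictdf = {
    'xxset': pd.DataFrame(np.random.randn(8, 4),
                          index=dates, columns=['A', 'B', 'C', 'D']),
    'yyset': pd.DataFrame(np.random.randn(8, 4),
                          index=dates, columns=['A', 'B', 'C', 'D']),
}

# Write the whole dictionary to a single binary file.
with open("dictdf.pkl", "wb") as f:
    pickle.dump(dictdf, f)

# Read it back; keys, indexes and dtypes are preserved.
with open("dictdf.pkl", "rb") as f:
    restored = pickle.load(f)
```

Unlike CSV, the pickle file is not human-readable, and you should only unpickle files that come from a source you trust.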

Upvotes: 1
