chowpay

Reputation: 1687

Is it possible to use a generator to load and write out a dataframe?

I have a very large data set in pandas that I want to write out to a file. Currently my method is this:

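import json

# Converts the whole dataframe to a list of dicts in memory
df2_dict = df2.to_dict('records')

filename = 'newfile.json'
# Open the file once and append one compact JSON object per line
with open(filename, 'a+') as outfile:
    for item in df2_dict:
        json.dump(item, outfile, separators=(',', ':'))
        outfile.write('\n')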

This is very memory intensive. What I would prefer to do is somehow convert one row of df2 to a dict at a time and write it out to newfile.json, instead of converting the whole table to a dict first. But I don't know if that's possible or what the best method would be.

Upvotes: 0

Views: 88

Answers (1)

karthik_ghorpade

Reputation: 374

You could use the flow_from_dataframe method of the Keras ImageDataGenerator class (https://keras.io/api/preprocessing/image/). I recently used a similar approach for an assignment. This blog post could help you get started with it: https://medium.com/@vijayabhaskar96/tutorial-on-keras-flow-from-dataframe-1fd4493d237c
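
If the goal is only to write rows out without first building the full list of dicts, a plain Python generator may be enough and needs no Keras. The following is a minimal sketch, assuming df2 is an ordinary pandas DataFrame. The helper names iter_records and write_json_lines are hypothetical, and default=str is an assumption about how values that json cannot encode (timestamps, for example) should be handled.

```python
import json

import pandas as pd


def iter_records(df: pd.DataFrame):
    """Yield one row at a time as a plain dict (hypothetical helper)."""
    columns = list(df.columns)
    # itertuples is lazy, so only one row dict exists at a time
    for row in df.itertuples(index=False, name=None):
        yield dict(zip(columns, row))


def write_json_lines(df: pd.DataFrame, filename: str) -> None:
    """Append each row of df to filename as one compact JSON object per line."""
    with open(filename, 'a+') as outfile:
        for record in iter_records(df):
            # default=str (an assumption) stringifies values json cannot encode
            json.dump(record, outfile, separators=(',', ':'), default=str)
            outfile.write('\n')
```

Calling write_json_lines(df2, 'newfile.json') should produce the same one-object-per-line layout as the loop in the question. The dataframe itself still has to fit in memory, so this only avoids the extra copy created by to_dict('records'). pandas also offers df2.to_json('newfile.json', orient='records', lines=True), which writes a similar file without any hand-written loop.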

Upvotes: 1
