Dimitris

Reputation: 435

Use the variable names of pandas DataFrames in a list as file names

I have finished manipulating my data with pandas (in Python). Depending on my starting data set, I end up with a series of DataFrames, for example sampleA, sampleB, and sampleC. I want to automate saving these datasets (there can be a lot of them), each with a unique identifier in the file name. So I put the DataFrames in a collection and loop over it to save the data, but I cannot make the loop produce a unique name each time. For example:

import numpy as np
import pandas as pd
sampleA = pd.DataFrame(np.random.randn(10, 4))
sampleB = pd.DataFrame(np.random.randn(10, 4))
sampleC = pd.DataFrame(np.random.randn(10, 4))
allsamples = (sampleA, sampleB, sampleC)
for x in allsamples:
    #name = allsamples[x]
    #x.to_csv(name + '.dat', sep=',', header = False, index = False)
    x.to_csv(x + '.dat', sep=',', header = False, index = False)

When I run the above (without the commented lines), all the data is saved as x.dat, so I keep only the latest dataset; if I use the commented name line instead, I get errors. Any idea how I can come up with a naming approach so that I save 3 files named sampleA.dat, sampleB.dat, and sampleC.dat?

Upvotes: 2

Views: 847

Answers (2)

unutbu

Reputation: 879133

If you use strings, then you can look up the variable of the same name using vars():

allsamples = ('sampleA', 'sampleB', 'sampleC')
for name in allsamples:
    df = vars()[name]
    df.to_csv(name + '.dat', sep=',', header=False, index=False)

Without an argument, vars() is equivalent to locals(). It returns a "read-only" dict mapping local variable names to their associated values. (The dict is "read-only" in the sense that it is mainly useful for looking up the values of local variables. Like any dict, it is modifiable, but modifying the dict will not modify the variable.)
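As a rough illustration of both points, here is a small sketch that runs inside functions, where the names are true local variables. The function names lookup_by_name and try_to_modify are made up for this example, and plain lists stand in for the DataFrames so that it runs without pandas:

```python
def lookup_by_name():
    # Plain lists stand in for DataFrames here (an assumption for brevity).
    sampleA = [1, 2, 3]
    sampleB = [4, 5, 6]
    found = {}
    for name in ('sampleA', 'sampleB'):
        # vars() with no argument behaves like locals() at this point.
        found[name] = vars()[name]
    return found


def try_to_modify():
    value = 1
    # Writing to the dict returned by vars() does not rebind the local variable.
    vars()['value'] = 99
    return value


print(lookup_by_name())
print(try_to_modify())
```

Note that the lookup has to happen in the same scope where the variables were defined; calling vars() inside a helper function would only see that helper's own locals.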

Upvotes: 2

alko

Reputation: 48297

Be aware that Python tuple items have no names. Moreover, allsamples[x] is meaningless: you are indexing a tuple with a DataFrame, so what do you expect to get?
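To make the indexing problem concrete, here is a minimal sketch. The Frame class is a hypothetical stand-in for a DataFrame (so the snippet runs without pandas), and index_with_item is a helper invented for this example:

```python
class Frame:
    """Hypothetical stand-in for a pandas DataFrame."""


sampleA, sampleB = Frame(), Frame()
allsamples = (sampleA, sampleB)


def index_with_item(items, item):
    # A tuple can only be indexed by integers or slices, so this fails.
    try:
        return items[item]
    except TypeError as exc:
        return 'TypeError: {}'.format(exc)


message = index_with_item(allsamples, sampleA)
print(message)

# The tuple only knows positions, not the variable names that were used.
positions = [i for i, _ in enumerate(allsamples)]
print(positions)
```

The tuple holds references to the objects only; the names sampleA and sampleB are not stored anywhere inside it.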

One can use a dictionary instead of a tuple to name and store the variables simultaneously:

all_samples = {'sampleA': sampleA, 'sampleB': sampleB, 'sampleC': sampleC}
for name, df in all_samples.items():
    df.to_csv('{}.dat'.format(name), sep=',', header=False, index=False)
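If there are many datasets, the same loop could be wrapped in a small helper. The following is only a sketch: save_frames, its directory argument, and the suffix parameter are names made up for this example, and it assumes every value in the dict has a to_csv method that accepts a path, as a DataFrame does:

```python
from pathlib import Path


def save_frames(frames, directory, suffix='.dat'):
    """Write each named frame to <directory>/<name><suffix>; return the paths."""
    out_dir = Path(directory)
    # Create the target directory if it does not exist yet.
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, df in frames.items():
        path = out_dir / '{}{}'.format(name, suffix)
        # Same to_csv options as in the loop above.
        df.to_csv(path, sep=',', header=False, index=False)
        written.append(path)
    return written
```

With the dictionary above, save_frames(all_samples, 'output') would be expected to write output/sampleA.dat, output/sampleB.dat, and output/sampleC.dat.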

Upvotes: 2
