use names of items in list of pandas

Question

I have finished manipulating my data with pandas (in Python). Depending on my starting data set, I end up with a series of data frames, for example sampleA, sampleB, and sampleC. I want to automate saving these data sets (there can be a lot of them), each with a unique identifier in the file name, so I collect the data frames in a tuple and loop over it to save the data. However, I cannot make the loop give each file a unique name. For example:

import numpy as np
import pandas as pd
sampleA= pd.DataFrame(np.random.randn(10, 4))
sampleB= pd.DataFrame(np.random.randn(10, 4))
sampleC= pd.DataFrame(np.random.randn(10, 4))
allsamples=(sampleA, sampleB, sampleC)
for x in allsamples:
    #name = allsamples[x]
    #x.to_csv(name + '.dat', sep=',', header = False, index = False)
    x.to_csv(x + '.dat', sep=',', header = False, index = False)

When I run the above (without the commented lines), all data are saved as x.dat, so I keep only the latest data set; if I use the name line instead, I get errors. Any idea how I can come up with a naming approach so that I save three files named sampleA.dat, sampleB.dat, and sampleC.dat?

unutbu · Accepted Answer

If you use strings, then you can look up the variable of the same name using vars():

allsamples = ('sampleA', 'sampleB', 'sampleC')
for name in allsamples:
    df = vars()[name]
    df.to_csv(name + '.dat', sep=',', header=False, index=False)

Without an argument, vars() is equivalent to locals(). It returns a "read-only" dict mapping local variable names to their associated values. (The dict is "read-only" in the sense that it is mainly useful for looking up the value of local variables. Like any dict, it is modifiable, but inside a function modifying the dict will not modify the variable. At module level, locals() is the same dict as globals(), so changes made there do rebind the name.)
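As an alternative to looking variables up by name, the frames can be stored under their names from the start. The following is only a minimal sketch of that idea, not part of the original answer: the `samples` dict, the `out_dir` scratch directory, and the `written` list are hypothetical names introduced for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical layout: keep each frame under its name in a dict
# instead of in three separate variables.
samples = {
    "sampleA": pd.DataFrame(np.random.randn(10, 4)),
    "sampleB": pd.DataFrame(np.random.randn(10, 4)),
    "sampleC": pd.DataFrame(np.random.randn(10, 4)),
}

# Assumption: write into a temporary directory so the sketch leaves
# the working directory untouched; use any output path you like.
out_dir = tempfile.mkdtemp()

written = []
for name, df in samples.items():
    # The dict key doubles as the unique file name.
    path = os.path.join(out_dir, name + ".dat")
    df.to_csv(path, sep=",", header=False, index=False)
    written.append(path)
```

With this layout the name and the frame travel together, so no lookup through vars() is needed and the loop also works unchanged inside a function.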
