Reputation: 435
I have finished my data manipulation using pandas (in Python). Depending on my starting data set, I end up with a series of data frames, for example sampleA, sampleB, and sampleC. I want to automate saving these data sets (there can be a lot of them), each with a unique identifier in the file name. So I put the data frames in a collection and use a loop to save them, but I cannot make the loop produce a unique name each time. For example:
import numpy as np
import pandas as pd
sampleA= pd.DataFrame(np.random.randn(10, 4))
sampleB= pd.DataFrame(np.random.randn(10, 4))
sampleC= pd.DataFrame(np.random.randn(10, 4))
allsamples=(sampleA, sampleB, sampleC)
for x in allsamples:
    #name = allsamples[x]
    #x.to_csv(name + '.dat', sep=',', header = False, index = False)
    x.to_csv(x + '.dat', sep=',', header = False, index = False)
When I use the above (without the commented lines), all the data are saved as x.dat, and I keep only the latest data set; if I use the name line instead, I get errors. Any idea how I can come up with a naming approach so that I save three files named sampleA.dat, sampleB.dat, and sampleC.dat?
Upvotes: 2
Views: 847
Reputation: 879133
If you use strings, then you can look up the variable of the same name using vars():
allsamples = ('sampleA', 'sampleB', 'sampleC')
for name in allsamples:
    df = vars()[name]
    df.to_csv(name + '.dat', sep=',', header=False, index=False)
Without an argument, vars() is equivalent to locals(). It returns a "read-only" dict mapping local variable names to their associated values. (The dict is "read-only" in the sense that it is mainly useful for looking up the value of local variables. Like any dict, it is modifiable, but modifying the dict will not modify the variable.)
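To illustrate the "read-only" behaviour described above, here is a minimal sketch. The function name lookup_and_modify and the values used are hypothetical, chosen only for the demonstration; it runs inside a function so that vars() refers to a local scope.

```python
def lookup_and_modify():
    """Look up a local variable through vars(), then try to change it via the dict."""
    sampleA = 1                       # stand-in value; any object would do

    namespace = vars()                # same as locals() when called without an argument
    looked_up = namespace['sampleA']  # lookup by name works

    namespace['sampleA'] = 99         # the dict accepts the assignment...
    return looked_up, sampleA         # ...but the local variable is unchanged


result = lookup_and_modify()
```

Here result holds the looked-up value and the value of the variable after the attempted change. Note that this applies to function scope; at module level vars() returns the global namespace, where assigning to the dict does rebind the name.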
Upvotes: 2
Reputation: 48297
Be aware that the items of a Python tuple have no names. Moreover, allsamples[x] is meaningless here: x is a DataFrame, and a tuple can only be indexed with an integer or a slice.
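A small sketch of both points, assuming the same kind of random data frames as in the question (only two are used here to keep it short):

```python
import numpy as np
import pandas as pd

sampleA = pd.DataFrame(np.random.randn(10, 4))
sampleB = pd.DataFrame(np.random.randn(10, 4))
allsamples = (sampleA, sampleB)

# The tuple stores the objects only; the variable names are not kept anywhere in it.
first = allsamples[0]
same_object = first is sampleA

# Indexing the tuple with a DataFrame instead of an integer raises TypeError.
try:
    allsamples[sampleA]
    index_error = None
except TypeError as exc:
    index_error = type(exc).__name__
```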
You can use a dictionary instead of a tuple to name and store the data frames at the same time:
all_samples = {'sampleA':sampleA, 'sampleB':sampleB, 'sampleC':sampleC}
for name, df in all_samples.items():
    df.to_csv('{}.dat'.format(name), sep=',', header=False, index=False)
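For a self-contained version of the dictionary approach, something like the following could work. The helper save_named_frames, its suffix parameter, and the use of a temporary directory are assumptions made for this sketch, not part of the original answer.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def save_named_frames(frames, out_dir, suffix=".dat"):
    """Write each DataFrame in `frames` to <out_dir>/<name><suffix>; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, df in frames.items():
        path = out_dir / f"{name}{suffix}"
        # Same options as in the question: comma-separated, no header, no index.
        df.to_csv(path, sep=",", header=False, index=False)
        written.append(path)
    return written


# Build the name -> DataFrame mapping up front, so the names never get lost.
all_samples = {
    name: pd.DataFrame(np.random.randn(10, 4))
    for name in ("sampleA", "sampleB", "sampleC")
}

# A temporary directory is used here only so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    paths = save_named_frames(all_samples, tmp)
    saved_names = sorted(p.name for p in paths)
    shape_back = pd.read_csv(paths[0], header=None).shape
```

Creating the data frames directly inside the dictionary avoids having separate sampleA, sampleB, sampleC variables at all, which scales better when there are many data sets.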
Upvotes: 2