fred.schwartz

Reputation: 2155

Use DataFrame names in a loop in pandas

I have several data frames and need to do the same thing to all of them.

I'm currently doing this:

df1=df1.reindex(newindex)
df2=df2.reindex(newindex)
df3=df3.reindex(newindex)
df4=df4.reindex(newindex)

Is there a neater way of doing this?

Maybe something like

df=[df1,df2,df3,df4]

for d in df:
    d=d.reindex(newindex)

Upvotes: 1

Views: 1101

Answers (2)

ma3oun

Reputation: 3790

If you have a lot of large dataframes, you can use multiple threads. I suggest using the pathos module (can be installed using pip install pathos):

from pathos.multiprocessing import ThreadPool

# collect the DataFrames in a list
dfs = [df1, df2, df3, df4]

# create a thread pool with the default number of threads
tPool = ThreadPool()

# apply the same function to every DataFrame in the list;
# newindex is the index from the question
newDFs = tPool.map(lambda df: df.reindex(newindex), dfs)
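
If installing pathos is not an option, the same pattern can be sketched with the standard library's concurrent.futures. Note that this swaps in a different tool and is not the pathos API; the sample frames and the helper name reindex_one are made up for illustration. Whether threads actually speed up reindex depends on your data, so it is worth timing.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical sample data standing in for the real DataFrames
base = pd.DataFrame({'a': [4, 5, 6]})
dfs = [base * 10, base + 10, base - 10, base / 10]
newindex = [2, 1, 0]


def reindex_one(df):
    # reindex returns a new DataFrame; the input is left unchanged
    return df.reindex(newindex)


# executor.map yields results in the same order as the inputs
with ThreadPoolExecutor() as executor:
    new_dfs = list(executor.map(reindex_one, dfs))
```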

Upvotes: 1

jezrael

Reputation: 862641

Yes, your approach is nearly right; you only need to assign the results to a new list of DataFrames, using a list comprehension:

dfs = [df1,df2,df3,df4]
dfs_new = [d.reindex(newindex) for d in dfs]
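
Here is a minimal sketch of why the plain loop from the question has no visible effect, using made-up sample data: reindex returns a new DataFrame, and d = ... inside the loop only rebinds the loop variable, so neither the list nor df1, df2 change.

```python
import pandas as pd

# Hypothetical sample data, just for illustration
df1 = pd.DataFrame({'a': [4, 5, 6]})
df2 = df1 * 10
newindex = [2, 1, 0]
dfs = [df1, df2]

# Rebinding the loop variable leaves the list elements untouched
for d in dfs:
    d = d.reindex(newindex)
index_after_loop = list(dfs[0].index)

# The list comprehension keeps the reindexed copies
dfs_new = [d.reindex(newindex) for d in dfs]
index_after_comprehension = list(dfs_new[0].index)

print(index_after_loop)
print(index_after_comprehension)
```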

A nice solution with unpacking, as suggested by @Joe Halliwell (thank you), assigns the results back to the original names:

df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]

Or, as suggested by @roganjosh, it is possible to create a dictionary of DataFrames:

dfs = [df1,df2,df3,df4]
names = ['a','b','c','d']

dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}

And then select each DataFrame by key:

print (dfs_new_dict['a'])

Sample:

import pandas as pd
df = pd.DataFrame({'a':[4,5,6]})
df1 = df * 10
df2 = df  + 10
df3 = df - 10
df4 = df / 10
dfs = [df1,df2,df3,df4]
print (dfs)
[    a
0  40
1  50
2  60,     a
0  14
1  15
2  16,    a
0 -6
1 -5
2 -4,      a
0  0.4
1  0.5
2  0.6]

newindex = [2,1,0]
df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]
print (df1)
print (df2)
print (df3)
print (df4)
    a
2  60
1  50
0  40
    a
2  16
1  15
0  14
   a
2 -4
1 -5
0 -6
     a
2  0.6
1  0.5
0  0.4

Or:

newindex = [2,1,0]
names = ['a','b','c','d']
dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}

print (dfs_new_dict['a'])
print (dfs_new_dict['b'])
print (dfs_new_dict['c'])
print (dfs_new_dict['d'])
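
To avoid typing each key by hand, the dictionary can also be looped over with .items(). This is a small sketch that rebuilds the sample data above so it runs on its own:

```python
import pandas as pd

# Same sample data as above, rebuilt so the snippet is self-contained
df = pd.DataFrame({'a': [4, 5, 6]})
dfs = [df * 10, df + 10, df - 10, df / 10]
names = ['a', 'b', 'c', 'd']
newindex = [2, 1, 0]

dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}

# Iterate over key and DataFrame together
for name, d in dfs_new_dict.items():
    print(name)
    print(d)
```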

Upvotes: 2
