Reputation: 2155
I have several data frames and need to do the same thing to all of them.
I'm currently doing this:
df1=df1.reindex(newindex)
df2=df2.reindex(newindex)
df3=df3.reindex(newindex)
df4=df4.reindex(newindex)
Is there a neater way of doing this?
Maybe something like
df=[df1,df2,df3,df4]
for d in df:
    d=d.reindex(newindex)
Upvotes: 1
Views: 1101
Reputation: 3790
If you have a lot of large dataframes, you can use multiple threads. I suggest using the pathos module (can be installed using pip install pathos):
from pathos.multiprocessing import ThreadPool
# create a thread pool with the max number of threads
tPool = ThreadPool()
# apply the same function to each df
# the function applies to your list of dataframes
dfs = [df1, df2, df3, df4]
newDFs = tPool.map(lambda df: df.reindex(newindex), dfs)
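If you would rather not install an extra package, the same idea can be sketched with the standard library's concurrent.futures. This is only an illustration: the sample frames and the helper name reindex_one are made up here, and whether threads actually speed up reindex on your data is something you would need to time yourself.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical sample data standing in for df1..df4
base = pd.DataFrame({'a': [4, 5, 6]})
dfs = [base * 10, base + 10, base - 10, base / 10]
newindex = [2, 1, 0]

def reindex_one(df):
    # reindex returns a new DataFrame; the input frame is left untouched
    return df.reindex(newindex)

with ThreadPoolExecutor() as pool:
    # Executor.map yields results in the same order as the input list
    new_dfs = list(pool.map(reindex_one, dfs))
```

As with the pathos version, the results come back as a new list, so you still have to assign them somewhere.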
Upvotes: 1
Reputation: 862641
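Yes, your idea is close, but the loop only rebinds the name d on each pass and leaves the original DataFrames unchanged. You need to assign the results to a new list of DataFrames, for example with a list comprehension: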
dfs = [df1,df2,df3,df4]
dfs_new = [d.reindex(newindex) for d in dfs]
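To see why the plain loop from the question has no effect, here is a small sketch with made-up sample frames. The names unchanged and changed are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [4, 5, 6]})
df2 = df1 + 10
newindex = [2, 1, 0]
dfs = [df1, df2]

for d in dfs:
    # This rebinds the loop variable only; dfs and df1 are not modified
    d = d.reindex(newindex)

unchanged = list(dfs[0].index)   # index is still in the original order

# The comprehension collects the new DataFrames that reindex returns
dfs_new = [d.reindex(newindex) for d in dfs]
changed = list(dfs_new[0].index)  # index now follows newindex
```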
A nice alternative, suggested by @Joe Halliwell (thank you), is to unpack the results back into the original names:
df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]
Or, as @roganjosh suggests, you can create a dictionary of DataFrames:
dfs = [df1,df2,df3,df4]
names = ['a','b','c','d']
dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}
And then select each DataFrame by key:
print (dfs_new_dict['a'])
Sample:
import pandas as pd

df = pd.DataFrame({'a':[4,5,6]})
df1 = df * 10
df2 = df + 10
df3 = df - 10
df4 = df / 10
dfs = [df1,df2,df3,df4]
print (dfs)
[    a
0  40
1  50
2  60,     a
0  14
1  15
2  16,    a
0 -6
1 -5
2 -4,      a
0  0.4
1  0.5
2  0.6]
newindex = [2,1,0]
df1, df2, df3, df4 = [d.reindex(newindex) for d in dfs]
print (df1)
print (df2)
print (df3)
print (df4)
    a
2  60
1  50
0  40
    a
2  16
1  15
0  14
   a
2 -4
1 -5
0 -6
     a
2  0.6
1  0.5
0  0.4
Or:
newindex = [2,1,0]
names = ['a','b','c','d']
dfs_new_dict = {name: d.reindex(newindex) for name, d in zip(names, dfs)}
print (dfs_new_dict['a'])
print (dfs_new_dict['b'])
print (dfs_new_dict['c'])
print (dfs_new_dict['d'])
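If the DataFrames are kept in a dictionary from the start, the separate names and the zip step are not needed. A minimal sketch, assuming made-up sample data and a hypothetical starting dictionary called dfs_dict:

```python
import pandas as pd

# Hypothetical starting point: the frames already live in a dictionary
base = pd.DataFrame({'a': [4, 5, 6]})
dfs_dict = {'a': base * 10, 'b': base + 10, 'c': base - 10, 'd': base / 10}
newindex = [2, 1, 0]

# Build a new dictionary with the same keys and reindexed frames
dfs_new_dict = {name: d.reindex(newindex) for name, d in dfs_dict.items()}

# Loop over every frame instead of printing each key by hand
for name, d in dfs_new_dict.items():
    print(name)
    print(d)
```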
Upvotes: 2