Curious

Reputation: 133

Running a function on multiple dataframes through a for loop

I have a function that performs standard preprocessing on each dataframe. I am passing 4 dataframes to that function as a list through a for loop, but the changes made inside the function are not reflected in the actual dataframes. Why?

My code:

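import pandas as pd

def merge_preprp(x):
    # Keep only the first run of digits found in the first column
    x[x.columns[0]] = x[x.columns[0]].astype(str)
    x[x.columns[0]] = x[x.columns[0]].str.extract(r'(\d+)', expand=False)
    # Drop rows where no digits were found
    x = x[pd.notnull(x[x.columns[0]])]
    x = x[x[x.columns[0]].apply(lambda s: s.isnumeric())]
    x[x.columns[0]] = x[x.columns[0]].astype(int)
    # Sort by the first column and keep the last row of each duplicate key
    x.sort_values(x.columns[0], inplace=True)
    x.drop_duplicates(subset=x.columns[0], keep='last', inplace=True)
    return x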

# dataframes A, B, C

list1 = [A,B,C]
for i in list1:
    i = merge_preprp(i)

Upvotes: 1

Views: 255

Answers (1)

jezrael

Reputation: 862611

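Calling the function on each DataFrame in the list does not update the originals, because the function mixes in-place operations with operations that return a new object. A line such as x = x[...] makes the local name x point to a new DataFrame, so every step after it is applied to that new object and not to A, B or C. Likewise, i = merge_preprp(i) only rebinds the loop variable. You need to assign the output to a new list of DataFrames in the loop: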

list1 = [A,B,C]
out = []
for i in list1:
    out.append(merge_preprp(i))

Or with a list comprehension:

out = [merge_preprp(i) for i in list1]
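
If the goal is to have the processed frames available under the original names again, one option is to unpack the resulting list back into those names. The sketch below is only an illustration: clean_first_column is a hypothetical stand-in for merge_preprp that never modifies its input, and the sample data is made up.

```python
import pandas as pd

def clean_first_column(df):
    # Hypothetical stand-in for merge_preprp: always returns a NEW DataFrame.
    col = df.columns[0]
    digits = df[col].astype(str).str.extract(r"(\d+)", expand=False)
    out = df.assign(**{col: digits}).dropna(subset=[col])
    out = out.assign(**{col: out[col].astype(int)})
    # A stable sort keeps the original row order among equal keys,
    # so keep="last" picks the row that came last in the input.
    out = out.sort_values(col, kind="mergesort")
    return out.drop_duplicates(subset=col, keep="last")

# Made-up sample data for the illustration
A = pd.DataFrame({"id": ["a3", "b1", "x", "c3"], "v": [1, 2, 3, 4]})
B = pd.DataFrame({"id": ["7", "7", "2"], "v": [5, 6, 7]})

# Rebind the original names to the processed results
A, B = [clean_first_column(df) for df in (A, B)]

print(A)
print(B)
```

Unpacking only works when the number of names on the left matches the length of the list; for many frames, a dict keyed by name may be easier to manage.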

If your function contained only in-place operations, such as the last two lines that sort and drop duplicates, your original loop would work as you expect.
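
The difference between in-place operations and operations that return a new object can be seen in a small sketch. The function names rebinds and mutates and the column n are made up for this illustration only.

```python
import pandas as pd

def rebinds(df):
    # Boolean indexing builds a NEW DataFrame; the local name `df` now
    # points to it, so the caller's DataFrame is left untouched.
    df = df[df["n"] > 1]
    return df

def mutates(df):
    # inplace=True changes the very object the caller passed in.
    df.sort_values("n", inplace=True)

original = pd.DataFrame({"n": [3, 1, 2]})
frames = [original]

for f in frames:
    f = rebinds(f)  # only the loop variable f is rebound

print(len(frames[0]))          # 3
print(frames[0] is original)   # True

filtered = rebinds(original)
print(filtered["n"].tolist())  # [3, 2]

mutates(original)
print(original["n"].tolist())  # [1, 2, 3]
```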

Upvotes: 1
