Arefe

Reputation: 12481

How to use the lambda properly with pandas df?

I have a few pandas DataFrames (dfs), and each needs to be processed by a few functions, one after another. That is, after a df passes through the first function, the resulting df must be passed through the remaining functions in sequence. The pseudocode might look like this:

df = func_1(df)
df = func_2(df)
df = func_3(df)

and later appended to a list, like lis.append(df). With a loop, the code might look like this:

storage = []
for df in dfs:
    for func in functions:
        df = func(df)
    storage.append(df)

df = pd.concat(storage, ignore_index=True)

Eventually, I may use concat to produce the final df. The code above is what it is supposed to be, and my code is below. While my code runs without errors, it does not behave the way I intend.

storage = [map(lambda df: f(df) , dfs) for f in functions][-1]
df = pd.concat(storage, ignore_index=True)

For storage, the right-hand side produces a list, which I believe has as many elements as there are dfs, and its last element ([-1]) appears to be a single element holding the dfs processed with all the functions and joined together. Here is the difference: the earlier storage list holds each df processed with all the functions, so its number of elements is the same as the number of dfs.

My question: how can I use a lambda function to get the same storage list as before? The list should contain as many elements as there are dfs, and each element should have been processed with all the functions.

I will try to explain it better if you have questions, but please ask before downvoting.

Upvotes: 1

Views: 107

Answers (1)

CaptainTrunky

Reputation: 1707

Ok, let's break your task into pieces.

First, you want to apply a list of functions to the same object. I'll stick with a simple example, but the idea is the same. Let's define a helper function that accepts a list of functions and applies them one after another, feeding each result into the next function:

 apply_rec = lambda f, d: f[0](d) if len(f) == 1 else apply_rec (f[1:], f[0](d))

It accepts a list of functions and the initial data. If the list contains exactly one function, it applies that function to the data and returns the result; otherwise, it 'cuts' the first function off the list, applies it to the data, and calls itself with the remaining functions and the new result.

Let's try it:

 >>> data = 10
 >>> fs = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]
 >>> apply_rec (fs, data)
 484
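
If you'd rather avoid recursion, the same left-to-right chaining can be sketched with functools.reduce from the standard library. The name apply_all below is my own, not something from your code, and this is only one possible way to write it:

```python
from functools import reduce

def apply_all(funcs, data):
    """Apply each function in funcs to data, left to right."""
    # reduce feeds the running result (acc) into the next function f.
    return reduce(lambda acc, f: f(acc), funcs, data)

fs = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]
result = apply_all(fs, 10)
print(result)  # 484, same as apply_rec: ((10 + 1) * 2) ** 2
```

One difference: with an empty list of functions, apply_all simply returns the data unchanged, whereas apply_rec as written above would raise an IndexError.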

Now you can do the following:

storage = [apply_rec (functions, df) for df in dfs]
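
To see the difference from the one-liner in the question, here is a small self-contained sketch. Plain lists stand in for your DataFrames, and the three functions are hypothetical placeholders for func_1, func_2 and func_3, so treat it as an illustration rather than your actual pipeline:

```python
# Same helper as above, repeated so this snippet runs on its own.
apply_rec = lambda f, d: f[0](d) if len(f) == 1 else apply_rec(f[1:], f[0](d))

# Stand-ins (assumptions): lists instead of DataFrames, toy functions.
dfs = [[1, 2], [3, 4], [5, 6]]
functions = [
    lambda rows: [r + 1 for r in rows],
    lambda rows: [r * 2 for r in rows],
    lambda rows: [r ** 2 for r in rows],
]

# One fully processed result per input, like the nested loop in the question.
storage = [apply_rec(functions, df) for df in dfs]

# The comprehension from the question builds one map per function, each over
# the ORIGINAL inputs, so [-1] only holds the output of the last function.
last_only = list([map(lambda df: f(df), dfs) for f in functions][-1])

print(len(storage))  # 3, one element per input
print(storage[0])    # [16, 36]: ((1 + 1) * 2) ** 2 and ((2 + 1) * 2) ** 2
print(last_only[0])  # [1, 4]: only the squaring step was applied
```

With real DataFrames, storage can then be passed to pd.concat(storage, ignore_index=True) exactly as in your loop version.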

I hope I understood your problem correctly.

Upvotes: 1
