the_RR

Reputation: 402

How to merge a pandas DataFrame passing a lambda as the first parameter?

Restricting myself to pandas method chaining, how can I call merge with a lambda that receives the DataFrame's current state, without using pipe?

The code below works, but it depends on the pipe method.

import pandas as pd

(pd.DataFrame(
    [{'YEAR':2013,'FK':1, 'v':1},
     {'YEAR':2013,'FK':2, 'v':2},
     {'YEAR':2014,'FK':1, 'v':3},
     {'YEAR':2014,'FK':2, 'v':4}
    ])
 .pipe(lambda w: w.merge(w.query('YEAR==2013')[['FK','v']],
                         on='FK',
                         how='left'
                        ))
)

The code below doesn't work.

(pd.DataFrame(
    [{'YEAR':2013,'FK':1, 'v':1},
     {'YEAR':2013,'FK':2, 'v':2},
     {'YEAR':2014,'FK':1, 'v':3},
     {'YEAR':2014,'FK':2, 'v':4}
    ])
 .merge(lambda w: w.query('YEAR==2013'),
        on='FK',
        how='left'
       )
)

It raises: TypeError: Can only merge Series or DataFrame objects, a <class 'function'> was passed

Upvotes: 1

Views: 210

Answers (1)

mozway

Reputation: 262224

You can't; this is precisely why the pipe method exists.
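As a minimal sketch of what pipe does here: df.pipe(f) simply calls f(df), so the function receives the DataFrame as it exists at that point in the chain. The helper name add_2013_values is hypothetical and only used for illustration; the data is the sample from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [{'YEAR': 2013, 'FK': 1, 'v': 1},
     {'YEAR': 2013, 'FK': 2, 'v': 2},
     {'YEAR': 2014, 'FK': 1, 'v': 3},
     {'YEAR': 2014, 'FK': 2, 'v': 4}]
)

def add_2013_values(w):
    # Hypothetical helper: left-merge the frame with its own 2013 rows on FK.
    return w.merge(w.query('YEAR == 2013')[['FK', 'v']], on='FK', how='left')

# pipe passes the current DataFrame as the first argument of the function...
piped = df.pipe(add_2013_values)

# ...so it is equivalent to calling the function directly.
direct = add_2013_values(df)
```

Both piped and direct should hold the same merged frame; pipe only keeps the call inside the method chain.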

For completeness, DataFrame methods/accessors that accept a callable (as primary parameter and as of pandas 2.0.3) include, among others, assign, loc, iloc, where and mask.

For other cases, you need to use pipe.
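To illustrate the distinction, here is a hedged sketch of one chain that mixes both styles. It assumes that assign and loc accept a callable evaluated on the current DataFrame (they do in the pandas versions I know of), while merge is wrapped in pipe. The column names double_v and v_2013 are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'YEAR': [2013, 2013, 2014, 2014],
    'FK': [1, 2, 1, 2],
    'v': [1, 2, 3, 4],
})

result = (
    df
    # assign evaluates the callable on the current frame
    .assign(double_v=lambda w: w['v'] * 2)
    # merge does not accept a callable, so it is wrapped in pipe
    .pipe(lambda w: w.merge(
        w.query('YEAR == 2013')[['FK', 'v']],
        on='FK',
        how='left',
        suffixes=('', '_2013'),
    ))
    # loc accepts a callable that returns a boolean mask
    .loc[lambda w: w['YEAR'] == 2014]
)
```

Each lambda sees the frame produced by the previous step, which is the behavior the question was trying to get out of merge directly.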

Upvotes: 3
