connor449

Reputation: 1679

Inputting arguments into df.apply function

I know this is a commonly asked question, but I'm still confused despite many SO posts. This is my problem:

I have this function:

def query_text_by_keyword(df, word_list):
    for word in word_list:
        if word in df.words:
            match = True
        else:
            match = False
        return match

master_df['neg_query_match'] = master_df.apply(query_text_by_keyword, axis=1, args=(master_df, neg_words))

My function accepts 2 args, a df with the column 'words' (values are strings of text) and a word_list (a list of strings). I want to loop through each word in word_list and see whether this word is in each row from df.words. If it is, I want to create a column that labels this row as True. However, I keep getting this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-64-8ccf9cc7c0c6> in <module>
      7         return match
      8 
----> 9 master_df['neg_query_match'] = master_df.apply(query_text_by_keyword, axis=1, args=(master_df, neg_words))

C:\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6904             kwds=kwds,
   6905         )
-> 6906         return op.get_result()
   6907 
   6908     def applymap(self, func):

C:\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
    184             return self.apply_raw()
    185 
--> 186         return self.apply_standard()
    187 
    188     def apply_empty_result(self):

C:\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    290 
    291         # compute the result using the series generator
--> 292         self.apply_series_generator()
    293 
    294         # wrap results

C:\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
    319             try:
    320                 for i, v in enumerate(series_gen):
--> 321                     results[i] = self.f(v)
    322                     keys.append(v.name)
    323             except Exception as e:

C:\Anaconda3\lib\site-packages\pandas\core\apply.py in f(x)
    110 
    111             def f(x):
--> 112                 return func(x, *args, **kwds)
    113 
    114         else:

TypeError: ('query_text_by_keyword() takes 2 positional arguments but 3 were given', 'occurred at index 0')

What's going on here? The SO posts regarding supplying args to a df.apply function recommend this format.

Upvotes: 0

Views: 58

Answers (1)

Mike Müller

Reputation: 85522

The help says:

args : tuple
    Positional arguments to pass to func in addition to the array/series.

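So pandas supplies the first argument automatically. With axis=1, that argument is each row as a Series, not the whole dataframe, so only the extra arguments belong in args.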

So change your code to:

master_df['neg_query_match'] = master_df.apply(query_text_by_keyword, 
                                               axis=1, 
                                               args=(neg_words,))

Note: A tuple with one element needs a trailing comma.
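
For reference, here is a minimal runnable sketch of the corrected call. The sample data is hypothetical and only stands in for master_df and neg_words. Two assumptions: the first parameter is renamed to row, since with axis=1 it receives one row at a time, and the loop is replaced with any(), on the assumption that a match on any word should count. As posted, the return sits inside the loop, so only the first word in word_list appears to be checked.

```python
import pandas as pd


def query_text_by_keyword(row, word_list):
    # With axis=1, pandas passes each row as a Series in the first position.
    # True if any word occurs as a substring of the row's 'words' text.
    return any(word in row["words"] for word in word_list)


# Hypothetical sample data standing in for master_df and neg_words.
master_df = pd.DataFrame(
    {"words": ["this is bad", "all good here", "truly awful day"]}
)
neg_words = ["bad", "awful"]

# Only the extra argument goes in args; note the trailing comma.
master_df["neg_query_match"] = master_df.apply(
    query_text_by_keyword, axis=1, args=(neg_words,)
)

print(master_df)
```

Without the trailing comma, (neg_words) is just the list itself in parentheses, not a one-element tuple.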

Upvotes: 1
