Narendran S Nair

Reputation: 13

Type Error: Pandas Dataframe apply function, argument passing

By default, all the columns are set to zero. I want to set the entry at (row, column) to 1 wherever the column name appears as a substring of that row's URL column.

L  # list of column names to look for in the URL column

[Image: the dataframe]

def generate(statement,col):
    if statement.find(col) == -1:
      return 0
    else:
      return 1

for col in L:
  df3[col].apply(generate, args=(col))

I am a beginner, and this throws an error:

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in f(x)
   4195
   4196             def f(x):
-> 4197                 return func(x, *args, **kwds)
   4198
   4199         else:

TypeError: generate() takes 2 positional arguments but 9 were given

Any suggestions would be helpful

Edit 1:

After changing the call to

df3[col].apply(generate, args=(col,))

I got a different error:

> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
> <ipython-input-162-508036a6e51f> in <module>()
>       1 for col in L:
> ----> 2   df3[col].apply(generate, args=(col,))
> 
> 2 frames
> pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
> 
> <ipython-input-159-9380ffd36403> in generate(statement, col)
>       1 def generate(statement,col):
> ----> 2     if statement.find(col) == -1:
>       3         return 0
>       4     else:
>       5         return 1
> 
> AttributeError: 'int' object has no attribute 'find'

Edit 2: I forgot to apply the function to the URL column in the for loop. I will correct that.

Edit 3: Updated and fixed. The working code is:

def generate(statement,col):
    if col in str(statement):
        return 1
    else:
        return 0

for col in L:
  df3[col] = df3['url'].apply(generate, col=col)

Thanks for all the support!

Upvotes: 1

Views: 2109

Answers (2)

Saravanakumar V

Reputation: 166

This is a problem with how the argument is passed through args. The args parameter of apply expects a tuple, and its elements are passed to the function as extra positional arguments after the value itself.

Let's look at an example.

import pandas as pd

df = pd.DataFrame([['xyz', 'US'], ['abc', 'MX'], ['xyz', 'CA']], columns=["Name", "Country"])

print(df)

  Name Country
0  xyz      US
1  abc      MX
2  xyz      CA

Define the function with the extra argument:

def generate(statement,col):
    if statement.find(col) == -1:
        return 0
    else:
        return 1

Let L be the list ['Name', 'Country'].

Now apply generate with the extra argument in a loop:

for col in L:
    print(df[col].apply(generate, args=(col)))


TypeError: generate() takes 2 positional arguments but 5 were given

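The error occurs because (col) is not a tuple. Parentheses without a trailing comma are only grouping, so args receives the plain string 'Name'. apply then unpacks that string character by character, as if args=('N', 'a', 'm', 'e') had been written. Together with statement, generate receives 5 arguments instead of 2.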
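The unpacking can be reproduced without pandas. This is only a minimal sketch: call_with_args is a hypothetical stand-in for the forwarding step, written to mirror the `return func(x, *args, **kwds)` line shown in the question's traceback, and the sample string is made up.

```python
def generate(statement, col):
    # Return 1 if col occurs in statement, else 0
    return 0 if statement.find(col) == -1 else 1


def call_with_args(func, value, args=()):
    # Hypothetical stand-in for how apply forwards args to func
    return func(value, *args)


col = "Name"

# Parentheses alone do not make a tuple; the trailing comma does
print(type((col)).__name__)
print(type((col,)).__name__)

# Correct: exactly one extra positional argument is forwarded
ok = call_with_args(generate, "Name of user", args=(col,))

# Incorrect: the string is unpacked into 'N', 'a', 'm', 'e'
try:
    call_with_args(generate, "Name of user", args=(col))
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

The count in the error message is always one (for statement) plus the number of characters in col, which is why the question's traceback reports 9 arguments for a longer column name.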

To avoid this, use either of the options below:

  1. Pass col by keyword:
df[col].apply(generate, col=col)
  2. Pass the arguments as a tuple. Note that a one-element tuple needs a trailing comma:
df[col].apply(generate, args=(col,))
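Both options can be checked end to end on a small frame. This is a sketch under assumptions: the url column, the sample URLs, and the keyword list L are made up for illustration, loosely following the question's setup, and generate uses the `in` test on `str(statement)` from the question's final edit so that non-string values do not raise.

```python
import pandas as pd


def generate(statement, col):
    # 1 if the column name occurs in the text, else 0
    return 1 if col in str(statement) else 0


# Hypothetical sample data
df = pd.DataFrame(
    {"url": ["example.com/login", "example.com/cart", "example.com/login/cart"]}
)
L = ["login", "cart"]

for col in L:
    # Option 1: pass the extra argument by keyword
    by_keyword = df["url"].apply(generate, col=col)
    # Option 2: pass it in a one-element tuple (note the trailing comma)
    by_tuple = df["url"].apply(generate, args=(col,))
    # Both calls forward the same single extra argument
    assert by_keyword.equals(by_tuple)
    df[col] = by_keyword

print(df)
```

Note that the result of apply has to be assigned back to df[col]; apply returns a new Series and does not modify the frame in place.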

Upvotes: 0

Ivan Gorin

Reputation: 391

When creating a one-element tuple, you need a comma after the element: args=(col,). Without the comma, the parentheses are treated as plain grouping and args receives the bare string.

Upvotes: 2
