Reputation: 13
By default, all the columns are set to zero. I want to set the entry at (row, column) to 1 wherever the column name appears as a substring of that row's URL column.
# L: list that contains the column names to look for in the URL
def generate(statement, col):
    if statement.find(col) == -1:
        return 0
    else:
        return 1

for col in L:
    df3[col].apply(generate, args=(col))
I am a beginner. It throws an error:
/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in f(x)
   4195
   4196         def f(x):
-> 4197             return func(x, *args, **kwds)
   4198
   4199     else:

TypeError: generate() takes 2 positional arguments but 9 were given
Any suggestions would be helpful
Edit 1:
After changing the call to
df3[col].apply(generate, args=(col,))
I got this error:
> --------------------------------------------------------------------------- AttributeError Traceback (most recent call
> last) <ipython-input-162-508036a6e51f> in <module>()
> 1 for col in L:
> ----> 2 df3[col].apply(generate, args=(col,))
>
> 2 frames pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
>
> <ipython-input-159-9380ffd36403> in generate(statement, col)
> 1 def generate(statement,col):
> ----> 2 if statement.find(col) == -1:
> 3 return 0
> 4 else:
> 5 return 1
>
> AttributeError: 'int' object has no attribute 'find'
Edit 2: I forgot to apply the function to the URL column in the for loop; I will fix that.
Edit 3: Updated and fixed as follows:
def generate(statement, col):
    # str() guards against non-string values in the column
    if col in str(statement):
        return 1
    else:
        return 0

for col in L:
    df3[col] = df3['url'].apply(generate, col=col)
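For anyone who wants to try it, here is a small self-contained run of the fix above. It is only a sketch: the URLs and the names in L are made up for illustration, and df3 is rebuilt from scratch because the real data is not shown here.

```python
import pandas as pd

# Hypothetical data: the real df3 and L are not part of this post.
L = ["news", "sports"]
df3 = pd.DataFrame({"url": ["http://example.com/news/1",
                            "http://example.com/sports/2",
                            "http://example.com/about"]})

def generate(statement, col):
    # 1 if the column name occurs anywhere in the URL text, else 0
    return 1 if col in str(statement) else 0

for col in L:
    # col is passed by keyword, so no tuple is involved
    df3[col] = df3["url"].apply(generate, col=col)

print(df3[L])
```

Each new column holds 1 only on the rows whose URL contains that column's name.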
Thanks for all the support!
Upvotes: 1
Views: 2109
Reputation: 166
This seems to be a problem with how the parameter is passed in args. The args parameter of the apply function takes a tuple, and its elements are passed on to the function as extra positional arguments.
Let's look at an example to illustrate it:
import pandas as pd
df = pd.DataFrame([['xyz', 'US'], ['abc', 'MX'], ['xyz', 'CA']], columns=["Name", "Country"])
print(df)
  Name Country
0  xyz      US
1  abc      MX
2  xyz      CA
Create a function as required, with an extra argument:
def generate(statement, col):
    if statement.find(col) == -1:
        return 0
    else:
        return 1
Consider L to be the list ['Name', 'Country'].
Now let's apply the function generate with the extra argument in a loop:
for col in L:
    print(df[col].apply(generate, args=(col)))

TypeError: generate() takes 2 positional arguments but 5 were given
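The error occurs because (col) is not a one-element tuple: the parentheses only group the expression, so args receives the string itself. For col = 'Name' the string is unpacked character by character, as if args=('N', 'a', 'm', 'e') had been given. Along with statement, the function now receives 4 extra arguments instead of just 1.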
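The unpacking can be reproduced without pandas. The sketch below uses a hypothetical helper, call_like_apply, which forwards args the same way the f(x) wrapper in the question's traceback does (func(x, *args)). It is an illustration of the mechanism, not pandas' actual code.

```python
def generate(statement, col):
    return 0 if statement.find(col) == -1 else 1

def call_like_apply(func, x, args=()):
    # Hypothetical stand-in for the wrapper shown in the traceback:
    # the extra arguments are star-unpacked into the call.
    return func(x, *args)

col = "Name"

# (col,) is a tuple with one element, so generate gets exactly one extra argument.
ok = call_like_apply(generate, "xyz", args=(col,))

# (col) is just the string "Name"; star-unpacking it yields 4 characters.
try:
    call_like_apply(generate, "xyz", args=(col))
    message = ""
except TypeError as exc:
    message = str(exc)

print(ok)
print(message)
```

The second call fails with the same "takes 2 positional arguments but 5 were given" message as above: one argument for statement plus one per character of 'Name'.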
To avoid this situation, you can use either of the options below:
- Pass the col value to the parameter itself directly, by keyword: df[col].apply(generate, col=col)
- Pass args as a one-element tuple, with a trailing comma: df[col].apply(generate, args=(col,))
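Both options can be checked on the sample frame from above. This is a minimal sketch; it searches for the value 'xyz' instead of the column name (an assumption made here purely so that the output is not all zeros).

```python
import pandas as pd

df = pd.DataFrame([["xyz", "US"], ["abc", "MX"], ["xyz", "CA"]],
                  columns=["Name", "Country"])

def generate(statement, col):
    return 0 if statement.find(col) == -1 else 1

# Option 1: keyword argument, forwarded to generate as col=...
by_keyword = df["Name"].apply(generate, col="xyz")

# Option 2: one-element tuple (note the trailing comma)
by_tuple = df["Name"].apply(generate, args=("xyz",))

print(by_keyword.tolist())
print(by_tuple.tolist())
```

Both calls give the same result, so the choice between them is mostly a matter of readability.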
Upvotes: 0
Reputation: 391
When creating a one-element tuple, you need a comma after the element: args=(col,). Otherwise the parentheses only group the expression, and (col) is the same as col.
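A quick way to see the difference in plain Python (the variable names here are only for illustration):

```python
col = "Country"

grouped = (col)     # parentheses only group: still a str
one_tuple = (col,)  # the trailing comma makes a 1-element tuple

print(type(grouped).__name__, len(grouped))
print(type(one_tuple).__name__, len(one_tuple))
```

The first line reports a str of length 7, the second a tuple of length 1.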
Upvotes: 2