Reputation: 21355
In a dataframe with two columns I can easily create a third without a function if it is a numerical operation such as multiplication: df["new"] = df["one"] * df["two"].
However, what if I need to pass more than two parameters to a function, and those parameters are columns from a dataframe?
Passing one column at a time is simple using df.apply(my_func), but what if the function's definition requires three columns, like this:
def WordLength(col1,col2,col3):
    return max(len(col1),len(col2),len(col3))
For example, the function WordLength would return the length of the longest word among the three columns passed into it.
I know this doesn't work, but I imagine something like the following to store the result of a function requiring three parameters in a dataframe column:
df["word_length"]= df.apply(WordLength, [[param1,param2,param3]])
Update: Jon, when I try your method of passing in three parameters (values from three dataframe columns for a given row), I get an error. This is the code:
def get(name,start_date,end_date):
try:
df = ...
response = df.apply(get, axis=1, args=('name', 'date', 'today'))
The error relates to the arguments. I don't understand why it mentions 4 arguments when I have passed in three and the function only requires three.
Error:
TypeError: ('getprice() takes exactly 3 arguments (4 given)', u'occurred at index 0')
Upvotes: 1
Views: 12504
Reputation: 142206
Unless you really want a function to do this, you can use DataFrame operations, e.g.:
df[['col1', 'col2', 'col3']].applymap(len).max(axis=1)
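As a rough sketch of the same idea: newer pandas releases deprecate DataFrame.applymap in favour of DataFrame.map, so a version-neutral variant is to use the vectorized .str.len() on each column. The sample data below is borrowed from the output shown in another answer on this page, and the column name word_length is only illustrative.

```python
import pandas as pd

# Sample data (taken from the output shown elsewhere in this thread)
df = pd.DataFrame({
    "col1": ["word1", "anotherword", "yetanotherword"],
    "col2": ["word10", "wooooord", "letter"],
    "col3": ["wordover9000", "test", "Ihavenootheridea"],
})

cols = ["col1", "col2", "col3"]

# .str.len() computes string lengths for a whole column at once;
# max(axis=1) then picks the largest length in each row.
df["word_length"] = df[cols].apply(lambda s: s.str.len()).max(axis=1)
```

This avoids calling a Python function once per row, which tends to matter on larger frames.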
You can use apply's args argument to pass in the columns to be processed and make the target function take a variable number of arguments for unpacking, e.g.:
def max_word_length(row, *cols):
    return row[list(cols)].map(len).max()
# Make sure `axis=1` so rows are passed in and we can access columns
df.apply(max_word_length, axis=1, args=('col1', 'col2', 'col3'))
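Regarding the "takes exactly 3 arguments (4 given)" error in the update: with axis=1, apply calls the function as func(row, *args), so the row itself is always the first argument and the three strings in args come after it. A function that declares only three parameters therefore receives four. The sketch below reproduces this with a stand-in function (word_length and the sample data are assumptions for illustration, not the asker's real code) and shows one way to keep a three-parameter signature by wrapping it in a lambda.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["word1", "anotherword", "yetanotherword"],
    "col2": ["word10", "wooooord", "letter"],
    "col3": ["wordover9000", "test", "Ihavenootheridea"],
})

def word_length(col1, col2, col3):
    # Expects three plain values, not a row
    return max(len(col1), len(col2), len(col3))

# apply(axis=1) calls word_length(row, 'col1', 'col2', 'col3'):
# four arguments for a three-parameter function, hence the TypeError.
error_message = None
try:
    df.apply(word_length, axis=1, args=("col1", "col2", "col3"))
except TypeError as exc:
    error_message = str(exc)

# Pull the values out of the row explicitly and pass those instead.
result = df.apply(
    lambda row: word_length(row["col1"], row["col2"], row["col3"]),
    axis=1,
)
```

Note also that args here carries column names (strings), not column values; the function has to look the values up on the row, as max_word_length does above.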
Upvotes: 1
Reputation: 3855
I think you need a lambda function in your apply:
def WordLength(words):
    return max(len(words[0]),len(words[1]),len(words[2]))
df['wordlength'] = df[['col1','col2','col3']].apply(lambda x: WordLength(x),axis=1)
Output:
             col1      col2              col3  wordlength
0           word1    word10      wordover9000          12
1     anotherword  wooooord              test          11
2  yetanotherword    letter  Ihavenootheridea          16
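A possible variation, offered as a sketch rather than a correction: integer keys such as words[0] on a row indexed by column names rely on positional fallback, which recent pandas versions deprecate. Iterating over the row's values sidesteps that, and the function can then be passed to apply directly without a lambda. The names word_length and wordlength_zip are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["word1", "anotherword", "yetanotherword"],
    "col2": ["word10", "wooooord", "letter"],
    "col3": ["wordover9000", "test", "Ihavenootheridea"],
})

def word_length(words):
    # Iterating a row (a Series) yields its values, so no 0/1/2 indexing is needed
    return max(len(word) for word in words)

# The function takes the row as its only argument, so no lambda is required
df["wordlength"] = df[["col1", "col2", "col3"]].apply(word_length, axis=1)

# Alternative that keeps a three-value call shape: zip the columns together
df["wordlength_zip"] = [
    max(len(a), len(b), len(c))
    for a, b, c in zip(df["col1"], df["col2"], df["col3"])
]
```

Both columns should hold the same values as the output shown above.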
Upvotes: 3