Reputation: 774
I'm new to Python and pandas, and I have a problem I cannot wrap my head around.
I'm trying to add a new column to my DataFrame. To achieve that I use the assign() function.
Most of the examples on the internet are painfully trivial and I cannot find a solution for my problem.
What works:
def my_custom_long_function(input):
    return input * 2

my_dataset.assign(new_col=lambda x: my_custom_long_function(x['long_column']))
What doesn't work:
def my_custom_string_function(input):
    return input.upper()

# Raises AttributeError: 'Series' object has no attribute 'upper'
my_dataset.assign(new_col=lambda x: my_custom_string_function(x['string_column']))
What confuses me is that in the debugger I can see that even for my_custom_long_function the parameter is a Series, not a long.
I just want to use the lambda function and pass the column's value to my already written, complicated functions. How do I do this?
Edit: The example here is just for demonstration purposes. The real code is an existing complex function that does not care about pandas types and needs a str as a parameter.
Upvotes: 5
Views: 16043
Reputation: 71580
The column is a Series, and a Series has no upper method. To uppercase its values, go through the str accessor and call str.upper:
def my_custom_string_function(input):
    return input.str.upper()

my_dataset.assign(new_col=lambda x: my_custom_string_function(x['string_column']))
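This also explains why the numeric function in the question worked even though it received a Series: a Series supports arithmetic such as `* 2` element-wise, but it does not forward string methods like `upper`. A minimal sketch of the difference, assuming two small made-up Series (`numbers` and `words` are illustrative names, not from the question):

```python
import pandas as pd

# Illustrative data; the names and values are assumptions.
numbers = pd.Series([1, 2, 3])
words = pd.Series(["foo", "bar"])

# Arithmetic is applied to every element, so a function doing `input * 2` works.
doubled = numbers * 2

# A Series does not expose str methods directly...
has_upper = hasattr(words, "upper")

# ...they are reached through the .str accessor instead.
uppercased = words.str.upper()

print(doubled.tolist(), has_upper, uppercased.tolist())
```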
That said, for efficiency I would use:
my_dataset['new column'] = my_dataset['string_column'].str.upper()
Edit: if your function needs a plain str, apply it to each element of the column instead:
def my_custom_string_function(input):
    return input.upper()

my_dataset['new column'] = my_dataset['string_column'].apply(my_custom_string_function)
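If you would rather keep the assign() style from the question, the same element-wise call can go inside the lambda. This is only a sketch under assumptions: the sample DataFrame is made up, and `my_custom_string_function` stands in for the real function that expects a plain str.

```python
import pandas as pd

def my_custom_string_function(value: str) -> str:
    # Stand-in for the existing function that expects a plain str.
    return value.upper()

# Illustrative data; the column name follows the question.
my_dataset = pd.DataFrame({"string_column": ["foo", "bar"]})

# The lambda receives the whole DataFrame; apply() then calls the
# function once per element, passing each value as a str.
result = my_dataset.assign(
    new_col=lambda df: df["string_column"].apply(my_custom_string_function)
)

print(result)
```

Note that assign() returns a new DataFrame and leaves `my_dataset` unchanged, so the result has to be bound to a name.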
Upvotes: 6