Reputation: 2830
I'm new to Pandas and NumPy. I have a DataFrame in which I would like to create a new column by applying a function to each row of an existing column. Let's take a simplified example:
import pandas as pd
import numpy as np

df = pd.DataFrame(columns=["names"], data=["Brussels", 2, "New York"])

def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return None

def to_string(value):
    return str(value)

df['lower_names'] = np.vectorize(to_lower)(df['names'])
This operation works very well. Now I would like to apply to_string() and then to_lower() only to the rows of "lower_names" where the result is None (I do not know if this is very clear).
This seems very basic, and yet I am having trouble. I could detail my attempts, but I am afraid of looking like a moron... Maybe I should spend a week or two learning these two modules before playing around with them, but in the meantime, any suggestion would be welcome.
Edit: @jezrael's solution is correct... for my simplified example. Now let's imagine that I want to apply the np.vectorize(to_string) function and then np.vectorize(to_lower) only to the rows of the column "names" where the first result is None. What would be the best way to do it?
Upvotes: 1
Views: 54
Reputation: 863156
I think you need to change return None to return to_string(value):
def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return to_string(value)

def to_string(value):
    return str(value)

df['lower_names'] = np.vectorize(to_lower)(df['names'])
print(df['lower_names'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: lower_names, dtype: object
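A side note on np.vectorize, offered as a hedged sketch rather than a definitive statement: when otypes is not given, NumPy infers the output dtype from the first result, so a None returned for a later row may not come back as a real None. Passing otypes=[object] asks for an array of Python objects instead. The name to_lower_or_none below is hypothetical; it mirrors the to_lower function from the question, which returns None on failure.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"names": ["Brussels", 2, "New York"]})

def to_lower_or_none(value):
    # Same logic as the question's original to_lower (hypothetical name)
    try:
        return value.lower()
    except AttributeError:
        return None

# otypes=[object] requests an object array, so None is kept as None
lowered = np.vectorize(to_lower_or_none, otypes=[object])(df["names"])
df["lower_names"] = lowered
```

With the None values preserved, df["lower_names"].isna() can then be used to find the rows that still need handling.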
It is also possible to use astype to convert all values to str and then apply str.lower:
df['lower_names'] = df['names'].astype(str).str.lower()
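For the follow-up in the question's edit (running to_string and then to_lower only on the rows where the first pass gave None), one possible approach is a boolean mask with .loc. This is a minimal sketch, assuming the question's original to_lower (the one returning None) and using Series.map in place of np.vectorize; it is not necessarily the best way for large data.

```python
import pandas as pd

df = pd.DataFrame({"names": ["Brussels", 2, "New York"]})

def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return None

def to_string(value):
    return str(value)

# First pass: values without .lower() come back as None (missing for pandas)
df["lower_names"] = df["names"].map(to_lower)

# Second pass: only on the missing rows, apply to_string and then to_lower
mask = df["lower_names"].isna()
df.loc[mask, "lower_names"] = df.loc[mask, "names"].map(to_string).map(to_lower)
```

The mask keeps the second pass limited to the rows that failed, which matters when to_string is more expensive than a plain str() call.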
Upvotes: 2