Ettore Rizza

Reputation: 2830

Apply a second function if the first one fails

I'm new to pandas and NumPy. I have a dataframe in which I would like to create a new column by applying a function to each value of an existing column. Here is a simplified example:

import pandas as pd
import numpy as np

df = pd.DataFrame(columns=["names"], data=["Brussels", 2, "New York"])

def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return None

def to_string(value):
    return str(value)

df['lower_names'] = np.vectorize(to_lower)(df['names'])

This operation works very well. Now I would like to apply to_string() and then to_lower(), but only to the rows of "lower_names" where the result is None (I am not sure this is very clear).

This seems very basic, and yet I am having trouble. I could detail my attempts, but I am afraid of looking like a moron... Maybe I should spend a week or two learning these two modules before playing around with them, but in the meantime, any suggestion would be welcome.

Edit: the @jezrael solution is correct... for my simplified example. Now suppose I want to apply np.vectorize(to_string) and then np.vectorize(to_lower) only to the rows of the "names" column where the first result is None. What would be the best way to do it?

Upvotes: 1

Views: 54

Answers (1)

jezrael

Reputation: 863156

I think you need to change return None to return to_string(value):

def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return to_string(value)

def to_string(value):
    return str(value)

df['lower_names'] = np.vectorize(to_lower)(df['names'])


print(df['lower_names'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: lower_names, dtype: object

It is also possible to use astype to convert all values to str and then apply str.lower:

df['lower_names'] = df['names'].astype(str).str.lower()
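
For the follow-up in the question's edit (run the second function only on the rows where the first one returned None), one possible approach is a boolean mask. This is only a minimal sketch, not necessarily the best way: the name failed is a hypothetical helper variable, and Series.map is used instead of np.vectorize on the assumption that np.vectorize, which infers its output type from the first result unless otypes is given, may turn None into the string 'None'.

```python
import pandas as pd

df = pd.DataFrame(columns=["names"], data=["Brussels", 2, "New York"])

def to_lower(value):
    try:
        return value.lower()
    except AttributeError:
        return None

def to_string(value):
    return str(value)

# First pass: Series.map leaves a missing value where to_lower returned None.
df["lower_names"] = df["names"].map(to_lower)

# Boolean mask of the rows where the first pass failed ("failed" is a hypothetical name).
failed = df["lower_names"].isna()

# Second pass, restricted to the failed rows: to_string first, then to_lower.
df.loc[failed, "lower_names"] = df.loc[failed, "names"].map(to_string).map(to_lower)

print(df["lower_names"].tolist())
```

With the sample frame above this should leave 'brussels', '2' and 'new york' in lower_names, and the two functions stay separate, so either one can be swapped for something more expensive without touching the rows that already succeeded.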

Upvotes: 2
