AlliDeacon

Reputation: 1495

Apply on Dataframe returning all None

I am trying to multiply the values of a column by 12 if that row/column isn't None.

I have tried:

def length_inches(x):
    if x is not None:
        int(x)*12

df['LENGTH'] = df['LENGTH'].notnull().apply(length_inches)

And I have tried:

def length_inches(x):
    int(x)*12

df['LENGTH'] = df['LENGTH'].notnull().apply(length_inches)

But it's returning all None in the Length column.

Here is my dataframe:

                                          DESCRIPTION  LENGTH  WIDTH   GAUGE  \
0   STRETCH FILM BENCHMARK GREEN   28.5" X 10000' ...   10000  28.5      51    
1   STRETCH FILM TORQUE            16X1500 4RL/CS ...    1500    16    31.5    
2   STRETCH FILM TORQUE            16X1500 4RL/CS ...    1500    16    31.5    
3   STRETCH FILM TORQUE            16X1500 4RL/CS ...    1500    16    31.5    
4   STRETCH FILM BENCHMARK OPTIMUM 30 X 7500'  20R...    7500    30      61    
5   STRETCH FILM TORQUE            16X1500 4RL/CS ...    1500    16    31.5    
6   STRETCH FILM TORQUE            16X1500 4RL/CS ...    1500    16    31.5    
7   STRETCH FILM BENCHMARK OPTIMUM 20" X 7500'  40...    None   None    None   

How can I account for the None values in this dataframe and still run the calculation over df['LENGTH']?

The dtype of the LENGTH series is object.

If that row is None I would like to just pass.

Upvotes: 0

Views: 1111

Answers (4)

Pyd

Reputation: 6159

df['LENGTH'] = df['LENGTH'].replace('None', 0).astype(int) * 12

Upvotes: 1

jpp

Reputation: 164713

pd.Series.notnull returns a Boolean series indicating, element by element, whether each value is non-null. It doesn't filter a series for non-null values, so your function is being applied to True/False flags rather than to the lengths themselves. In fact, explicit filtering is not necessary. You should use vectorised calculations, as described below, when working with numeric data in Pandas.
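As a minimal sketch of that difference, assuming a small made-up series in place of the question's LENGTH column (the values and the names lengths, mask and non_null are illustrative only):

```python
import pandas as pd

# Hypothetical sample standing in for the question's object-dtype LENGTH column.
lengths = pd.Series([10000, 1500, None], dtype=object)

# notnull() produces one True/False flag per row, not the original values.
mask = lengths.notnull()
print(mask.tolist())  # [True, True, False]

# Filtering is a separate step: index the series with the Boolean mask.
non_null = lengths[mask]
print(non_null.tolist())  # [10000, 1500]
```

Calling apply on mask therefore passes True or False to the function, never a length.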

There are likely only a couple of scenarios you need to consider:

1. float series => no conversion

If your series is float, i.e. df['LENGTH'].dtype returns a float type, don't perform any conversion or checking. Just use:

df['LENGTH'] *= 12

2. object series => use pd.to_numeric

If your series is object type, convert it to float first:

df['LENGTH'] = pd.to_numeric(df['LENGTH'], errors='coerce')
df['LENGTH'] *= 12
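A short sketch of the object-series case, assuming a made-up three-row frame whose LENGTH column mixes integers, numeric strings and None (the question does not say which of these the real column holds):

```python
import pandas as pd

# Hypothetical data: an object column with an int, a numeric string and None.
df = pd.DataFrame({"LENGTH": pd.Series([10000, "1500", None], dtype=object)})

# errors='coerce' turns anything that cannot be parsed into NaN
# instead of raising, so the None row survives as a missing value.
df["LENGTH"] = pd.to_numeric(df["LENGTH"], errors="coerce")

# NaN propagates through arithmetic, so no explicit None check is needed.
df["LENGTH"] *= 12
print(df["LENGTH"].tolist())  # [120000.0, 18000.0, nan]
```

The missing row stays missing after the multiplication, which matches the "just pass" behaviour asked for.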

pd.Series.apply with a custom function, on the other hand, is not vectorised: internally, it's just a thinly veiled loop. Avoid it like the plague.

Upvotes: 2

zipa

Reputation: 27879

You didn't return anything from your functions (so they returned None):

def length_inches(x):
    if x is not None:
        return int(x)*12
    else:
        return None

df['LENGTH'] = df['LENGTH'].apply(length_inches)
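For a quick check of this version, here is a sketch on a made-up series (the sample values are assumptions, not the asker's data). Note that apply is called on the column itself, not on the result of notnull():

```python
import pandas as pd

def length_inches(x):
    # Return the converted value explicitly; a function without a
    # return statement hands back None for every row.
    if x is not None:
        return int(x) * 12
    return None

# Hypothetical sample standing in for the question's LENGTH column.
lengths = pd.Series([10000, "1500", None], dtype=object)

result = lengths.apply(length_inches)
print(result.tolist())
```

The first two rows come back multiplied by 12 and the last row remains a missing value.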

Upvotes: 3

JimmyA

Reputation: 686

You must return a value at the end of your function.

Try:

def length_inches(x):
    if x is not None:
        return int(x)*12

Upvotes: 1
