Reputation: 2373
I am trying to create a new column in a pandas DataFrame by calculating its value from existing columns.
I have three existing columns ("launched_date", "item_published_at", "item_created_at").
However, my "if row[column_name] is not None:" check lets NaN values through instead of falling through to the next branch.
In the code below, I would not expect "nan" to be printed after the first conditional; I would expect something like "2018-08-17".
def adjusted_launch(row):
    if row['launched_date'] is not None:
        # debugging: print the value that passed the check, then stop
        print(row['launched_date'])
        exit()
        adjusted_date = date_to_time_in_timezone(row['launched_date'])
    elif row['item_published_at'] is not None:
        adjusted_date = row['item_published_at']  # make datetime in PST
    else:
        adjusted_date = row['item_created_at']  # make datetime in PST
    return adjusted_date
df['adjusted_date'] = df.apply(lambda row: adjusted_launch(row), axis=1)
How can I structure this conditional statement correctly?
Upvotes: 1
Views: 27542
Reputation: 3749
First, fill the empty cells with the string "nan":
df.fillna("nan", inplace=True)
Then, in the function, compare against that string:
def adjusted_launch(row):
    if row['launched_date'] != 'nan':
        ......
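Alternatively, keep the missing values as NaN and test each value with pd.notnull:
import numpy as np
import pandas as pd
df.fillna(np.nan, inplace=True)
# suggested by @ShadowRanger
def funct(row):
    # row['col'] is a scalar here, so call pd.notnull() on it; a float has no .notnull() method
    if pd.notnull(row['col']):
        pass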
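To see why the original check fails, it may help to run a small example. The sketch below assumes a made-up three-row frame that reuses the column names from the question; it leaves out the asker's date_to_time_in_timezone helper, which is not shown in the thread, and simply returns the first non-missing value. The final two lines show a column-wise alternative with fillna, which is a swapped-in technique rather than something proposed above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "launched_date": ["2018-08-17", np.nan, np.nan],
    "item_published_at": ["2018-08-01", "2018-07-01", np.nan],
    "item_created_at": ["2018-06-01", "2018-06-02", "2018-06-03"],
})

# NaN is a float object, not None, so "is not None" is True for it.
print(np.nan is not None)  # True
print(pd.notna(np.nan))    # False

def adjusted_launch(row):
    # Return the first of the three columns that holds a real value.
    if pd.notna(row["launched_date"]):
        return row["launched_date"]
    elif pd.notna(row["item_published_at"]):
        return row["item_published_at"]
    return row["item_created_at"]

df["adjusted_date"] = df.apply(adjusted_launch, axis=1)
print(df["adjusted_date"].tolist())
# ['2018-08-17', '2018-07-01', '2018-06-03']

# Column-wise alternative: fill gaps from the next column in priority order.
vectorized = df["launched_date"].fillna(df["item_published_at"]).fillna(df["item_created_at"])
print(vectorized.tolist())
```

The row-wise version keeps the if/elif/else shape of the question, so a per-branch conversion can be added where needed; the fillna chain avoids apply entirely when no per-row logic is required.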
Upvotes: 8
Reputation: 631
df = df.where((pd.notnull(df)), None)
This replaces every NaN with None, so no other modifications are required.
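A small sketch of this approach follows, assuming a made-up two-row frame. One caveat, stated as my understanding rather than as something from this answer: in newer pandas releases a float column may keep NaN after where(..., None) unless it is cast to object first, so the example adds an astype(object) step that is not part of the line above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "launched_date": ["2018-08-17", np.nan],
    "item_created_at": ["2018-06-01", "2018-06-02"],
})

# Cast to object so every column is able to hold None, then swap NaN for None.
cleaned = df.astype(object).where(pd.notnull(df), None)

def adjusted_launch(row):
    # With None in place of NaN, the identity check from the question works.
    if row["launched_date"] is not None:
        return row["launched_date"]
    return row["item_created_at"]

result = cleaned.apply(adjusted_launch, axis=1)
print(result.tolist())
```

The trade-off is that the columns become object dtype, which gives up the faster numeric and datetime operations pandas offers on typed columns.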
Upvotes: 2