Rahul Kondhalkar

Reputation: 27

Apply a function to a DataFrame column

I have a DataFrame x.

Please view the x dataframe here

I want to create a new column, Finish, using the function below. It should add each value in the Complete column, as a number of business days, to a start date.

import datetime
def date_by_adding_business_days(from_date, add_days):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5:  # Saturday = 5, Sunday = 6: skip weekends
            continue
        business_days_to_add -= 1
    return current_date

I tried the following, but I get the error below. Please help.

x['Finish'] = x.apply(date_by_adding_business_days(datetime.date.today(), x['Complete']))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Upvotes: 2

Views: 93

Answers (3)

Anand

Reputation: 101

Both approaches in the other answers work, but I think the second option, where apply() is called on the single column (x['Complete'].apply(...)), is faster than calling apply() on every row of the DataFrame with axis=1. If you time the two solutions in a notebook against the sample DataFrame from the question, the difference is easy to see.

Attached is a screenshot: Timing apply() function
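For anyone who wants to reproduce the comparison without the screenshot, here is a rough timing sketch. The 1,000-row DataFrame, the fixed start date, and the helper names add_business_days and by_row are assumptions made for illustration and do not come from the question; actual timings will vary by machine and pandas version.

```python
import datetime
import timeit

import pandas as pd

# Fixed start date (an assumption) so both approaches give identical results.
START = datetime.date(2024, 1, 5)


def add_business_days(add_days, from_date=START):
    """Step forward one day at a time, counting only Monday to Friday."""
    current = from_date
    remaining = add_days
    while remaining > 0:
        current += datetime.timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are weekdays
            remaining -= 1
    return current


def by_row(row):
    """Row-wise wrapper: receives a whole row when used with axis=1."""
    return add_business_days(row["Complete"])


# Hypothetical sample data, larger than the question's frame to make timing visible.
x = pd.DataFrame({"Complete": [1, 3, 5, 10] * 250})

row_wise = x.apply(by_row, axis=1)                  # DataFrame.apply, one call per row
col_wise = x["Complete"].apply(add_business_days)   # Series.apply, one call per value

t_row = timeit.timeit(lambda: x.apply(by_row, axis=1), number=5)
t_col = timeit.timeit(lambda: x["Complete"].apply(add_business_days), number=5)
print(f"row-wise: {t_row:.4f}s  column-wise: {t_col:.4f}s")
```

The row-wise version has to build a Series object for every row before calling the function, which is the likely reason it tends to be slower.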

Upvotes: 0

Abhishek J

Reputation: 2584

Your approach is nearly correct. You just need to pass the function itself to apply(), not the result of calling it.

When you call apply() with axis=1, pandas passes each row of the DataFrame to the function and calls it for you.

You can compute values such as today's date inside the function itself:

import datetime
def date_by_adding_business_days(row):
    add_days = row['Complete']
    from_date = datetime.date.today()

    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5:  # Saturday = 5, Sunday = 6: skip weekends
            continue
        business_days_to_add -= 1
    return current_date

x['Finish'] = x.apply(date_by_adding_business_days, axis=1)
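If you would rather keep the original two-argument function from the question, one option is to wrap the call in a lambda so that apply() still receives something callable. This is only a sketch: the three-row DataFrame and the fixed start date are made up for illustration so that the result is reproducible.

```python
import datetime

import pandas as pd


def date_by_adding_business_days(from_date, add_days):
    """Original two-argument function from the question."""
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        if current_date.weekday() >= 5:  # Saturday = 5, Sunday = 6
            continue
        business_days_to_add -= 1
    return current_date


# Hypothetical sample data; the real x comes from the question's image.
x = pd.DataFrame({"Complete": [1, 3, 5]})
start = datetime.date(2024, 1, 5)  # a Friday, fixed instead of date.today()

# The lambda is the callable; pandas calls it once per row.
x["Finish"] = x.apply(
    lambda row: date_by_adding_business_days(start, row["Complete"]),
    axis=1,
)
print(x)
```

The original attempt failed because date_by_adding_business_days(...) was evaluated immediately with the whole x['Complete'] Series as add_days, so the comparison business_days_to_add > 0 produced a Series rather than a single True or False.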

Upvotes: 2

Midriaz

Reputation: 190

Try refactoring your code. If the function only needs one column, apply it to that column rather than to the whole DataFrame. Also, there is no need to pass the current date into the function when you can get it inside the function:

import datetime
def date_by_adding_business_days(add_days):
    business_days_to_add = add_days
    current_date = datetime.date.today()
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5:  # Saturday = 5, Sunday = 6: skip weekends
            continue
        business_days_to_add -= 1
    return current_date

x['Finish'] = x['Complete'].apply(date_by_adding_business_days)
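As a side note, a different technique that avoids the Python loop altogether is NumPy's busday_offset, which by default treats Monday to Friday as business days. The sketch below assumes a made-up DataFrame and a fixed start date. It is not a drop-in replacement in every case: with roll="backward", a weekend start date is first moved back to Friday, so an offset of 0 from a weekend gives a different result than the loop above.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data and a fixed start date for reproducibility.
x = pd.DataFrame({"Complete": [1, 3, 5, 10]})
from_date = datetime.date(2024, 1, 5)  # a Friday

# Convert the start date to a day-resolution datetime64.
start = np.datetime64(from_date.isoformat(), "D")

# Add every offset in one vectorized call; weekends are skipped.
finish = np.busday_offset(start, x["Complete"].to_numpy(), roll="backward")

x["Finish"] = pd.to_datetime(finish)
print(x)
```

Unlike the apply() versions, this stores the result as a pandas datetime column rather than as datetime.date objects.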

Upvotes: 2
