Mac
Mac

Reputation: 123

Appending to a DataFrame line by line, using a for loop and if statement

I'm struggling to figure out a way iterate through a dataframe, and use an if statement to determine whether a day is a weekday or weekend (and replicate, using the holidays package to find holidays)

Test Data

|    Date    |col1 |col2 |col3 |----| week | <- Add
|------------|-----|-----|-----|----|------|
| 2019-10-27 | 1.5 | 2.4 | 1.9 |----| wknd |
| 2019-10-28 | 2.7 | 2.1 | 1.9 |----| wkdy |
| 2019-10-29 | 1.4 | 2.7 | 1.8 |----| wkdy |

My example df has a date column, and 3 columns of data. The actual dataset has around 5 years of data. I want to add some conditional that checks to see if that date is a weekend or weekday, and adds that as a new column. I have no issues iterating and checking the dates, but I can't find anything that allows me to append or write to the dataframe on a line by line basis.

I had something like this so far:

for i in df.date:
    if i.weekday() < 5:
        df['period2'] = ('wkdy')
    else:
        df['period2'] = ('wknd')

Obviously this doesn't meet my needs, as it just sets all columns in the DF to the last iterated value.

What is the best way to iterate through and apply this logic on a line by line basis? Am I missing something simple?

Upvotes: 0

Views: 61

Answers (3)

libin
libin

Reputation: 457

you can use apply. it return a series or a DataFrame,

df['week'] = df['date'].apply(lambda x: "wkdy" if x.weekday() < 5 else "wked")

there are some other function can do it, like df.map, df.aggreagte, because aggreagte if basic of apply.

Upvotes: 0

fmarm
fmarm

Reputation: 4284

You can use np.where

import numpy as np
df['period2'] = np.where(df['date'].dt.weekday < 5,'wkdy','wknd')

If you really want to use a for loop you should loop over the rows by doing

for i in range(df.shape[0]):
    ....

Upvotes: 2

Wavy
Wavy

Reputation: 188

I'd try to use the loc function and apply the functions to entire columns at once so the operation is vectorized and much quicker to execute:

df['week'] = df['date'].weekday()
df['week'].loc[df['week'] < 5] = 'wkday'
df['week'].loc[df['week'] >= 5] = 'wknd'

Upvotes: 0

Related Questions