Reputation: 15
I am trying to convert a 'Date' column into a few columns such as 'day of week', 'day of year', etc. I am not sure why it always gets stuck after about 2000 steps. Because there is quite a lot of data, I would also love to know if there is a faster way of doing this. Thank you.
trainset.head()
   Zone_ID        Date  Hour_slot  Hire_count
0        1  2016-02-01          0           0
1        1  2016-02-01          1           0
2        1  2016-02-01          2           0
3        1  2016-02-01          3           0
4        1  2016-02-01          4           0
trainset.shape
(219600, 4)
This is what I have:
import datetime

TrainSet = trainset.copy()
TrainSet['w'] = 0
TrainSet['j'] = 0
TrainSet['U'] = 0
TrainSet['W'] = 0
# Parse each row's date string and store four strftime fields, one row at a time
for i in range(trainset.shape[0]):
    TrainSet.loc[i, 'w'] = datetime.datetime.strptime(trainset.loc[i,'Date'], "%Y-%m-%d").strftime('%w')
    TrainSet.loc[i, 'j'] = datetime.datetime.strptime(trainset.loc[i,'Date'], "%Y-%m-%d").strftime('%j')
    TrainSet.loc[i, 'U'] = datetime.datetime.strptime(trainset.loc[i,'Date'], "%Y-%m-%d").strftime('%U')
    TrainSet.loc[i, 'W'] = datetime.datetime.strptime(trainset.loc[i,'Date'], "%Y-%m-%d").strftime('%W')
    print(i)
Upvotes: 1
Views: 147
Reputation: 164623
You should use Pandas / NumPy methods with a datetime series rather than a manual loop. Here's a functional solution using operator.attrgetter:
import pandas as pd
from operator import attrgetter
# example dataframe
df = pd.DataFrame({'date': ['2017-05-01 15:00:20', '2018-11-30 10:01:11']})
df['date'] = pd.to_datetime(df['date'])
# list attributes
dt_attrs = ['year', 'hour', 'month', 'day', 'dayofweek']
# extract attributes
attributes = df['date'].apply(attrgetter(*dt_attrs))
# add attributes to dataframe
df[dt_attrs] = pd.DataFrame(attributes.values.tolist())
Result:
                 date  year  hour  month  day  dayofweek
0 2017-05-01 15:00:20  2017    15      5    1          0
1 2018-11-30 10:01:11  2018    10     11   30          4
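Applied to the columns asked for in the question, the same idea might look like the sketch below. It assumes the 'Date' column holds strings in "%Y-%m-%d" format, as in the question's head() output. The three-row trainset is a made-up stand-in for the real 219,600-row frame, and the dayofweek / dayofyear columns are an optional illustration, not something the question asked for. Series.dt.strftime accepts the same format codes as the loop, so each column is filled in one call instead of one row at a time.

```python
import pandas as pd

# Hypothetical stand-in for the question's `trainset` (the real one has 219,600 rows)
trainset = pd.DataFrame({
    "Zone_ID": [1, 1, 1],
    "Date": ["2016-02-01", "2016-02-07", "2016-12-31"],
    "Hour_slot": [0, 1, 2],
    "Hire_count": [0, 0, 0],
})

TrainSet = trainset.copy()

# Parse the whole column once instead of calling strptime on every row
dates = pd.to_datetime(TrainSet["Date"], format="%Y-%m-%d")

# Same strftime codes as the original loop; the results are strings
for code in ["w", "j", "U", "W"]:
    TrainSet[code] = dates.dt.strftime("%" + code)

# Integer alternatives from the .dt accessor
# (dayofweek counts Monday as 0, whereas %w counts Sunday as 0)
TrainSet["dayofweek"] = dates.dt.dayofweek
TrainSet["dayofyear"] = dates.dt.dayofyear

print(TrainSet)
```

Note that the strftime columns are strings (for example '032' for %j), so convert them with astype(int) if numeric features are needed.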
Upvotes: 2