Reputation: 75
The datetime is given in the format YYYY-MM-DD HH:MM:SS in a DataFrame. I want new Series for year, month, and hour, and I am trying the code below. The problem is that Month and Hour end up with the same values; Year is fine.
Can anyone help me with this? I am using IPython Notebook with pandas and NumPy.
Here is the code:
from datetime import datetime

def extract_hour(X):
    # Parse the string and return the hour (0-23)
    cnv = datetime.strptime(X, '%Y-%m-%d %H:%M:%S')
    return cnv.hour

def extract_month(X):
    # Parse the string and return the month (1-12)
    cnv = datetime.strptime(X, '%Y-%m-%d %H:%M:%S')
    return cnv.month

def extract_year(X):
    # Parse the string and return the four-digit year
    cnv = datetime.strptime(X, '%Y-%m-%d %H:%M:%S')
    return cnv.year

# Month column
train['Month'] = train['datetime'].apply(lambda x: extract_month(x))
test['Month'] = test['datetime'].apply(lambda x: extract_month(x))
# Year column
train['Year'] = train['datetime'].apply(lambda x: extract_year(x))
test['Year'] = test['datetime'].apply(lambda x: extract_year(x))
# Hour column
train['Hour'] = train['datetime'].apply(lambda x: extract_hour(x))
test['Hour'] = test['datetime'].apply(lambda x: extract_hour(x))
Upvotes: 1
Views: 236
Reputation: 210862
You can use the .dt accessors instead: train['datetime'].dt.month, train['datetime'].dt.year, train['datetime'].dt.hour (see the full list below).
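Note that .dt only works on a column with a datetime dtype. The question's code calls strptime on each value, which suggests the column holds strings, so it likely needs to be converted with pd.to_datetime first. A minimal sketch, assuming that is the case (the three sample rows are made up for illustration and stand in for the real train frame):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's `train` frame,
# with the datetime column stored as strings.
train = pd.DataFrame({
    "datetime": [
        "2016-01-01 00:00:00",
        "2016-03-24 07:00:00",
        "2017-05-14 18:00:00",
    ]
})

# Parse the strings once; the format matches the strptime pattern in the question.
train["datetime"] = pd.to_datetime(train["datetime"], format="%Y-%m-%d %H:%M:%S")

# Each .dt attribute returns a Series aligned with the original index.
train["Year"] = train["datetime"].dt.year
train["Month"] = train["datetime"].dt.month
train["Hour"] = train["datetime"].dt.hour

print(train)
```

The same three assignments can be repeated for test. This avoids the per-row apply calls entirely.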
Demo:
In [81]: train = pd.DataFrame(pd.date_range('2016-01-01', freq='1999H', periods=10), columns=['datetime'])
In [82]: train
Out[82]:
datetime
0 2016-01-01 00:00:00
1 2016-03-24 07:00:00
2 2016-06-15 14:00:00
3 2016-09-06 21:00:00
4 2016-11-29 04:00:00
5 2017-02-20 11:00:00
6 2017-05-14 18:00:00
7 2017-08-06 01:00:00
8 2017-10-28 08:00:00
9 2018-01-19 15:00:00
In [83]: train.datetime.dt.year
Out[83]:
0 2016
1 2016
2 2016
3 2016
4 2016
5 2017
6 2017
7 2017
8 2017
9 2018
Name: datetime, dtype: int64
In [84]: train.datetime.dt.month
Out[84]:
0 1
1 3
2 6
3 9
4 11
5 2
6 5
7 8
8 10
9 1
Name: datetime, dtype: int64
In [85]: train.datetime.dt.hour
Out[85]:
0 0
1 7
2 14
3 21
4 4
5 11
6 18
7 1
8 8
9 15
Name: datetime, dtype: int64
In [86]: train.datetime.dt.day
Out[86]:
0 1
1 24
2 15
3 6
4 29
5 20
6 14
7 6
8 28
9 19
Name: datetime, dtype: int64
List of all .dt accessors:
In [77]: train.datetime.dt.
train.datetime.dt.ceil train.datetime.dt.hour train.datetime.dt.month train.datetime.dt.to_pydatetime
train.datetime.dt.date train.datetime.dt.is_month_end train.datetime.dt.nanosecond train.datetime.dt.tz
train.datetime.dt.day train.datetime.dt.is_month_start train.datetime.dt.normalize train.datetime.dt.tz_convert
train.datetime.dt.dayofweek train.datetime.dt.is_quarter_end train.datetime.dt.quarter train.datetime.dt.tz_localize
train.datetime.dt.dayofyear train.datetime.dt.is_quarter_start train.datetime.dt.round train.datetime.dt.week
train.datetime.dt.days_in_month train.datetime.dt.is_year_end train.datetime.dt.second train.datetime.dt.weekday
train.datetime.dt.daysinmonth train.datetime.dt.is_year_start train.datetime.dt.strftime train.datetime.dt.weekday_name
train.datetime.dt.floor train.datetime.dt.microsecond train.datetime.dt.time train.datetime.dt.weekofyear
train.datetime.dt.freq train.datetime.dt.minute train.datetime.dt.to_period train.datetime.dt.year
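The list above comes from an older pandas version. In recent releases some of these names are gone: weekday_name was replaced by the day_name() method, and week / weekofyear were replaced by isocalendar().week. A short sketch of the replacements, assuming pandas 1.1 or later (the two sample timestamps are taken from the demo above):

```python
import pandas as pd

# Two timestamps from the demo frame above.
s = pd.Series(pd.to_datetime(["2016-01-01 00:00:00", "2016-03-24 07:00:00"]))

# Replacement for the removed .dt.weekday_name attribute.
day_names = s.dt.day_name()

# Replacement for the removed .dt.week / .dt.weekofyear attributes.
# isocalendar() returns a DataFrame with year, week and day columns.
iso_weeks = s.dt.isocalendar().week

print(day_names.tolist())
print(iso_weeks.tolist())
```

Note that ISO week numbering assigns 2016-01-01 to the last week of ISO year 2015, so the week number for that date is 53 rather than 1.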
Upvotes: 1