Reputation: 2127
I have a CSV file that has a column that has values like:
10/23/2018 11:00:00 PM
I want to convert these values strictly by time and create a new column which takes the time of the entry (11:00:00 etc) and changes it into an hour ending time.
Example looks like:
11:00:00 PM to 12:00:00 AM = 24, 12:00:00 AM to 1:00:00 AM = 1, 1:00:00 AM to 2:00:00 AM = 2 .....etc
Looking for a simple way to calculate these by indexing them based off this conversion.
My first pseudo code idea is to do something like grabbing the column df['Date'] and finding out what the time is:
file = pd.read_csv()
def conv(n):
date_time = n.iloc[1,1] #Position of the date-time column in file
for i in date_time:
time = date_time[11:] #Point of the line where time begins
Unsure how to proceed.
Upvotes: 1
Views: 1769
Reputation: 15578
You can also do this:
import pandas as pd
data ='''
10/23/2018 11:00:00 PM
10/23/2018 12:00:00 AM
'''.strip().split('\n')
df = pd.DataFrame(data, columns=['date'])
df['date'] = pd.to_datetime(df['date'])
#df['pad1hour'] = df['date'].dt.hour+1
#or
df['pad1hour'] = df['date'] + pd.Timedelta('1 hours')
# I prefer the second as you can add whatever interval e.g. '1 days 3 minutes'
print(df['pad1hour'].dt.time)
Upvotes: 1
Reputation: 53
This is the best way to do it:
from datetime import timedelta
import pandas as pd
file = pd.read_csv()
file['New datetime'] = file['Date_time'].apply(lambda x: pd.to_datetime(x) + timedelta(hours = 1))
file['New time'] = file['Date_time'].apply(lambda x: (pd.to_datetime(x) + timedelta(hours = 1)).time())
If you need the column's data type as string instead of Timestamp you can just do:
file['New time'] = file['New time'].astype(str)
To convert it to a readable string.
Hope it helps.
Upvotes: 0
Reputation: 59569
You should convert to a datetime
with pd.to_datetime(df.your_col)
(your format will be automatically parsed correctly, though you can specify it to improve the speed) and then you can use the .dt.hour
accessor.
import pandas as pd
# Sample Data
df = pd.DataFrame({'date': pd.date_range('2018-01-01', '2018-01-03', freq='30min')})
df['hour'] = df.date.dt.hour+1
print(df.sample(20))
date hour
95 2018-01-02 23:30:00 24
66 2018-01-02 09:00:00 10
82 2018-01-02 17:00:00 18
80 2018-01-02 16:00:00 17
75 2018-01-02 13:30:00 14
83 2018-01-02 17:30:00 18
49 2018-01-02 00:30:00 1
47 2018-01-01 23:30:00 24
30 2018-01-01 15:00:00 16
52 2018-01-02 02:00:00 3
29 2018-01-01 14:30:00 15
86 2018-01-02 19:00:00 20
59 2018-01-02 05:30:00 6
65 2018-01-02 08:30:00 9
92 2018-01-02 22:00:00 23
8 2018-01-01 04:00:00 5
91 2018-01-02 21:30:00 22
10 2018-01-01 05:00:00 6
89 2018-01-02 20:30:00 21
51 2018-01-02 01:30:00 2
Upvotes: 1