HelloToEarth
HelloToEarth

Reputation: 2127

Change date-time to a single digit hour ending for dataframe in Python

I have a CSV file that has a column that has values like:

10/23/2018 11:00:00 PM

I want to convert these values strictly by time and create a new column which takes the time of the entry (11:00:00 etc) and changes it into an hour ending time.

Example looks like:

11:00:00 PM to 12:00:00 AM = 24,  12:00:00 AM to 1:00:00 AM = 1, 1:00:00 AM to 2:00:00 AM = 2 .....etc

Looking for a simple way to calculate these by indexing them based off this conversion.

My first pseudo code idea is to do something like grabbing the column df['Date'] and finding out what the time is:

file = pd.read_csv()
def conv(n):
    date_time = n.iloc[1,1]  #Position of the date-time column in file
    for i in date_time:
        time = date_time[11:] #Point of the line where time begins

Unsure how to proceed.

Upvotes: 1

Views: 1769

Answers (3)

Prayson W. Daniel
Prayson W. Daniel

Reputation: 15578

You can also do this:

import pandas as pd

data ='''
10/23/2018 11:00:00 PM
10/23/2018 12:00:00 AM
'''.strip().split('\n')

df = pd.DataFrame(data, columns=['date'])

df['date'] = pd.to_datetime(df['date'])

#df['pad1hour'] = df['date'].dt.hour+1
#or
df['pad1hour'] = df['date'] + pd.Timedelta('1 hours')

# I prefer the second as you can add whatever interval e.g. '1 days 3 minutes'



print(df['pad1hour'].dt.time)

Upvotes: 1

Enrique GO
Enrique GO

Reputation: 53

This is the best way to do it:

from datetime import timedelta
import pandas as pd

file = pd.read_csv()

Case One: If you want to keep the date

file['New datetime'] = file['Date_time'].apply(lambda x: pd.to_datetime(x) + timedelta(hours = 1))

Case Two: If you just want the time

file['New time'] = file['Date_time'].apply(lambda x: (pd.to_datetime(x) + timedelta(hours = 1)).time())

If you need the column's data type as string instead of Timestamp you can just do:

file['New time'] = file['New time'].astype(str) 

To convert it to a readable string.

Hope it helps.

Upvotes: 0

ALollz
ALollz

Reputation: 59569

You should convert to a datetime with pd.to_datetime(df.your_col) (your format will be automatically parsed correctly, though you can specify it to improve the speed) and then you can use the .dt.hour accessor.

import pandas as pd
# Sample Data
df = pd.DataFrame({'date': pd.date_range('2018-01-01', '2018-01-03', freq='30min')})

df['hour'] = df.date.dt.hour+1
print(df.sample(20))

                  date  hour
95 2018-01-02 23:30:00    24
66 2018-01-02 09:00:00    10
82 2018-01-02 17:00:00    18
80 2018-01-02 16:00:00    17
75 2018-01-02 13:30:00    14
83 2018-01-02 17:30:00    18
49 2018-01-02 00:30:00     1
47 2018-01-01 23:30:00    24
30 2018-01-01 15:00:00    16
52 2018-01-02 02:00:00     3
29 2018-01-01 14:30:00    15
86 2018-01-02 19:00:00    20
59 2018-01-02 05:30:00     6
65 2018-01-02 08:30:00     9
92 2018-01-02 22:00:00    23
8  2018-01-01 04:00:00     5
91 2018-01-02 21:30:00    22
10 2018-01-01 05:00:00     6
89 2018-01-02 20:30:00    21
51 2018-01-02 01:30:00     2

Upvotes: 1

Related Questions