PyCoder
PyCoder

Reputation: 31

Reading time from file

I need help in grouping my data by date to find Sum of Days worked and Total Hours worked for each ID. My data is in a txt file. An example of my data is provided below :

ID,Date,Start Time,End Time
1,02/11/2019,08:25,16:24
1,03/11/2019,07:54,16:20
1,04/11/2019,07:46,16:08
1,05/11/2019,08:13,16:10
1,06/11/2019,07:48,15:24
1,08/11/2019,08:27,15:26
1,09/11/2019,08:25,16:02
1,10/11/2019,07:49,15:03
1,11/11/2019,08:18,15:29
1,12/11/2019,07:54,16:21
1,15/11/2019,07:50,16:24
1,16/11/2019,07:48,16:25
1,17/11/2019,08:25,15:28
1,19/11/2019,07:49,16:10
1,21/11/2019,08:11,15:06
1,22/11/2019,07:58,15:29
1,23/11/2019,07:49,15:20
1,24/11/2019,08:28,16:28
1,26/11/2019,08:29,16:14
1,27/11/2019,07:59,15:18
1,28/11/2019,07:53,16:24
1,29/11/2019,07:49,16:25
1,30/11/2019,08:25,15:08
1,31/11/2019,08:10,16:09
2,01/11/2019,07:58,15:22
2,02/11/2019,07:53,16:16
2,03/11/2019,08:13,15:05
2,04/11/2019,07:54,16:04

An example of the final output I require is provided below:

ID,Hours Worked,Days Worked
1,168,24 

I have been trying for a week using lists and dictionaries but have not been able to come up with a solution. Any help using datetime methods is much appreciated!

Upvotes: 0

Views: 2327

Answers (2)

KVEER
KVEER

Reputation: 123

If you can use the pandas library, here's something you can do

import pandas as pd
df=pd.read_csv('filename.csv')
df['Duration']=(pd.to_datetime(df['End Time'], format='%H:%M')-pd.to_datetime(df['Start Time'], format='%H:%M')).dt.components['hours']
df.groupby('ID').Duration.agg('sum')
result=df.groupby('ID').agg({'Duration': ['sum'], 'Date': ['count']})

Here, we read the file into a pandas dataframe, create a column with timedelta values for the start and end times, then group by the 'ID' and aggregate the sum over column 'Duration' and the count over column 'Date' as this gives the number of days worked.

The output should look something like this -

   Duration  Date
        sum count
ID               
1       174    24
2        29     4

Upvotes: 2

Yukun Li
Yukun Li

Reputation: 254

if you are sure that timestamp in your input for each id does not duplicate or overlap

this solution store id: hout works and should be easy to transfer hour -> day

from datetime import datetime

testlist = [
    [1,"02/11/2019","08:25","16:25"],
    [1,"03/11/2019","07:25","16:25"],
    [2,"02/13/2019","08:25","17:25"],
    [2,"03/13/2019","06:25","16:25"]
]

fmt = '%H:%M'
dictx = {}
for time in testlist:
    if time[0] not in dictx:
        dictx[time[0]] = 0
    tstamp1 = datetime.strptime(time[2], fmt)
    tstamp2 = datetime.strptime(time[3], fmt)
    dictx[time[0]] += int(round((tstamp2 - tstamp1).total_seconds()) / 3600)

print(dictx)

Upvotes: 0

Related Questions