Reputation: 490
I am looking for the simplest way to calculate time differences. I have the following sample list of lists, and for each day I need to calculate end time minus start time, e.g. for 2018-07-01: 17:00 - 08:00 = 09:00. I tried a lot of loops and also iterating with itertools.combinations, but it always fails.
[['2018-07-01', '8:00', 'IN'],
['2018-07-01', '12:00', 'OUT'],
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'],
['2018-07-02', '8:00', 'IN'],
['2018-07-02', '12:00', 'OUT'],
['2018-07-02', '12:30', 'IN'],
['2018-07-02', '17:00', 'OUT'],
['2018-07-03', '8:00', 'IN'],
['2018-07-03', '12:00', 'OUT'],
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'],
['2018-07-04', '17:00', 'OUT']]
My attempts:
for idx, elemenet in enumerate(test):
    try:
        if elemenet[0] == test[idx + 1][0]:
            print(elemenet)
    except:
        pass

index = 0
for a, b in itertools.combinations(test, 2):
    if a[0] and b[0] and a[2] == 'IN' and b[2] == 'OUT':
        print(a , b)
        index += 1
print(index)
Upvotes: 0
Views: 278
Reputation: 523
dates = [['2018-07-01', '8:00', 'IN'],
['2018-07-01', '12:00', 'OUT'],
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'],
['2018-07-02', '8:00', 'IN'],
['2018-07-02', '12:00', 'OUT'],
['2018-07-02', '12:30', 'IN'],
['2018-07-02', '17:00', 'OUT'],
['2018-07-03', '8:00', 'IN'],
['2018-07-03', '12:00', 'OUT'],
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'],
['2018-07-04', '17:00', 'OUT']]
totalTime = dict()
for item in dates:
    date = item[0]
    hr, minute = item[1].split(':')
    time = float(hr) * 60 + float(minute)  # minutes since midnight
    inout = item[2]
    if date not in totalTime:
        totalTime[date] = 0
    # subtract every IN and add every OUT, leaving the minutes worked that day
    if inout == 'IN':
        totalTime[date] -= time
    else:
        totalTime[date] += time
for date, time in totalTime.items():  # items() works on Python 2 and 3
    print(date, time/60)
Output (as printed by Python 2):
('2018-07-04', 9.0)
('2018-07-01', 8.5)
('2018-07-02', 8.5)
('2018-07-03', 8.5)
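For Python 3, the same sign trick (subtract every IN, add every OUT) can be sketched with timedelta instead of float minutes. The name worked_time_per_day is made up for this example, and the sketch assumes every IN has a matching OUT on the same date. Note that it gives the time worked excluding breaks (8:30 on a day with a lunch break), not last time minus first time.

```python
from datetime import timedelta

def worked_time_per_day(records):
    """Add OUT times and subtract IN times for each date (hypothetical helper)."""
    totals = {}
    for date, clock, direction in records:
        hours, minutes = clock.split(':')
        moment = timedelta(hours=int(hours), minutes=int(minutes))
        # IN counts negative, OUT counts positive
        sign = -1 if direction == 'IN' else 1
        totals[date] = totals.get(date, timedelta()) + sign * moment
    return totals

# Shortened sample taken from the question's data
sample = [
    ['2018-07-01', '8:00', 'IN'],
    ['2018-07-01', '12:00', 'OUT'],
    ['2018-07-01', '12:30', 'IN'],
    ['2018-07-01', '17:00', 'OUT'],
    ['2018-07-04', '8:00', 'IN'],
    ['2018-07-04', '17:00', 'OUT'],
]
for date, total in worked_time_per_day(sample).items():
    print(date, total)
```

Because the values are timedelta objects, they can be summed across days or formatted without dividing by 60.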
Upvotes: 1
Reputation: 729
In plain Python this can be done like this:
from datetime import datetime
l=[['2018-07-01', '8:00', 'IN'],
['2018-07-01', '12:00', 'OUT'],
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'],
['2018-07-02', '8:00', 'IN'],
['2018-07-02', '12:00', 'OUT'],
['2018-07-02', '12:30', 'IN'],
['2018-07-02', '17:00', 'OUT'],
['2018-07-03', '8:00', 'IN'],
['2018-07-03', '12:00', 'OUT'],
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'],
['2018-07-04', '17:00', 'OUT']]
def sortt(key1, key2):
    # build a datetime from the date and time strings so rows sort chronologically
    dt = key1.split('-')
    tt = key2.split(':')
    return datetime(int(dt[0]), int(dt[1]), int(dt[2]), int(tt[0]), int(tt[1]))

sortedlist = sorted(l, key=lambda x: sortt(x[0], x[1]))
currentDate = sortedlist[0][0]
currentTime = sortedlist[0][1]
for i in range(1, len(sortedlist)):
    if currentDate != sortedlist[i][0]:
        # the date changed, so the previous row holds the last time of currentDate
        print(currentDate + ' ' + currentTime + '-' + sortedlist[i-1][1])
        currentDate = sortedlist[i][0]
        currentTime = sortedlist[i][1]
# the last day ends with the last row
print(currentDate + ' ' + currentTime + '-' + sortedlist[-1][1])
OUTPUT :
2018-07-01 8:00-17:00
2018-07-02 8:00-17:00
2018-07-03 8:00-17:00
2018-07-04 8:00-17:00
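This answer prints a time range per day. To get the duration the question asks for, the first and last times can be subtracted instead. A minimal sketch, assuming times are in H:MM format and never cross midnight; day_spans is a made-up name, and the second date in the sample is invented so the result varies between days:

```python
from datetime import datetime

def day_spans(records):
    """Return {date: latest time - earliest time} (hypothetical helper)."""
    first_last = {}
    for date, clock, _ in records:
        moment = datetime.strptime(clock, '%H:%M')
        # keep the earliest and latest time seen so far for this date
        earliest, latest = first_last.get(date, (moment, moment))
        first_last[date] = (min(earliest, moment), max(latest, moment))
    return {date: latest - earliest
            for date, (earliest, latest) in first_last.items()}

sample = [
    ['2018-07-01', '8:00', 'IN'],
    ['2018-07-01', '12:00', 'OUT'],
    ['2018-07-01', '12:30', 'IN'],
    ['2018-07-01', '17:00', 'OUT'],
    ['2018-07-02', '9:15', 'IN'],    # invented row, not from the question
    ['2018-07-02', '16:45', 'OUT'],  # invented row, not from the question
]
for date, span in day_spans(sample).items():
    print(date, span)
```

Tracking the minimum and maximum per date means the rows do not have to be sorted first.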
Upvotes: 0
Reputation: 12015
Here is a solution using itertools.groupby for Python 3:
>>> lst = [['2018-07-01', '8:00', 'IN'], ['2018-07-01', '12:00', 'OUT'], ['2018-07-01', '12:30', 'IN'], ['2018-07-01', '17:00', 'OUT'], ['2018-07-02', '8:00', 'IN'], ['2018-07-02', '12:00', 'OUT'], ['2018-07-02', '12:30', 'IN'], ['2018-07-02', '17:00', 'OUT'], ['2018-07-03', '8:00', 'IN'], ['2018-07-03', '12:00', 'OUT'], ['2018-07-03', '12:30', 'IN'], ['2018-07-03', '17:00', 'OUT'], ['2018-07-04', '8:00', 'IN'], ['2018-07-04', '17:00', 'OUT']]
>>>
>>> from datetime import datetime
>>> from itertools import groupby
>>> from pprint import pprint
>>> to_time = lambda s: datetime.strptime(s, '%H:%M')
>>> diff_time = lambda s1, s2: str(to_time(s1)-to_time(s2))
>>>
>>> res = {date:diff_time(last[1], first[1]) for date,(first,*_,last) in groupby(lst, lambda x: x[0])}
>>> pprint(res)
{'2018-07-01': '9:00:00',
 '2018-07-02': '9:00:00',
 '2018-07-03': '9:00:00',
 '2018-07-04': '9:00:00'}
For Python 2, replace the res = line with these two lines:
>>> res = {date:list(times) for date,times in groupby(lst, lambda x: x[0])}
>>> res = {date:diff_time(times[-1][1], times[0][1]) for date,times in res.items()}
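One caveat: itertools.groupby only groups consecutive rows that share a key, so the approach above relies on the list already being ordered by date and time. A minimal sketch that sorts first, assuming ISO dates (which sort correctly as strings) and H:MM times; span_by_date is a made-up name and the shuffled rows are invented for the example:

```python
from datetime import datetime
from itertools import groupby

def span_by_date(records):
    """Sort rows, then take last - first time per date (hypothetical helper)."""
    def parse(clock):
        return datetime.strptime(clock, '%H:%M')

    # Sort by date, then by parsed time; comparing raw strings would
    # put '12:00' before '8:00'.
    ordered = sorted(records, key=lambda row: (row[0], parse(row[1])))
    result = {}
    for date, rows in groupby(ordered, key=lambda row: row[0]):
        rows = list(rows)
        result[date] = str(parse(rows[-1][1]) - parse(rows[0][1]))
    return result

shuffled = [
    ['2018-07-02', '17:00', 'OUT'],
    ['2018-07-01', '12:00', 'OUT'],
    ['2018-07-02', '8:00', 'IN'],
    ['2018-07-01', '8:00', 'IN'],
    ['2018-07-01', '17:30', 'OUT'],
]
print(span_by_date(shuffled))
```

Without the sort, the shuffled input would produce several separate groups for the same date, and later groups would overwrite earlier ones in the dict.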
Upvotes: 1
Reputation: 10306
I'm assuming that what you want is the difference between the latest time each day and the earliest time each day? If so, I think this solution in pandas should work: you just group by the day, take the first and last hours, and subtract them. (Note that the start and end times are always 8 and 17 in your data; it would be better to test this with data that actually has a variable answer.)
import pandas as pd
df = pd.DataFrame(
[['2018-07-01', '8:00', 'IN'],
['2018-07-01', '12:00', 'OUT'],
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'],
['2018-07-02', '8:00', 'IN'],
['2018-07-02', '12:00', 'OUT'],
['2018-07-02', '12:30', 'IN'],
['2018-07-02', '17:00', 'OUT'],
['2018-07-03', '8:00', 'IN'],
['2018-07-03', '12:00', 'OUT'],
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'],
['2018-07-04', '17:00', 'OUT']],
columns=['date', 'hour', 'in_out']
)
df = df.drop(columns=['in_out'])  # don't need this
df.hour = pd.to_datetime(df.hour, format='%H:%M')  # explicit format, no guessing
grouped_hours = df.groupby('date').hour
start_time = grouped_hours.min()  # earliest time of each day
end_time = grouped_hours.max()  # latest time of each day
end_time - start_time
Upvotes: 0
Reputation: 316
It seems that the start time always appears first and the end time always appears last for each day. This is what you could do (it has been a while since I programmed in Python, so treat it as a sketch of the general idea):
from datetime import datetime

# records is the question's list of [date, time, 'IN'/'OUT'] rows, ordered by date and time
i = 0
while i < len(records):
    day = records[i][0]
    start = datetime.strptime(records[i][1], '%H:%M')
    # advance i to the last row that has the same date
    while i + 1 < len(records) and records[i + 1][0] == day:
        i = i + 1
    end = datetime.strptime(records[i][1], '%H:%M')
    print(day, end - start)
    i = i + 1
I have not thought this through thoroughly, but I think it should work; otherwise someone will correct me :)
Upvotes: 0