Reputation: 526
Here I have a dataset with date ,time and one inputs. Here I want to read date and time together for specific values. Here I want to keep the length of csv file as same without changing.
Here 5 values contain time convert 00:00:00.
Here I used some code but it gave me with 0 days. First I convert that specific time into 00:00:00
data['date_time']= pd.to_datetime(data['date'] + " " + data['time'],
format='%d/%m/%Y %H:%M:%S', dayfirst=True)
data['duration'] = np.where(data['X3'].eq(5), np.timedelta64(0), pd.to_timedelta(data['date_time']))
print(data['duration'])
def f(x):
ts = x.total_seconds()
hours, remainder = divmod(ts, 3600)
minutes, seconds = divmod(remainder, 60)
return ('{:02d}:{:02d}:{:02d}').format(int(hours), int(minutes), int(seconds))
data['duration'] = data['duration'].apply(f)
output :
5 00:00:00
11 00:00:00
18 00:00:00
25 00:00:00
30 00:00:00
37 00:00:00
43 00:00:00
46 00:00:00
54 00:00:00
60 00:00:00
65 00:00:00
70 00:00:00
80 00:00:00
82 00:00:00
89 00:00:00
95 00:00:00
99 00:00:00
104 00:00:00
111 00:00:00
114 00:00:00
121 00:00:00
But what I expected output is:
datetime x3
10/3/2018 6:15:00 7
10/3/2018 00:00:00 5
10/3/2018 7:45:00 7
10/3/2018 9:00:00 7
10/3/2018 9:25:00 7
10/3/2018 00:00:00 5
10/3/2018 11:00:00 7
10/3/2018 11:30:00 7
10/3/2018 13:30:00 7
10/3/2018 00:00:00 5
10/3/2018 15:00:00 7
10/3/2018 15:25:00 7
10/3/2018 16:25:00 7
10/3/2018 00:00:00 5
10/3/2018 19:00:00 7
10/3/2018 19:30:00 7
Means replace time with 00:00:00
date time x3 T x3
10/3/2018 6:15:00 7 10/3/2018 6:15:00 7
10/3/2018 6:45:00 5 10/3/2018 0:00:00 5
10/3/2018 7:45:00 7 10/3/2018 7:45:00 7
10/3/2018 9:00:00 7 10/3/2018 9:00:00 7
10/3/2018 9:25:00 7 10/3/2018 9:25:00 7
10/3/2018 9:30:00 5 10/3/2018 0:00:00 5
Subset of my csv:
date time x3
10/3/2018 6:15:00 7
10/3/2018 6:45:00 5
10/3/2018 7:45:00 7
10/3/2018 9:00:00 7
10/3/2018 9:25:00 7
10/3/2018 9:30:00 5
10/3/2018 11:00:00 7
10/3/2018 11:30:00 7
10/3/2018 13:30:00 7
10/3/2018 13:50:00 5
10/3/2018 15:00:00 7
10/3/2018 15:25:00 7
10/3/2018 16:25:00 7
10/3/2018 18:00:00 5
10/3/2018 19:00:00 7
10/3/2018 19:30:00 7
10/3/2018 20:00:00 7
10/3/2018 22:05:00 7
10/3/2018 22:15:00 5
10/3/2018 23:40:00 7
10/4/2018 6:58:00 5
10/4/2018 13:00:00 7
10/4/2018 16:00:00 7
10/4/2018 17:00:00 7
10/4/2018 18:00:00 7
10/5/2018 7:00:00 7
10/5/2018 8:00:00 7
10/5/2018 9:00:00 7
Upvotes: 1
Views: 150
Reputation: 863741
You can remove times in date_time
column only for rows with data['x3'].eq(5)
with Series.mask
and Series.dt.floor
by days:
data['date_time']= pd.to_datetime(data['date'] + " " + data['time'],
format='%d/%m/%Y %H:%M:%S', dayfirst=True)
mask = data['x3'].eq(5)
data['date_time'] = data['date_time'].mask(mask, data['date_time'].dt.floor('d'))
print (data)
date time x3 date_time
0 10/3/2018 6:15:00 7 2018-03-10 06:15:00
1 10/3/2018 6:45:00 5 2018-03-10 00:00:00
2 10/3/2018 7:45:00 7 2018-03-10 07:45:00
3 10/3/2018 9:00:00 7 2018-03-10 09:00:00
4 10/3/2018 9:25:00 7 2018-03-10 09:25:00
5 10/3/2018 9:30:00 5 2018-03-10 00:00:00
6 10/3/2018 11:00:00 7 2018-03-10 11:00:00
7 10/3/2018 11:30:00 7 2018-03-10 11:30:00
8 10/3/2018 13:30:00 7 2018-03-10 13:30:00
9 10/3/2018 13:50:00 5 2018-03-10 00:00:00
10 10/3/2018 15:00:00 7 2018-03-10 15:00:00
11 10/3/2018 15:25:00 7 2018-03-10 15:25:00
12 10/3/2018 16:25:00 7 2018-03-10 16:25:00
13 10/3/2018 18:00:00 5 2018-03-10 00:00:00
14 10/3/2018 19:00:00 7 2018-03-10 19:00:00
15 10/3/2018 19:30:00 7 2018-03-10 19:30:00
16 10/3/2018 20:00:00 7 2018-03-10 20:00:00
17 10/3/2018 22:05:00 7 2018-03-10 22:05:00
18 10/3/2018 22:15:00 5 2018-03-10 00:00:00
19 10/3/2018 23:40:00 7 2018-03-10 23:40:00
20 10/4/2018 6:58:00 5 2018-04-10 00:00:00
21 10/4/2018 13:00:00 7 2018-04-10 13:00:00
22 10/4/2018 16:00:00 7 2018-04-10 16:00:00
23 10/4/2018 17:00:00 7 2018-04-10 17:00:00
24 10/4/2018 18:00:00 7 2018-04-10 18:00:00
25 10/5/2018 7:00:00 7 2018-05-10 07:00:00
26 10/5/2018 8:00:00 7 2018-05-10 08:00:00
27 10/5/2018 9:00:00 7 2018-05-10 09:00:00
Upvotes: 1
Reputation: 8033
Remove duration
from your code for T
. it should be
T = data.loc[data['duration'] == match_time, ['date','duration','x3']]
When you put duration
in there, what you are asking Pandas is to get the column duration
when data['duration'] == match_time
.
Upvotes: 0