Reputation: 369
I found two references that appear relevant to the problem described below:
http://freshfoo.com/posts/itertools_groupby/
Group together arbitrary date objects that are within a time range of each other
I have a list of signals, sorted in ascending order by date, similar to the sample in the code below. I adapted the example from the first reference above to group by exact date, but I have not been able to group by a date range:
# Python 3.5.2
from itertools import groupby
from operator import itemgetter
signals = [('12-16-1987', 'Q'),
('12-16-1987', 'Y'),
('12-16-1987', 'P'),
('12-17-1987', 'W'),
('11-06-1990', 'Q'),
('11-12-1990', 'W'),
('11-12-1990', 'Y'),
('11-12-1990', 'P'),
('06-03-1994', 'Q'),
('11-20-1997', 'P'),
('11-21-1997', 'W'),
('11-21-1997', 'Q')]
for key, items in groupby(signals, itemgetter(0)):
    print (key)
    for subitem in items:
        print (subitem)
    print ('-' * 20)
Output:
12-16-1987
('12-16-1987', 'Q')
('12-16-1987', 'Y')
('12-16-1987', 'P')
--------------------
12-17-1987
('12-17-1987', 'W')
--------------------
11-06-1990
('11-06-1990', 'Q')
--------------------
11-12-1990
('11-12-1990', 'W')
('11-12-1990', 'Y')
('11-12-1990', 'P')
--------------------
06-03-1994
('06-03-1994', 'Q')
--------------------
11-20-1997
('11-20-1997', 'P')
--------------------
11-21-1997
('11-21-1997', 'W')
('11-21-1997', 'Q')
--------------------
I would like to group dates by their proximity to each other, within a window of two, three, or four weeks (I have not yet decided which window to apply). The sample data would then print as follows.
Desired output:
Group 0
('12-16-1987', 'Q')
('12-16-1987', 'Y')
('12-16-1987', 'P')
('12-17-1987', 'W')
--------------------
Group 1
('11-06-1990', 'Q')
('11-12-1990', 'W')
('11-12-1990', 'Y')
('11-12-1990', 'P')
--------------------
Group 2
('06-03-1994', 'Q')
--------------------
Group 3
('11-20-1997', 'P')
('11-21-1997', 'W')
('11-21-1997', 'Q')
--------------------
At this point I am not sure how to produce output grouped by date proximity.
Upvotes: 1
Views: 511
Reputation: 369
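I solved my own problem with the following code, which starts a new group (printed after a blank line) whenever the gap between consecutive dates exceeds 21 days: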
# Python 3.5.2
from datetime import datetime
signals = [('12-16-1987', 'Q'),
('12-16-1987', 'Y'),
('12-16-1987', 'P'),
('12-17-1987', 'W'),
('11-06-1990', 'Q'),
('11-12-1990', 'W'),
('11-12-1990', 'Y'),
('11-12-1990', 'P'),
('06-03-1994', 'Q'),
('11-20-1997', 'P'),
('11-21-1997', 'W'),
('11-21-1997', 'Q')]
print ()
print ('Signals')
for i, (date, name) in enumerate(signals):
    if i == 0:
        print ()
        print ('{:>3}'.format(i), ' ', date, ' ', name)
        prior_date = date
    elif i > 0:
        d1 = datetime.strptime(prior_date, '%m-%d-%Y')
        d2 = datetime.strptime(date, '%m-%d-%Y')
        days = abs((d2 - d1).days)
        if days > 21:
            print ()
            print ('{:>3}'.format(i), ' ', date, ' ', name)
        elif days <= 21:
            print ('{:>3}'.format(i), ' ', date, ' ', name)
        prior_date = date
Output:
Signals

  0   12-16-1987   Q
  1   12-16-1987   Y
  2   12-16-1987   P
  3   12-17-1987   W

  4   11-06-1990   Q
  5   11-12-1990   W
  6   11-12-1990   Y
  7   11-12-1990   P

  8   06-03-1994   Q

  9   11-20-1997   P
 10   11-21-1997   W
 11   11-21-1997   Q
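The code above only prints the groups. If the groups are needed as data, the same gap test could be wrapped in a function that returns a list of lists. This is a minimal sketch rather than a definitive implementation: `group_by_gap`, `max_gap_days` and `fmt` are names made up for illustration, and it assumes the input is already sorted in ascending order by date, as in the sample.

```python
from datetime import datetime


def group_by_gap(signals, max_gap_days=21, fmt='%m-%d-%Y'):
    """Split date-sorted (date_string, name) tuples into groups.

    A new group starts whenever the gap between a date and the
    previous date is larger than max_gap_days.
    Assumes signals is sorted in ascending order by date.
    """
    groups = []
    prior = None
    for date, name in signals:
        current = datetime.strptime(date, fmt)
        # Open a new group for the first item or after a large gap.
        if prior is None or (current - prior).days > max_gap_days:
            groups.append([])
        groups[-1].append((date, name))
        prior = current
    return groups


signals = [('12-16-1987', 'Q'),
           ('12-16-1987', 'Y'),
           ('12-16-1987', 'P'),
           ('12-17-1987', 'W'),
           ('11-06-1990', 'Q'),
           ('11-12-1990', 'W'),
           ('11-12-1990', 'Y'),
           ('11-12-1990', 'P'),
           ('06-03-1994', 'Q'),
           ('11-20-1997', 'P'),
           ('11-21-1997', 'W'),
           ('11-21-1997', 'Q')]

groups = group_by_gap(signals)

# Print in the "Group N" layout from the question.
for n, group in enumerate(groups):
    print('Group', n)
    for item in group:
        print(item)
    print('-' * 20)
```

Passing `max_gap_days=14` or `max_gap_days=28` would try the two-week and four-week windows mentioned in the question. Note that this compares each date with the one immediately before it, so a long chain of closely spaced dates ends up in a single group even if its first and last dates are far apart.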
Upvotes: 1