SystemTheory

Reputation: 369

Python group dates within close range of each other

I found two references that appear relevant to the problem described below:

http://freshfoo.com/posts/itertools_groupby/

Group together arbitrary date objects that are within a time range of each other

I have a list of signals sorted in ascending order by date, similar to the sample in the code below. I adapted the example from the first reference above to group by exact date, but I have not managed to group by a date range:

# Python 3.5.2
from itertools import groupby
from operator import itemgetter

signals = [('12-16-1987', 'Q'),
           ('12-16-1987', 'Y'),
           ('12-16-1987', 'P'),
           ('12-17-1987', 'W'),
           ('11-06-1990', 'Q'),
           ('11-12-1990', 'W'),
           ('11-12-1990', 'Y'),
           ('11-12-1990', 'P'),
           ('06-03-1994', 'Q'),
           ('11-20-1997', 'P'),
           ('11-21-1997', 'W'),
           ('11-21-1997', 'Q')]

for key, items in groupby(signals, itemgetter(0)):
    print(key)
    for subitem in items:
        print(subitem)
    print('-' * 20)

Output:

12-16-1987
('12-16-1987', 'Q')
('12-16-1987', 'Y')
('12-16-1987', 'P')
--------------------
12-17-1987
('12-17-1987', 'W')
--------------------
11-06-1990
('11-06-1990', 'Q')
--------------------
11-12-1990
('11-12-1990', 'W')
('11-12-1990', 'Y')
('11-12-1990', 'P')
--------------------
06-03-1994
('06-03-1994', 'Q')
--------------------
11-20-1997
('11-20-1997', 'P')
--------------------
11-21-1997
('11-21-1997', 'W')
('11-21-1997', 'Q')
--------------------

I would like to group dates that fall within a window of two, three, or four weeks of each other (I have not yet decided which window to apply). The sample data would then print as follows.

Desired output:

Group 0
('12-16-1987', 'Q')
('12-16-1987', 'Y')
('12-16-1987', 'P')
('12-17-1987', 'W')
--------------------
Group 1
('11-06-1990', 'Q')
('11-12-1990', 'W')
('11-12-1990', 'Y')
('11-12-1990', 'P')
--------------------
Group 2
('06-03-1994', 'Q')
--------------------
Group 3
('11-20-1997', 'P')
('11-21-1997', 'W')
('11-21-1997', 'Q')
--------------------

At this point I am not sure how to produce output grouped by date proximity.

Upvotes: 1

Views: 511

Answers (1)

SystemTheory

Reputation: 369

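I solved my own problem with the following code, which prints a blank line to start a new group whenever a signal is more than 21 days away from the previous one: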

# Python 3.5.2
from datetime import datetime

signals = [('12-16-1987', 'Q'),
           ('12-16-1987', 'Y'),
           ('12-16-1987', 'P'),
           ('12-17-1987', 'W'),
           ('11-06-1990', 'Q'),
           ('11-12-1990', 'W'),
           ('11-12-1990', 'Y'),
           ('11-12-1990', 'P'),
           ('06-03-1994', 'Q'),
           ('11-20-1997', 'P'),
           ('11-21-1997', 'W'),
           ('11-21-1997', 'Q')]

print()
print('Signals')
prior_date = None
for i, (date, name) in enumerate(signals):
    current_date = datetime.strptime(date, '%m-%d-%Y')
    # A blank line marks the start of a group: before the first signal,
    # and whenever the gap from the previous signal exceeds 21 days.
    if prior_date is None or abs((current_date - prior_date).days) > 21:
        print()
    print('{:>3}'.format(i), ' ', date, ' ', name)
    prior_date = current_date

Output:

Signals

  0   12-16-1987   Q
  1   12-16-1987   Y
  2   12-16-1987   P
  3   12-17-1987   W

  4   11-06-1990   Q
  5   11-12-1990   W
  6   11-12-1990   Y
  7   11-12-1990   P

  8   06-03-1994   Q

  9   11-20-1997   P
 10   11-21-1997   W
 11   11-21-1997   Q
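
The code above only prints blank lines between groups. To get the "Group N" layout from the question, the same gap test can collect the signals into lists first. What follows is a minimal sketch rather than a definitive implementation: the function name group_by_proximity and its max_gap and fmt parameters are my own choices for illustration, and it assumes the input is already sorted in ascending order by date, as in the sample data.

```python
from datetime import datetime, timedelta


def group_by_proximity(signals, max_gap=timedelta(weeks=3), fmt='%m-%d-%Y'):
    """Split date-sorted (date_string, name) tuples into lists of nearby signals.

    A new group starts whenever the gap from the previous signal exceeds
    max_gap. Assumes signals are sorted in ascending order by date.
    """
    groups = []
    prior = None
    for date, name in signals:
        current = datetime.strptime(date, fmt)
        if prior is None or current - prior > max_gap:
            groups.append([])  # gap too large (or first signal): open a new group
        groups[-1].append((date, name))
        prior = current
    return groups


signals = [('12-16-1987', 'Q'),
           ('12-16-1987', 'Y'),
           ('12-16-1987', 'P'),
           ('12-17-1987', 'W'),
           ('11-06-1990', 'Q'),
           ('11-12-1990', 'W'),
           ('11-12-1990', 'Y'),
           ('11-12-1990', 'P'),
           ('06-03-1994', 'Q'),
           ('11-20-1997', 'P'),
           ('11-21-1997', 'W'),
           ('11-21-1997', 'Q')]

for n, group in enumerate(group_by_proximity(signals)):
    print('Group', n)
    for signal in group:
        print(signal)
    print('-' * 20)
```

Passing max_gap=timedelta(weeks=2) or timedelta(weeks=4) tries the other windows mentioned in the question. Note that each signal is compared with the one immediately before it, so a long chain of closely spaced signals can produce a group that spans more than max_gap in total. If the whole group must fit inside the window, compare against the first date of the current group instead.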

Upvotes: 1
