Reputation: 43
I want to use Python (although any language is fine) to look through a structured text file that looks like this:
========= Weekend of 2016-12-02: ================
Schedule1:
bob@email
Schedule2:
john@email
bob@email
Schedule3:
Terry@email
========= Weekend of 2016-12-09: ================
Schedule1:
jake@email
Schedule2:
mike@email
bob@email
Schedule3:
howard@email
This pattern repeats for the remainder of the year. What I am trying to accomplish is to find any overlapping schedules: if bob@email is on more than one schedule for a given weekend, I would like to find and print that. Example:
Overlaps found for:
========= Weekend of 2016-12-02: ================
bob@email is scheduled for schedule1, and schedule2.
Since this is the only overlap, it is the only occurrence that would print. If there were more, they would print in the same format, one beneath the other. Is there any way to accomplish this?
The code I've found so far lets me find each weekend and print it; however, I'm not sure how to look at the contents in more detail.
import re
def compare():
    with open("weekends.txt","r") as fp:
        for result in re.findall('Weekend of (.*?):', fp.read(), re.S):
            print(result)
This yields
2016-12-02
2016-12-09
Thank you, and please let me know if there are any questions.
Upvotes: 2
Views: 103
Reputation: 103864
You can do something like this with a regex creating a dict of sets:
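import re
from collections import Counter

fn = "weekends.txt"  # file name taken from the question
data={}
with open(fn) as f_in:
    txt=f_in.read()
    # One match per weekend: group(1) is the header, group(2) the block body
    for block in re.finditer(r'^=+\s+([^:]+:)\s=+\s+([^=]+)', txt, re.M):
        di={}
        # One match per schedule: group(1) is the name, group(2) its emails
        for sc in re.finditer(r'^(Schedule\s*\d+):\s*([\s\S]+?)(?=(?:^Schedule\s*\d+)|\Z)', block.group(2), re.M):
            di[sc.group(1)]=set(sc.group(2).splitlines())
        data[block.group(1)]=di
for date, DofS in data.items():
    # Count how many schedules each email appears on this weekend
    c=Counter()
    for s in DofS.values():
        c+=Counter(s)
    inverted={k:[] for k, v in c.items() if v>1}
    if not inverted:
        continue
    print(date)
    # Collect the schedule names for each overlapping email
    for k in DofS:
        for e in DofS[k]:
            if e in inverted:
                inverted[e].append(k)
    print("\t",inverted)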
Prints:
Weekend of 2016-12-02:
{'bob@email': ['Schedule1', 'Schedule2']}
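If you want that result in the sentence layout from the question rather than as a dict, a small formatting step can be added. This is only a sketch: `format_overlaps` is a hypothetical helper name, and it assumes each email maps to a list of at least two schedule names, as in the output above.

```python
def format_overlaps(date, inverted):
    """Render one weekend's overlaps in the layout shown in the question.

    Hypothetical helper: `date` is the weekend header text and `inverted`
    maps each email to the list of schedule names it appears on (2 or more).
    """
    lines = ["========= " + date + " ================"]
    for email, schedules in inverted.items():
        names = [s.lower() for s in schedules]
        # Join as "a, and b" or "a, b, and c", following the question's example
        listing = ", ".join(names[:-1]) + ", and " + names[-1]
        lines.append(email + " is scheduled for " + listing + ".")
    return "\n".join(lines)


print("Overlaps found for:")
print(format_overlaps("Weekend of 2016-12-02:",
                      {"bob@email": ["Schedule1", "Schedule2"]}))
# Overlaps found for:
# ========= Weekend of 2016-12-02: ================
# bob@email is scheduled for schedule1, and schedule2.
```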
Upvotes: 1
Reputation: 150
I think you can use a map that stores <name, list of schedules> pairs, such as <bob@email, [Schedule1]>, as you go through each weekend. Every time you want to add a new item, check whether the key has already been set. If it has, add the schedule to the corresponding list. If not, add a new item to the map. Then, when you print, only print the items with more than one schedule in their list.
For Python, you can use a dictionary as the map: https://www.tutorialspoint.com/python/python_dictionary.htm
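A rough sketch of that idea, reading the file line by line with no regex. The function name `find_overlaps` is made up for illustration, and the code assumes the layout from the question: header lines start with "=", schedule lines end with ":", and every other non-blank line is an email.

```python
from collections import defaultdict


def find_overlaps(lines):
    """Return {weekend: {email: [schedules]}} for emails on 2+ schedules."""
    weekends = {}     # weekend header -> {email: [schedule, ...]}
    current = None    # map for the weekend being read
    schedule = None   # name of the schedule being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("="):
            # Header line: strip the "=" padding to get "Weekend of ...:"
            current = weekends.setdefault(line.strip("= "), defaultdict(list))
            schedule = None
        elif line.endswith(":"):
            schedule = line[:-1]
        elif current is not None and schedule is not None:
            # Email line: record the schedule under this email's key
            if schedule not in current[line]:
                current[line].append(schedule)
    # Keep only emails that appear on more than one schedule
    return {
        weekend: {e: s for e, s in people.items() if len(s) > 1}
        for weekend, people in weekends.items()
        if any(len(s) > 1 for s in people.values())
    }


sample = """\
========= Weekend of 2016-12-02: ================
Schedule1:
bob@email
Schedule2:
john@email
bob@email
Schedule3:
Terry@email
========= Weekend of 2016-12-09: ================
Schedule1:
jake@email
Schedule2:
mike@email
bob@email
Schedule3:
howard@email
"""

print(find_overlaps(sample.splitlines()))
# {'Weekend of 2016-12-02:': {'bob@email': ['Schedule1', 'Schedule2']}}
```

To run it against the real file, you could pass the open file object instead, e.g. `find_overlaps(open("weekends.txt"))`, since iterating a file yields its lines.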
Upvotes: 0