Reputation: 971
I have two files. I want to get a list of ids for NEW orders that are in Master.txt but not in Subset.txt. Master.txt also contains existing orders (EXIST), which are not in Subset.txt, so it's not a 1:1 comparison of the files.
import json

foundCount = 0
notFoundCount = 0
notFoundDict = []
for i, logLine in enumerate(open(master, "r").readlines()):
    if len(logLine) > 1:
        if "NEW" in logLine:
            # each Master.txt line is a JSON object
            newItemDict = json.loads(logLine)
            orderId = str(newItemDict['id'])
            for j, subsetLogLine in enumerate(open(subset, "r").readlines()):
                if orderId in subsetLogLine and "NEW" in subsetLogLine:
                    foundCount += 1
                    break
                else:
                    notFoundCount += 1
                    notFoundDict.append(orderId)
Unfortunately, what happens is that it takes the id from a line in Master.txt and compares it against a single line in Subset.txt; every line that doesn't contain that id counts as a miss, so the id gets added to notFoundDict for each of those lines.
What I want is for it to search all of Subset.txt, append the id only if it is not found anywhere in the file, and break as soon as it is found.
Master.txt
{"Type":"NEW","id":201753427,"time":"08:11:57.545","title":"string"}
{"Type":"NEW","id":201753195,"time":"08:11:58.616","title":"string"}
{"Type":"EXIST","id":201753195,"time":"08:11:59.639","title":"string"}
{"Type":"UPDATE","id":201753195,"time":"08:13:57.319","title":"string"}
{"Type":"UPDATE","id":201753195,"time":"08:15:51.119","title":"string"}
{"Type":"NEW","id":201753199,"time":"08:19:13.114","title":"string"}
Subset.txt
{NEWORDID="201753427" ORDTYPE="NEW" ORIGIN="LocationA" USERNAME="..." TIME="08:11:57.645"}
{NEWORDID="201753195" ORDTYPE="NEW" ORIGIN="LocationC" USERNAME="..." TIME="08:11:57.619"}
{NEWORDID="201753199" ORDTYPE="NEW" ORIGIN="LocationC" USERNAME="..." TIME="08:19:13.114"}
Upvotes: 1
Views: 159
Reputation: 475
Have you considered a different approach?
Load all NEW order ids from file 1 (Master.txt) into a set.
Load all NEW order ids from file 2 (Subset.txt) into a set.
Then find all the ids in the file 1 set that aren't in the file 2 set.
Seems like a simpler way to tackle your problem unless the files are unusually large.
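A minimal sketch of that idea, assuming the line formats shown in the question (one JSON object per line in Master.txt, and NEWORDID="..." ORDTYPE="NEW" pairs in Subset.txt). The function names and the regular expression are illustrative, not taken from the original code:

```python
import json
import re

# Assumed pattern for Subset.txt lines, based on the sample in the question.
SUBSET_NEW = re.compile(r'NEWORDID="(\d+)"\s+ORDTYPE="NEW"')


def master_new_ids(lines):
    """Collect the ids of Type == "NEW" records from JSON-per-line input."""
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("Type") == "NEW":
            ids.add(str(record["id"]))
    return ids


def subset_new_ids(lines):
    """Collect NEWORDID values from lines whose ORDTYPE is NEW."""
    ids = set()
    for line in lines:
        match = SUBSET_NEW.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def missing_new_ids(master_lines, subset_lines):
    """Ids that are NEW in the master input but absent from the subset input."""
    return master_new_ids(master_lines) - subset_new_ids(subset_lines)
```

Both helpers accept any iterable of lines, so open file objects can be passed straight in, e.g. missing_new_ids(open("Master.txt"), open("Subset.txt")). Each file is read once, which avoids re-reading Subset.txt for every line of Master.txt.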
Upvotes: 1