Reputation: 502
I have two strings:
machine1 665600MB 512512MB 19%
machine2 53248MB 41000MB 20%
machine3 625600MB 522512MB 22%
and:
machine1 665600MB 512512MB 21%
machine2 53248MB 41000MB 22%
machine3 625600MB 522512MB 21%
machine5 53248MB 41000MB 23%
I would like to compare the two, but only for the machines that appear in both strings (machine1, machine2 and machine3), ignoring machine5. This must work in both directions: if a machine exists in one string but not in the other, it must be ignored.
To compare both strings I use this:
avoid = {x.rstrip() for x in string2.splitlines()}
result = str("\n".join(x for x in string1.splitlines() if x.rstrip() not in avoid))
But this only finds the differences in one direction: it returns the lines of string1 that are missing from string2, and not the other way around.
Upvotes: 0
Views: 50
Reputation: 8790
My thought is to use a regex to identify the machines in each string and then take their intersection:
import re
string1 = '''machine1 665600MB 512512MB 19%
machine2 53248MB 41000MB 20%
machine3 625600MB 522512MB 22%'''
string2 = '''machine1 665600MB 512512MB 21%
machine2 53248MB 41000MB 22%
machine3 625600MB 522512MB 21%
machine5 53248MB 41000MB 23%'''
pat = r'machine\d+'  # raw string, so \d is not treated as a string escape
machines1 = re.findall(pat, string1)
machines2 = re.findall(pat, string2)
intersect = set(machines1) & set(machines2)
# {'machine1', 'machine2', 'machine3'}
Then subset based on that intersection, using the same split-and-join that you did above:
newstring1 = '\n'.join(line for line in string1.splitlines()
                       if re.search(pat, line).group() in intersect)
newstring2 = '\n'.join(line for line in string2.splitlines()
                       if re.search(pat, line).group() in intersect)
The result is these two new strings:
>>> print(newstring1)
machine1 665600MB 512512MB 19%
machine2 53248MB 41000MB 20%
machine3 625600MB 522512MB 22%
>>> print(newstring2)
machine1 665600MB 512512MB 21%
machine2 53248MB 41000MB 22%
machine3 625600MB 522512MB 21%
How you want to "compare" them is a little vague, but the two new strings should only contain records for the same machines.
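If "compare" means reporting, for each shared machine, the pair of lines that differ, one possible sketch is to key every line by its machine name and compare only the names present in both strings. This assumes the machine name is always the first whitespace-separated field on a line. The helper names `parse` and `diff_common` are made up for this example and are not part of the question.

```python
def parse(text):
    """Map each machine name (first field on the line) to its stripped line."""
    records = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            records[fields[0]] = line.rstrip()
    return records


def diff_common(text1, text2):
    """Return {machine: (line1, line2)} for machines in both texts whose lines differ."""
    rec1, rec2 = parse(text1), parse(text2)
    common = rec1.keys() & rec2.keys()  # machines present on both sides
    return {m: (rec1[m], rec2[m]) for m in sorted(common) if rec1[m] != rec2[m]}


string1 = '''machine1 665600MB 512512MB 19%
machine2 53248MB 41000MB 20%
machine3 625600MB 522512MB 22%'''
string2 = '''machine1 665600MB 512512MB 21%
machine2 53248MB 41000MB 22%
machine3 625600MB 522512MB 21%
machine5 53248MB 41000MB 23%'''

for machine, (line1, line2) in diff_common(string1, string2).items():
    print(line1, '->', line2)
```

Because the lookup is by machine name rather than by whole line, machine5 is dropped regardless of which string it appears in, so the comparison is symmetric.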
Upvotes: 1