aldegalan

Reputation: 502

Take elements that are in both strings, then compare

I have two strings:

machine1 19968MB 15375MB 23%
machine2 79872MB 61501MB 23%
machine3 798720MB 615014MB 23%

machine1 9968MB 15375MB 13%
machine2 19872MB 61501MB 33%
machine4 798720MB 615014MB 23%

I would like to compare all the machines that are present in both strings. To do that, I am doing this:

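import re

# Names look like machine1, machine2, ... (no underscore in the sample data)
pat = r'machine\S+'
machines1 = re.findall(pat, string1)
machines2 = re.findall(pat, string2)
# Names that appear in both strings
intersect = set(machines1) & set(machines2)
# Keep only the lines whose machine name is in the intersection
newstring1 = '\n'.join(line for line in string1.splitlines() if
                       re.search(pat, line).group() in intersect)
newstring2 = '\n'.join(line for line in string2.splitlines() if
                       re.search(pat, line).group() in intersect)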

newstring1 should look like this:

machine1 19968MB 15375MB 23%                    
machine2 79872MB 61501MB 23% 

And newstring2 like this:

machine1 9968MB 15375MB 13%                    
machine2 19872MB 61501MB 33% 

The problem is that the machine names sometimes use a different format, so a fixed regex would not do the trick.

Examples of that other format (it could be any format, so I think a regex is not a solution):

test_volume1 19968MB 15375MB 23% 
testing_nfs 19968MB 15375MB 23% 

Is there any way to do this, but not using regex?

Upvotes: 0

Views: 44

Answers (2)

Valentin C

Reputation: 184

What you could do is get the first word of each line:

machines1 = [line.split()[0] for line in string1.splitlines()]
machines2 = [line.split()[0] for line in string2.splitlines()]

If the words are space-separated this should do the trick; otherwise, you can specify the separator in .split().
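As a rough sketch of the separator point, the helper below (first_field is a hypothetical name, not something from the question) pulls the first field out of a line. It assumes the name is always the first field, and it returns None for blank lines so that line.split()[0] cannot raise an IndexError.

```python
def first_field(line, sep=None):
    """Return the first field of a line, or None if the line is blank.

    sep=None means "split on any run of whitespace" (the str.split default);
    pass e.g. sep=";" for other delimiters. Hypothetical helper for illustration.
    """
    # Strip first so trailing padding or a whitespace-only line is handled
    fields = line.strip().split(sep)
    if not fields or not fields[0]:
        return None
    return fields[0]


print(first_field("machine1 19968MB 15375MB 23%      "))
print(first_field("test_volume1;19968MB;15375MB;23%", sep=";"))
```

With the default separator, tabs and repeated spaces are treated the same way, which is usually what you want for column-aligned output like this.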

Upvotes: 1

Romain P.

Reputation: 119

If your machine names are always at the beginning of the line, you can use line.split(" ")[0] to get the machine name.

machines1 = [line.split(" ")[0] for line in string1.splitlines()]
machines2 = [line.split(" ")[0] for line in string2.splitlines()]
intersect = set(machines1) & set(machines2)
newstring1 = '\n'.join(line for line in string1.splitlines() if
                       line.split(" ")[0] in intersect)
newstring2 = '\n'.join(line for line in string2.splitlines() if
                       line.split(" ")[0] in intersect)
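For reference, here is a self-contained sketch of the same idea, run on the data from the question. The function name keep_common is made up for this example, and the extra testing_nfs line in both strings is an assumption added to show that the name format does not matter. It uses split() without an argument so that blank or padded lines are tolerated.

```python
def keep_common(string1, string2):
    """Return both strings reduced to lines whose first field occurs in both.

    Hypothetical helper; assumes the name is the first whitespace-separated field.
    """
    def name(line):
        fields = line.split()
        return fields[0] if fields else None  # None for blank lines

    names1 = {name(line) for line in string1.splitlines()} - {None}
    names2 = {name(line) for line in string2.splitlines()} - {None}
    common = names1 & names2

    def keep(text):
        # Blank lines have name None, which is never in common, so they drop out
        return "\n".join(line for line in text.splitlines() if name(line) in common)

    return keep(string1), keep(string2)


# Data from the question, plus one assumed testing_nfs line in each string
string1 = """machine1 19968MB 15375MB 23%
machine2 79872MB 61501MB 23%
machine3 798720MB 615014MB 23%
testing_nfs 19968MB 15375MB 23%"""

string2 = """machine1 9968MB 15375MB 13%
machine2 19872MB 61501MB 33%
machine4 798720MB 615014MB 23%
testing_nfs 19968MB 15375MB 23%"""

newstring1, newstring2 = keep_common(string1, string2)
print(newstring1)
print(newstring2)
```

machine3 and machine4 are dropped because each appears in only one string; everything else is kept in its original order.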

Upvotes: 1
