Reputation: 141
I am struggling to find a regular expression that matches both of the formats below:
cmd1 = 'cmd:ifconfig:"PASS":"Fail":4:2'
cmd2 = 'cmd:ifconfig:"PASS"'
Below is my sample Python code:
import re
cmd_reg = r'cmd:(.*):\"(.*?)\"$'
result = re.findall(cmd_reg, cmd2)
print(result)  # output -> [('ifconfig', 'PASS')]  Expectation: [('ifconfig', 'PASS', '', '', '')]
result = re.findall(cmd_reg, cmd1)
print(result)  # output -> []  Expectation: [('ifconfig', 'PASS', 'Fail', 4, 2)]
But I couldn't figure out a regular expression that gives the output shown under Expectation.
Upvotes: 1
Views: 76
Reputation: 2553
I would suggest the following pattern:
:(\w*):"?(\w*)"?:?"?(\w*)"?:?"?(\w*)"?:?"?(\w*)"?
You can try the above pattern interactively at the following website:
https://regex101.com/r/M9bf6m/2
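To check the pattern locally rather than on the website, here is a minimal sketch that runs it against the two strings from the question (written with single outer quotes so they are valid Python). The variable names are mine; note that the numeric fields come back as strings, and that \w* will not match values containing characters such as '-' or '.'.

```python
import re

# Pattern copied from the answer above.
pattern = r':(\w*):"?(\w*)"?:?"?(\w*)"?:?"?(\w*)"?:?"?(\w*)"?'

cmd1 = 'cmd:ifconfig:"PASS":"Fail":4:2'
cmd2 = 'cmd:ifconfig:"PASS"'

result1 = re.findall(pattern, cmd1)
result2 = re.findall(pattern, cmd2)

print(result1)  # [('ifconfig', 'PASS', 'Fail', '4', '2')]
print(result2)  # [('ifconfig', 'PASS', '', '', '')]
```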
Upvotes: 0
Reputation: 545528
Python’s re module can’t capture multiple occurrences of a given group (a repeated group keeps only its last match), so this will fundamentally not work with a single regular expression. Some other regex implementations do support this, by distinguishing between a match and a capture.
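To illustrate the limitation, here is a small sketch with a hypothetical pattern (not one from the question) that puts the value group inside a repeated non-capturing group. The whole string matches, but only the last repetition is kept:

```python
import re

cmd1 = 'cmd:ifconfig:"PASS":"Fail":4:2'

# Hypothetical pattern for illustration: group 2 is repeated once per value.
repeated = r'^cmd:(\w+)(?::"?(\w*)"?)*$'

groups = re.match(repeated, cmd1).groups()
print(groups)  # ('ifconfig', '2') -- 'PASS', 'Fail' and '4' are lost
```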
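I believe your best bet is to split this into two steps: first match the command and the remainder, then extract the individual groups from the remainder: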
cmd_pattern = r'^cmd:([^:]+):(.*)$'
group_pattern = r'"?([^:"]+)"?'  # or, simpler, r'[^:]+' to retain the quotes
cmd, groups = re.match(cmd_pattern, cmd1).groups()
parsed_groups = re.findall(group_pattern, groups)
For cmd2, parsed_groups will be ['PASS'], which I think makes more general sense than your desired result. If you need to fill the list with empty elements, you need to do this manually.
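If you do want the padded result, the manual step could look something like the sketch below. The function name parse_cmd and the n_values parameter are my own, and I am assuming you always want four value slots filled with empty strings. One caveat: because the group pattern requires at least one character, an empty quoted value such as "" is skipped rather than kept in position.

```python
import re

CMD_PATTERN = r'^cmd:([^:]+):(.*)$'
GROUP_PATTERN = r'"?([^:"]+)"?'


def parse_cmd(cmd, n_values=4):
    """Return (command, value1, ..., valueN), padding missing values with ''."""
    match = re.match(CMD_PATTERN, cmd)
    if match is None:
        raise ValueError(f'not a cmd string: {cmd!r}')
    name, rest = match.groups()
    values = re.findall(GROUP_PATTERN, rest)
    # Pad with empty strings (no-op if there are already enough values).
    values += [''] * (n_values - len(values))
    return (name, *values[:n_values])


print(parse_cmd('cmd:ifconfig:"PASS":"Fail":4:2'))  # ('ifconfig', 'PASS', 'Fail', '4', '2')
print(parse_cmd('cmd:ifconfig:"PASS"'))             # ('ifconfig', 'PASS', '', '', '')
```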
As an alternative, you could hard-code the four groups, and make them optional:
cmd_pattern = r'^cmd:([^:]+):([^:]+)(?::([^:]+))?(?::([^:]+))?(?::([^:]+))?'
re.match(cmd_pattern, cmd1).groups()
# ('ifconfig', '"PASS"', '"Fail"', '4', '2')
re.match(cmd_pattern, cmd2).groups()
# ('ifconfig', '"PASS"', None, None, None)
… I don’t recommend this. And this complex expression doesn’t even handle optional quotes yet, which would make it even more complex.
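For completeness, here is roughly what the hard-coded variant might look like with optional quotes added. This is only a sketch to show how quickly it grows; the helper names VALUE and OPTIONAL_VALUE are mine, and I build the pattern from pieces to keep it readable.

```python
import re

# One value whose surrounding quotes are optional.
VALUE = r'"?([^:"]*)"?'
# The same value preceded by a colon, with the whole thing optional.
OPTIONAL_VALUE = rf'(?::{VALUE})?'

# Command name, one mandatory value, then up to three optional values.
cmd_pattern = rf'^cmd:([^:]+):{VALUE}' + OPTIONAL_VALUE * 3 + '$'

groups1 = re.match(cmd_pattern, 'cmd:ifconfig:"PASS":"Fail":4:2').groups()
groups2 = re.match(cmd_pattern, 'cmd:ifconfig:"PASS"').groups()

print(groups1)  # ('ifconfig', 'PASS', 'Fail', '4', '2')
print(groups2)  # ('ifconfig', 'PASS', None, None, None)
```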
Upvotes: 1
Reputation: 3
cmd1 = 'cmd:ifconfig:"PASS":"":4:'
cmd2 = 'cmd:ifconfig:"PASS"'
import re
cmd_reg = r'cmd:(.*):\"(.*)(:\"\":(\d):)?$'
results = re.findall(cmd_reg, str([cmd1, cmd2]))
print(results)
Upvotes: 0
Reputation: 1268
In general there are many ways to implement this. If you give more examples, the regex can be fitted to the general case (rather than over-fitted to this example).
I took the same approach as you and tried to match any digit that sits between two : delimiters and comes after two " characters, with that whole extra part being optional.
Try this:
cmd1 = 'cmd:ifconfig:"PASS":"":4:'
cmd2 = 'cmd:ifconfig:"PASS"'
import re
cmd_reg = r'cmd:(.*):\"(.*)(:\"\":(\d):)?$'
result = re.findall(cmd_reg, cmd2)
print(result)
result = re.findall(cmd_reg, cmd1)
print(result)
output:
[('ifconfig', 'PASS"', '', '')]
[('ifconfig:"PASS"', '":4:', '', '')]
Upvotes: 0