user2532296

Reputation: 848

Regex for extracting fields

I am trying to write a regex that pulls test names and their results out of a log dump.

This is my dump:

Ack_ONE............................FAILED
[58] 0
[59] 0
[5A] 0
[5B] 0
dropball.....................................PASSED
nfrock_port@0x44A40000: Error: TX 0x00A9EFB6    
MAKEPIE.....................................FAILED

I am trying to extract the following using re.search, so that the test names (Ack_ONE, dropball, MAKEPIE) end up in match.groups()[0] and the results (FAILED, PASSED, FAILED) in match.groups()[1]:

Ack_ONE FAILED
dropball PASSED
MAKEPIE FAILED

I am using the following call:

match = re.search( r'\s*([a-zA-Z_0-9]+)............................(.*?)\n', line)

How can I skip over the dots in the regex, and how can the pattern above be improved?

Upvotes: 1

Views: 323

Answers (3)

alec_djinn

Reputation: 10789

This is my solution to the problem.

import re

text = '''
Ack_ONE............................FAILED
[58] 0
[59] 0
[5A] 0
[5B] 0
dropball.....................................PASSED
nfrock_port@0x44A40000: Error: TX 0x00A9EFB6    
MAKEPIE.....................................FAILED
'''

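# Look only at the lines that contain a run of dots
for item in text.split('\n'):
    if '...' in item:
        # [^.]+ grabs each run of non-dot characters, so the dots are dropped
        print(re.findall(r'[^.]+', item))

It prints out:

['Ack_ONE', 'FAILED']
['dropball', 'PASSED']
['MAKEPIE', 'FAILED']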

Upvotes: 0

vks

Reputation: 67968

\s*([a-zA-Z_0-9]+)\.{5,}(.*?)(?:\n|$)

You can use this with re.findall to get your results. See the demo:

https://regex101.com/r/nS2lT4/35

import re
p = re.compile(r'\s*([a-zA-Z_0-9]+)\.{5,}(.*?)(?:\n|$)', re.MULTILINE)
test_str = "Ack_ONE............................FAILED\n[58] 0\n[59] 0\n[5A] 0\n[5B] 0\ndropball.....................................PASSED\nnfrock_port@0x44A40000: Error: TX 0x00A9EFB6    \nMAKEPIE.....................................FAILED"

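# Each tuple is (test name, result):
# [('Ack_ONE', 'FAILED'), ('dropball', 'PASSED'), ('MAKEPIE', 'FAILED')]
print(p.findall(test_str))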

Upvotes: 0

Maroun

Reputation: 95958

Note that an unescaped dot means "any character" (except a newline), so your regex also matches, for example:

dropball....r34....(...dfsd.....6.....tyu....PASSED

You should escape the . if you want to match the literal dot.
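To make the difference concrete, here is a small sketch. The variable names are my own, the noisy line is the example above with a newline appended, and .{28} stands in for the 28 unescaped dots typed out in the question.

```python
import re

# The made-up noisy line from above, with the trailing newline the pattern expects
noisy = "dropball....r34....(...dfsd.....6.....tyu....PASSED\n"

# Unescaped: .{28} accepts any 28 characters, so the noisy line still matches
loose = re.search(r'\s*([a-zA-Z_0-9]+).{28}(.*?)\n', noisy)

# Escaped: \.{28} demands 28 literal dots in a row, which this line never has
strict = re.search(r'\s*([a-zA-Z_0-9]+)\.{28}(.*?)\n', noisy)

print(loose is not None)  # the unescaped pattern matches
print(strict is None)     # the escaped pattern rejects the line
```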

So you can have:

match = re.search( r'\s*(\w+)\.{28}(.*?)\n', line)

  • \w matches any word character (letters, digits and the underscore), so it can stand in for [a-zA-Z_0-9]

  • \.{28} matches exactly 28 dots. You can widen it to {x,y} to match between x and y dots, or, if you don't care how many dots appear, simply use \.+ (one or more dots). If the dots may be missing altogether, use \.* (zero or more).
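
Putting this together for the dump in the question, a minimal sketch could look like the following. It is not the only way to do it, and it makes two assumptions of my own: the result is a single word (so I use (\w+) rather than (.*?)), and each line is matched up to its end with $ instead of requiring a trailing \n, so the last line is picked up even without a newline.

```python
import re

dump = """Ack_ONE............................FAILED
[58] 0
[59] 0
[5A] 0
[5B] 0
dropball.....................................PASSED
nfrock_port@0x44A40000: Error: TX 0x00A9EFB6    
MAKEPIE.....................................FAILED"""

# name, then one or more literal dots, then the result, up to the end of the line
pattern = re.compile(r'\s*(\w+)\.+(\w+)\s*$')

results = []
for line in dump.splitlines():
    match = pattern.match(line)
    if match:  # lines without the name....RESULT shape are skipped
        results.append(match.groups())

print(results)
# [('Ack_ONE', 'FAILED'), ('dropball', 'PASSED'), ('MAKEPIE', 'FAILED')]
```

match.groups()[0] is then the test name and match.groups()[1] the result, as asked in the question.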

Upvotes: 1
