user3325980

Reputation: 15

Regex help to match groups

I am trying to write a regex to match a text file that has multiple lines such as:

* 964      0050.56aa.3480    dynamic   200        F    F  Veth1379
* 930      0025.b52a.dd7e    static    0          F    F  Veth1469

My intention is to match "0050.56aa.3480" and "Veth1379" and capture them as group(1) and group(2) for later use.

The regex I wrote is:

\*\s*\d{1,}\s*(\d{1,}\.(?:[a-z][a-z]*[0-9]+[a-z0-9]*)\.\d{1,})\s*(?:[a-z][a-z]+)\s*\d{1,}\s*.\s*.\s*((?:[a-z][a-z]*[0-9]+[a-z0-9]*))

But it does not seem to work when I test it at http://www.pythonregex.com/

Could someone point out any obvious error I am making here?

Thanks, ~Newbie

Upvotes: 1

Views: 81

Answers (4)

Pedro Lobito

Reputation: 98921

This will do it:

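import re

# subject holds the text to search, e.g. the contents of the file
reobj = re.compile(r"^.*?(\w{4}\.\w{4}\.\w{4}).*?(\w+)$", re.IGNORECASE | re.MULTILINE)
match = reobj.search(subject)  # search() returns the first matching line only
if match:
    group1 = match.group(1)  # e.g. 0050.56aa.3480
    group2 = match.group(2)  # e.g. Veth1379
else:
    group1 = group2 = None  # no line matched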

Upvotes: 0

Attila O.

Reputation: 16615

A very strict version would look something like this:

^\*\s+\d{3}\s+(\d{4}(?:\.[0-9a-f]{4}){2})\s+\w+\s+\d+\s+\w\s+\w\s+([0-9A-Za-z]+)$

Here I assume that:

  • the columns will be pretty much the same,
  • your first match group contains a group of decimal digits and two groups of lower-case hex digits,
  • and the last word can be anything.
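As a rough sketch of how the strict pattern above could be used from Python: the names PATTERN, SAMPLE and extract_pairs are made up for illustration, and the sample text is just the two rows from the question. re.MULTILINE is assumed so that ^ and $ anchor at each line rather than at the whole string.

```python
import re

# Pattern copied from the answer; re.MULTILINE makes ^ and $ match per line.
PATTERN = re.compile(
    r"^\*\s+\d{3}\s+(\d{4}(?:\.[0-9a-f]{4}){2})\s+\w+\s+\d+\s+\w\s+\w\s+([0-9A-Za-z]+)$",
    re.MULTILINE,
)

# Sample rows taken from the question.
SAMPLE = (
    "* 964      0050.56aa.3480    dynamic   200        F    F  Veth1379\n"
    "* 930      0025.b52a.dd7e    static    0          F    F  Veth1469\n"
)


def extract_pairs(text):
    """Return a list of (mac, interface) tuples, one per matching line."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


for mac, interface in extract_pairs(SAMPLE):
    print(mac, interface)
```

finditer is used here instead of search because search stops at the first matching line, while the file in the question has many.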

A few notes:

  • \d+ is equivalent to \d{1,} or [0-9]{1,}, but reads better (imo)
  • use \. to match a literal ., as an unescaped . matches any character (except a newline, by default)
  • [a-z]{2} is equivalent to [a-z][a-z], but reads better (my opinion, again)
  • however, you might want to use \w instead, which matches any word character (a letter, a digit, or an underscore)
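The notes above can be checked quickly in an interpreter. This is only a small sketch, assuming Python 3.4+ for re.fullmatch; the helper name accepts is hypothetical.

```python
import re


def accepts(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# \d+ and \d{1,} accept the same strings.
print(accepts(r"\d+", "964"), accepts(r"\d{1,}", "964"))

# An unescaped . also accepts characters other than a literal dot.
print(accepts(r"\d{4}.\d{4}", "0050x3480"))   # loose: . matches the "x"
print(accepts(r"\d{4}\.\d{4}", "0050x3480"))  # strict: \. needs a real dot

# [a-z]{2} needs two lower-case letters; \w{2} also allows digits and underscores.
print(accepts(r"[a-z]{2}", "5a"), accepts(r"\w{2}", "5a"))
```

The last line is relevant to the question's regex, which expects a letter at the start of the middle block, whereas the sample value "56aa" starts with a digit.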

Upvotes: 0

Casimir et Hippolyte

Reputation: 89557

I don't think you need a regex for this:

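with open('myfile', 'r') as f:
    for line in f:
        fields = line.split()
        # fields[0] is the leading '*', so the MAC is fields[2] and the interface is fields[7]
        if len(fields) >= 8:
            print("\n" + fields[2] + "\n" + fields[7])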
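If the file may also contain headers or blank lines, a slightly more defensive variant of the split approach might look like the sketch below. The function name parse_line and the eight-column layout are assumptions based only on the two sample rows in the question; with those rows the leading * is its own field, so the MAC address sits at index 2 and the interface name at index 7.

```python
def parse_line(line):
    """Return (mac, interface) for a table row, or None for any other line."""
    fields = line.split()
    # Assumed layout: ['*', vlan, mac, type, age, flag, flag, interface]
    if len(fields) != 8 or fields[0] != "*":
        return None
    return fields[2], fields[7]


row = "* 964      0050.56aa.3480    dynamic   200        F    F  Veth1379\n"
print(parse_line(row))
```

Returning None for rows that do not fit keeps the caller free to skip or log them.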

Upvotes: 2

aliteralmind

Reputation: 20163

Try this:

^\* [0-9]{3} +([0-9]{4}\.[0-9a-z]{4}\.[0-9a-z]{4}).*(Veth[0-9]{4})$

The first part is in capture group one, the "Veth" code in capture group two.


Please consider bookmarking the Stack Overflow Regular Expressions FAQ for future reference. There's a list of online testers in the bottom section.

Upvotes: 2
