Adam

Reputation: 71

Python regex: fixed-length fields with specified characters and substrings

How can I capture lines with fixed-length fields that contain specified characters and substrings? In this case:

input:

123456781234567812345678123... (char numbers)
RBE3    323123       123121
  RBE3  323123   123    121
RBE3    32312300123     121
RBE3    3231231234      121
$ RBE3  323123123       121
R B E3  32312     123   121
     RBE32312       12313

output would be:

RBE3    323123       123121
  RBE3  323123   123    121
RBE3    32312300123     121

I tried with:

regex = r'^([RBE3\s]{8}.{8}[123\s]{8}.*\n)'

but that seems to be going in completely the wrong direction.

Upvotes: 0

Views: 499

Answers (1)

Tim Pietzcker

Reputation: 336438

I would strongly advise against using a single regex for this. It is better to chop each line into chunks of 8 characters and then validate those chunks.
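A minimal sketch of that chunk-and-validate idea, assuming that field 1 must hold RBE3 and field 3 must hold 123, each padded only by whitespace, and that field 3 must be a full 8 characters wide. The names `split_fields` and `is_wanted` are made up for illustration:

```python
FIELD_WIDTH = 8  # every field is 8 characters wide, as in the question


def split_fields(line, width=FIELD_WIDTH):
    """Cut a line into consecutive chunks of `width` characters."""
    return [line[i:i + width] for i in range(0, len(line), width)]


def is_wanted(line):
    """True if field 1 is 'RBE3' and field 3 is '123', padded by whitespace only."""
    fields = split_fields(line.rstrip("\n"))
    # Assumption: field 3 has to be complete (8 characters), not cut short.
    if len(fields) < 3 or len(fields[2]) < FIELD_WIDTH:
        return False
    return fields[0].strip() == "RBE3" and fields[2].strip() == "123"


sample = [
    "RBE3    323123       123121",
    "  RBE3  323123   123    121",
    "RBE3    3231231234      121",
    "$ RBE3  323123123       121",
]
kept = [line for line in sample if is_wanted(line)]
```

Each check is a plain string comparison on one field, so adding a rule for another field should only need one more condition.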

If you insist, it's possible but ugly:

^(\s*RBE3\s*)(?<=^.{8})(.{8})(\s*123\s*)(?<=^.{24})(.*)$

Explanation:

^            # Start of string (or line, if you use multiline mode)
(\s*RBE3\s*) # Match RBE3, surrounded by any amount of whitespace --> group 1
(?<=^.{8})   # Make sure that we have matched 8 characters so far.
(.{8})       # Match any 8 characters --> group 2
(\s*123\s*)  # Match 123, surrounded by any amount of whitespace --> group 3
(?<=^.{24})  # Make sure that we have matched 24 characters so far.
(.*)         # Match the rest of the line/string --> group 4
$            # End of string/line
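If you want to try the pattern from Python, it is probably simplest to apply it one line at a time, because `\s*` can also consume newlines when the whole text is searched in multiline mode. A sketch only; `RBE3_LINE` and `matching_lines` are names invented for this example:

```python
import re

# The pattern from above, compiled once. Python accepts these lookbehinds
# because `^.{8}` and `^.{24}` each have a fixed width.
RBE3_LINE = re.compile(
    r"^(\s*RBE3\s*)(?<=^.{8})(.{8})(\s*123\s*)(?<=^.{24})(.*)$"
)


def matching_lines(lines):
    """Return the lines (without trailing newline) that match the pattern."""
    result = []
    for line in lines:
        stripped = line.rstrip("\n")
        if RBE3_LINE.match(stripped):
            result.append(stripped)
    return result


def fields_of(line):
    """Return the four captured groups for a matching line, else None."""
    m = RBE3_LINE.match(line.rstrip("\n"))
    return m.groups() if m else None
```

For a file, `matching_lines(open(path))` would work, where `path` stands in for your input file.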

Test it live on regex101.com. Note that only lines 2 and 3 satisfy the requirements you stated.

Upvotes: 2
