D3m3t05

Reputation: 1

REGEX - Capture everything except the sentences that start with a "["

I have been trying for 2 days to write a regex that captures some information from my postmaster digest.

Example:

0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]Final-Recipient: rfc822;[email protected]Action: failedStatus: 5.2.2Diagnostic-Code: smtp;554 5.2.2 mailbox full;

I want to capture sentences like these:

But I don't want to capture:

I wrote a regex that works perfectly fine for capturing:

([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\

Sadly, I don't know how to tell my regex NOT to capture sentences that start with a "[".

I tried this:

[^\[]([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\

This avoids capturing "[Stage:", but it also captures one extra character before each of the other captured sentences.

Does anyone know how to capture my postmaster errors?

Thanks in advance.

(NB: Edited. I removed "failedStatus:" and replaced it with "Status: ".)

Upvotes: 0

Views: 93

Answers (3)

D3m3t05

Reputation: 1

My bad! I made a mistake in my original question!

I want to capture these fields:

Final-Recipient:
Action:
Status:
Diagnostic-Code:
Remote-MTA:

But not this one:
[Stage: ...

So the regex from ghazal khaki is correct and works fine!

Again thanks for your support guys!

Upvotes: 0

The fourth bird

Reputation: 163457

Using sed, you can use one capture group for the single character before the match (any character except [) and a second group for the whole label, including the optional group inside it.

Use both groups in the replacement, with a newline between group 1 and group 2: \1\n\2

Note that your pattern would not match failedStatus: as a single label, because it does not start with a capital letter.

Also, you can omit the quantifier {1}, since 1 is the default, and you don't have to escape the hyphen, the colon, or the space (\-, \: and "\ ").

sed -E 's/([^\[])(([A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: )/\1\n\2/g' File.eml

Output

0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]
Final-Recipient: rfc822;[email protected]
Action: failed
Status: 5.2.2
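
For readers without sed at hand, the same two-group substitution can be sketched in Python. This is only an illustration under assumptions: the question does not name a language, the helper name split_fields is made up, and the sample line uses a placeholder address (user@example.com) in place of the redacted one. The inner optional group is written as non-capturing here, which does not change what is matched.

```python
import re

# Same idea as the sed command: group 1 is the single character before the
# label (anything but "["), group 2 is the label itself.
PATTERN = re.compile(r"([^\[])((?:[A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: )")


def split_fields(line: str) -> str:
    """Insert a newline between group 1 and group 2 of every match."""
    return PATTERN.sub(r"\1\n\2", line)


# Hypothetical sample modelled on the question; the address is a placeholder.
SAMPLE = (
    "0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]"
    "Final-Recipient: rfc822;user@example.com"
    "Action: failed"
    "Status: 5.2.2"
    "Diagnostic-Code: smtp;554 5.2.2 mailbox full;"
)

if __name__ == "__main__":
    # Prints five lines; "[Stage: CreateMessage]" stays on the first one.
    print(split_fields(SAMPLE))
```

Because group 1 is written back in the replacement, the extra character that the pattern consumes is not lost.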

Upvotes: 0

ghazal khaki

Reputation: 634

Add (?<!(\[)) in front of your first regex, and the result will be what you want.

Complete answer: (?<!(\[))([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\

Explanation: you want to make sure there is no [ directly before your phrase. In regex, a literal [ is written \[, and (?<! ... ) is a negative lookbehind ((?<= ... ) is a lookbehind, and ! negates it). So (?<!(\[)) only allows a match where the preceding character is not [, and it does not consume that character. This requires a regex flavor that supports lookbehind.
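
To see the difference between the consuming [^\[] prefix from the question and the negative lookbehind, here is a minimal Python sketch. The language choice, the function names, and the placeholder address in the sample are assumptions for illustration; the optional group is made non-capturing so that findall returns the whole match.

```python
import re

# Label pattern from the question, simplified (no {1}, no needless escapes).
LABEL = r"(?:[A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: "

# The question's attempt: [^\[] consumes one character before the label.
CONSUMING = re.compile(r"[^\[]" + LABEL)

# Negative lookbehind: only checks the previous character, consumes nothing.
LOOKBEHIND = re.compile(r"(?<!\[)" + LABEL)


def find_labels(text: str) -> list[str]:
    """Return every label that is not directly preceded by "["."""
    return LOOKBEHIND.findall(text)


def find_labels_consuming(text: str) -> list[str]:
    """Same search with the consuming prefix, for comparison."""
    return CONSUMING.findall(text)


# Hypothetical sample modelled on the question; the address is a placeholder.
SAMPLE = (
    "0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]"
    "Final-Recipient: rfc822;user@example.com"
    "Action: failed"
    "Status: 5.2.2"
    "Diagnostic-Code: smtp;554 5.2.2 mailbox full;"
)

if __name__ == "__main__":
    print(find_labels(SAMPLE))
    # ['Final-Recipient: ', 'Action: ', 'Status: ', 'Diagnostic-Code: ']
    print(find_labels_consuming(SAMPLE))
    # [']Final-Recipient: ', 'mAction: ', 'dStatus: ', '2Diagnostic-Code: ']
```

Both variants skip "[Stage: ", but only the lookbehind version returns the labels without the extra leading character.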

Upvotes: 0
