Reputation: 73231
I have the following emails I need to extract a part from:
Bestellnummer xxx:
1 von xxx, 1er Pack
------------- Anfang der Nachricht -------------
foo bar baz foo bar baz // <<<<< I need this text here
------------- Ende der Nachricht -------------
------------- Anfang der Nachricht -------------
foo bar baz foo bar baz
------------- Ende der Nachricht -------------
There can be anywhere from zero to an unlimited number of occurrences of
------------- Anfang der Nachricht -------------
------------- Ende der Nachricht -------------
and I'm able to extract the first part with this regex:
$re = "/------------- .*? -------------.?(.*?).?------------- .*? -------------/s";
But as I'm quite new to regex, I'm pretty sure there must be a better pattern for extracting this part (foo bar baz foo bar baz) of the text between
------------- Anfang der Nachricht -------------
------------- Ende der Nachricht -------------
As the marker text can be in different languages, I'm using
.*?
to match everything between those hyphens.
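For reference, here is a minimal sketch of how that pattern behaves when run through preg_match(). The sample email body and the variable names other than $re are assumptions made up for illustration, and the text is assumed to use plain \n line endings.

```php
<?php
// Sample email modeled on the example above (illustrative content only).
$email = "Bestellnummer xxx:\n"
    . "1 von xxx, 1er Pack\n"
    . "------------- Anfang der Nachricht -------------\n"
    . "foo bar baz foo bar baz\n"
    . "------------- Ende der Nachricht -------------\n"
    . "------------- Anfang der Nachricht -------------\n"
    . "second message\n"
    . "------------- Ende der Nachricht -------------\n";

// The pattern from the question, unchanged.
$re = "/------------- .*? -------------.?(.*?).?------------- .*? -------------/s";

// preg_match() returns after the first match, so later blocks are ignored.
$first = preg_match($re, $email, $m) === 1 ? $m[1] : null;

echo $first, PHP_EOL; // foo bar baz foo bar baz
```

Because the markers are matched only by their hyphens, this sketch does not depend on the German wording, but it does rely on exactly 13 hyphens on each side.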
I need the first occurrence of this text, no matter how many occurrences there are. Is there a more robust regex for this?
Here's a
Upvotes: 2
Views: 86
Reputation: 41747
I ended up with: $regexp = '/Nachricht\s-+\s+(.*?)\s+-+\sEnde/s';
It saves a few matching steps and trims the whitespace around the message.
As for a more solid regexp: this one just works. Write a test to be on the safe side.
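Such a test could look roughly like the sketch below. The helper name firstMessage() and the sample strings are hypothetical, chosen only for illustration. Note that the pattern anchors on the German words Nachricht and Ende, so this check only covers the German layout.

```php
<?php
$regexp = '/Nachricht\s-+\s+(.*?)\s+-+\sEnde/s';

// Hypothetical helper: returns the first message, or null if none is found.
function firstMessage(string $regexp, string $email): ?string
{
    return preg_match($regexp, $email, $m) === 1 ? $m[1] : null;
}

$withMessages = "Bestellnummer xxx:\n"
    . "1 von xxx, 1er Pack\n"
    . "------------- Anfang der Nachricht -------------\n"
    . "foo bar baz foo bar baz\n"
    . "------------- Ende der Nachricht -------------\n"
    . "------------- Anfang der Nachricht -------------\n"
    . "second message\n"
    . "------------- Ende der Nachricht -------------\n";

// The question allows zero occurrences, so cover that case too.
$withoutMessages = "Bestellnummer xxx:\n1 von xxx, 1er Pack\n";

$checks = [
    'first message only' => firstMessage($regexp, $withMessages) === 'foo bar baz foo bar baz',
    'no message block'   => firstMessage($regexp, $withoutMessages) === null,
];

foreach ($checks as $name => $ok) {
    echo ($ok ? 'PASS' : 'FAIL'), ': ', $name, PHP_EOL;
}
```

Explicit comparisons are used instead of assert() because assertions can be disabled through the zend.assertions setting, in which case a failing check would pass silently.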
\s - matches a single whitespace character
-+ - matches one or more - chars
\s+ - matches one or more whitespace characters; placed before and after the message to trim it
(.*?) - captures the message, lazily, so it stops at the first closing marker
Upvotes: 2