Reputation: 3890
I have the following regex:
~\[(.*)\] (.*): (.*)~s
The desired behavior is to capture the text between [ and ] (the first occurrence of both). So in this case:
[7/25/2015 8:40:18 PM] Ghost: [Saturday, July 25, 2015 8:13 PM] Nathan:
<<< Quoted text
7/25/2015 8:40:18 PM should be captured. However, as you can see in the regex101 example, the captured text is 7/25/2015 8:40:18 PM] Ghost: [Saturday, July 25, 2015 8:13 PM.
I have no idea how this is happening. Any help is appreciated! Thanks!
Upvotes: 0
Views: 71
Reputation: 626689
The first occurrence of text inside [...] can be captured with a much simpler regex:
\[([^]]*)]
See demo
Judging by the sample data, there cannot be any nested [...] sequences, and there should be no stray ] inside the square brackets. Thus, a negated character class looks best here.
Here is what the regex means:
\[ - matches a literal [
([^]]*) - matches and captures into Group 1 zero or more characters other than ] (note we do not have to escape ] when it is the first character inside a character class)
] - matches a literal ] (note again that this closing square bracket is unambiguous, since the [ before it is escaped and so does not open a character class)
This will match the first occurrence without the g option, and you can get this behavior using the appropriate functions/methods of your programming language.
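As a rough illustration of the breakdown above, here is a minimal Python sketch. The question does not say which language is in use, so Python's re module is an assumption; re.search plays the role of "matching without the g option" and re.findall the role of a global match.

```python
import re

# First line of the sample text from the question.
text = "[7/25/2015 8:40:18 PM] Ghost: [Saturday, July 25, 2015 8:13 PM] Nathan:"

# \[        literal opening bracket
# ([^]]*)   Group 1: zero or more characters other than ']'
# ]         literal closing bracket
pattern = re.compile(r"\[([^]]*)]")

# re.search returns only the first match (no "global" behavior).
match = pattern.search(text)
first_bracketed = match.group(1) if match else None

# re.findall returns every match, for comparison.
all_bracketed = pattern.findall(text)

print(first_bracketed)
print(all_bracketed)
```

Because the character class cannot cross a ], the capture ends at the first closing bracket no matter how much text follows.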
If you need to match this first occurrence only at the beginning of a string/line, use the anchor ^ (to make ^ match at the start of every line, you will need the /m modifier):
^\[([^]]*)]
See another demo
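A small sketch of the anchored variant, again assuming Python, where re.MULTILINE corresponds to the /m modifier. The third line of the sample log is hypothetical, added only to show a second line that starts with a bracket.

```python
import re

# First two lines come from the question; the third line is a made-up
# extra message so that more than one line starts with '['.
log = (
    "[7/25/2015 8:40:18 PM] Ghost: [Saturday, July 25, 2015 8:13 PM] Nathan:\n"
    "<<< Quoted text\n"
    "[7/25/2015 8:41:02 PM] Ghost: second message"
)

anchored = r"^\[([^]]*)]"

# Without MULTILINE, ^ matches only at the very start of the string.
string_start_only = re.findall(anchored, log)

# With MULTILINE, ^ also matches right after each newline.
every_line_start = re.findall(anchored, log, re.MULTILINE)

print(string_start_only)
print(every_line_start)
```

The bracketed quote timestamp in the middle of the first line is skipped in both cases because it does not sit at a line start.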
Upvotes: 2
Reputation: 15284
This will extract both timestamps.
Sample input:
7/25/2015 8:40:18 PM Ghost: Saturday, July 25, 2015 8:13 PM Nathan:
Pattern:
(\d+\/\d+\/\d+ \d+:\d+:\d+ [AP]M)[^:]*: ([A-Z][a-z]+, [A-Z][a-z]* \d+, \d+ \d+:\d+ [AP]M)
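For reference, a hedged Python sketch of this approach run against the bracket-free sample line shown in this answer. The AM/PM character class is written as [AP] here, since a | inside a character class is a literal pipe rather than alternation; Python is an assumption, as the thread does not name a language.

```python
import re

# Sample line as given in this answer (no square brackets).
line = "7/25/2015 8:40:18 PM Ghost: Saturday, July 25, 2015 8:13 PM Nathan:"

timestamps = re.compile(
    r"(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M)"                 # Group 1: numeric date and time
    r"[^:]*: "                                         # sender name up to the colon
    r"([A-Z][a-z]+, [A-Z][a-z]* \d+, \d+ \d+:\d+ [AP]M)"  # Group 2: long-form date
)

m = timestamps.search(line)
message_time, quoted_time = m.groups() if m else (None, None)

print(message_time)
print(quoted_time)
```

This pattern is tied to the exact date formats in the sample, so it is more brittle than matching on the brackets themselves.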
Upvotes: 0
Reputation: 334
You need to make your .* non-greedy to stop at the first match:
\[(.*?)\] (.*?): (.*)
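The difference can be seen side by side in a short Python sketch (Python is an assumption; re.DOTALL stands in for the s modifier from the question). The sample string assumes a space follows "Nathan:" before the line break, which is what the capture reported in the question implies.

```python
import re

# Assumption: there is a space after "Nathan:" in the real data.
text = (
    "[7/25/2015 8:40:18 PM] Ghost: [Saturday, July 25, 2015 8:13 PM] Nathan: \n"
    "<<< Quoted text"
)

# Greedy: .* grabs as much as possible, then backtracks to the LAST "] ".
greedy = re.search(r"\[(.*)\] (.*): (.*)", text, re.DOTALL)

# Lazy: .*? grabs as little as possible, stopping at the FIRST "] ".
lazy = re.search(r"\[(.*?)\] (.*?): (.*)", text, re.DOTALL)

greedy_first = greedy.group(1)
lazy_first = lazy.group(1)
lazy_sender = lazy.group(2)

print(greedy_first)
print(lazy_first)
print(lazy_sender)
```

The greedy version reproduces the over-long capture from the question, while the lazy version yields the timestamp and the sender.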
Upvotes: 0