Reputation: 313
I'm stuck trying to capture a structure like this:
1:1 wefeff qwefejä qwefjk
dfjdf 10:2 jdskjdksdjö
12:1 qwe qwe: qwertyå
I want each match to start at a run of digits, a colon, and another run of digits, and to include everything up to the next such marker. So the expected output would be:
match 1 = 1:1 wefeff qwefejä qwefjk dfjdf
match 2 = 10:2 jdskjdksdjö
match 3 = 12:1 qwe qwe: qwertyå
Here's what I have tried:
\d+\:\d+.+
But that fails when the text belonging to one entry spans two lines.
I'm using a JavaScript-based regex engine.
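A minimal reproduction of the problem, assuming the sample is held in a single string with \n line breaks (the names text and firstAttempt are made up for illustration). Since . does not match line terminators in JavaScript unless the s flag is set, the leading "dfjdf" on the second line ends up in no match at all:

```javascript
// Sample from above, joined with "\n" (assumed line-break style).
const text =
  '1:1 wefeff qwefejä qwefjk\n' +
  'dfjdf 10:2 jdskjdksdjö\n' +
  '12:1 qwe qwe: qwertyå';

// Hypothetical helper: runs the pattern I tried and returns all matches.
function firstAttempt(input) {
  return input.match(/\d+\:\d+.+/g) || [];
}

// Each match stops at the end of its line, so "dfjdf" is dropped.
console.log(firstAttempt(text));
```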
Upvotes: 1
Views: 63
Reputation: 626738
You may use a regex based on a tempered greedy token:
/\d+:\d+(?:(?!\d+:\d)[\s\S])*/g
The \d+:\d+ part will match one or more digits, a colon, and one or more digits, and (?:(?!\d+:\d)[\s\S])* will match any char, zero or more occurrences, that does not start a sequence of one or more digits followed with a colon and a digit. See this regex demo.
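Here is a minimal sketch of how the tempered greedy token pattern could be applied in JavaScript. The sample string layout and the helper name splitByMarkers are assumptions for illustration; collapsing whitespace afterwards is only there to reproduce the single-line output shown in the question.

```javascript
// Sample text from the question, joined with "\n" (assumed line-break style).
const sample = [
  '1:1 wefeff qwefejä qwefjk',
  'dfjdf 10:2 jdskjdksdjö',
  '12:1 qwe qwe: qwertyå',
].join('\n');

// Hypothetical helper: returns one entry per "digits:digits" marker.
function splitByMarkers(text) {
  const re = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;
  // Each raw match may contain line breaks and trailing whitespace,
  // so collapse whitespace runs to a single space and trim.
  return (text.match(re) || []).map((m) => m.replace(/\s+/g, ' ').trim());
}

console.log(splitByMarkers(sample));
```

Note that [\s\S] is used instead of . so that line breaks are consumed too.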
As the tempered greedy token is a resource-consuming construct, you can unroll it into a more efficient pattern like
/\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g
See another regex demo.
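As a quick sanity check, the sketch below runs both patterns over the sample from the question and compares the raw matches. The variable names and the collect helper are hypothetical, and the comparison only covers this one input; it is not a benchmark.

```javascript
// Sample text from the question, joined with "\n" (assumed line-break style).
const input =
  '1:1 wefeff qwefejä qwefjk\n' +
  'dfjdf 10:2 jdskjdksdjö\n' +
  '12:1 qwe qwe: qwertyå';

const tempered = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;
const unrolled = /\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g;

// Hypothetical helper: collects the full text of every match.
function collect(re, text) {
  return Array.from(text.matchAll(re), (m) => m[0]);
}

const fromTempered = collect(tempered, input);
const fromUnrolled = collect(unrolled, input);

console.log(fromUnrolled);
// Both patterns yield the same three matches on this input.
console.log(JSON.stringify(fromTempered) === JSON.stringify(fromUnrolled)); // true
```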
Now, the tempered greedy token is turned into a pattern that matches strings linearly:
\D* - 0+ non-digit symbols
(?: - start of a non-capturing group matching zero or more sequences of:
\d - a digit that is...
(?!\d*:\d) - not followed with 0+ digits, : and a digit
\D* - 0+ non-digit symbols
)* - end of the non-capturing group.
Upvotes: 1
Reputation: 440
Including ñ/Ñ in the character class is optional, but a pattern like this should work:
\d+?:\d+? [a-zñA-ZÑ ]*
Edited:
If you want to include line breaks, you can add \n or \r to the set:
\d+?:\d+? [a-zñA-ZÑ\n ]*
\d+?:\d+? [a-zñA-ZÑ\r ]*
Give it a try! I also tested it at https://regex101.com/
For more characters: ^[a-zA-Z0-9!@#\$%\^\&*)(+=._-]+$
Upvotes: 0