Reputation: 145
I have this string (which is just the cut out part of a larger string):
00777: 50.000 bit/s
and want to capture the 50.000 bit/s part. I've created a positive look-behind regex like this:
(?<=\d{5}: )\S+\s+\S+
This works, but, as expected, it fails when there is more than one space between the : and the number.
So I did this:
(?<=\d{5}:\s+)\S+\s+\S+
But that doesn't work?! Why? Even this expression doesn't match any string:
(?<=\d{0,5}).*
What is it that I'm missing here?
Upvotes: 1
Views: 1949
Reputation: 32797
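This is because many regex engines don't support unbounded quantifiers (+, *) in lookbehind, and some restrict ? as well.
Examples: Java (which requires a lookbehind with a bounded maximum length) and older JavaScript engines (which had no lookbehind support at all).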
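If you still want a lookbehind in Java, one possible workaround is to give the whitespace an explicit upper bound so the lookbehind has a known maximum length. The following is only a sketch: the class name BoundedLookbehind and the limit of 10 whitespace characters are assumptions made for illustration, not something taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookbehind {
    // Assumption: no more than 10 whitespace characters follow the colon.
    // \s{1,10} keeps the lookbehind's maximum length finite, which Java requires.
    private static final Pattern RATE =
            Pattern.compile("(?<=\\d{5}:\\s{1,10})\\S+\\s+\\S+");

    // Returns the matched text (e.g. "50.000 bit/s"), or null if nothing matches.
    static String extract(String input) {
        Matcher m = RATE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("00777:    50.000 bit/s")); // prints "50.000 bit/s"
    }
}
```

If the amount of whitespace has no sensible upper limit, a capturing group is the more robust choice.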
EDIT
Since you are using Java, you can use a capturing group instead:
Matcher m = Pattern.compile("\\d{5}:\\s+(\\S+\\s+\\S+)").matcher(input);
if (m.find()) {
    value = m.group(1); // the text after the colon and the whitespace
}
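For completeness, here is a self-contained version of the same idea that can be compiled and run as is. The class name RateExtractor and the Optional return type are illustrative choices, not part of the original snippet.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RateExtractor {
    // Group 1 captures "<number> <unit>" after five digits, a colon,
    // and any amount of whitespace. No lookbehind is needed.
    private static final Pattern RATE =
            Pattern.compile("\\d{5}:\\s+(\\S+\\s+\\S+)");

    // Returns the captured text, or an empty Optional when the input does not match.
    static Optional<String> extract(String input) {
        Matcher m = RATE.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("00777:     50.000 bit/s").orElse("no match"));
    }
}
```

Because the prefix is matched but not captured, the number of spaces after the colon no longer matters.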
Upvotes: 1
Reputation: 1708
In the first one you can allow a variable number of spaces with (?<=\d{5}: +), but as the other answer says, your regex engine might not support it.
The last expression doesn't match any string because of the . in the data: it's not part of the \d character class. You could use [\d\.] instead.
As a rule of thumb, I always start writing the simplest regex that will do it and I rely on data patterns that I believe will stay.
If you expect the unit to always follow the number you're after, and it will always be bit/s, there's no reason not to include it as a literal in your regex:
[\d\.]+ bit/s$
Then you can start to turn it into a more complex expression if you find exceptions in your data, like a unit with kbit/s:
(?<value>[\d\.]+) *(?<unit>\w+)/s$
Named capture groups make it easier and more readable to reference the parts later, so you can multiply the value by the unit, etc.
In summary: don't use fancier features unless you really need them.
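As a rough sketch of how the named groups could be read in Java (the class name UnitParser and the String[] return shape are assumptions made for illustration), the value and unit can be pulled out by name. The value is kept as text here, because whether 50.000 means fifty or fifty thousand depends on your data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnitParser {
    // Named groups: "value" is the number, "unit" is the word before "/s".
    private static final Pattern RATE =
            Pattern.compile("(?<value>[\\d.]+) *(?<unit>\\w+)/s$");

    // Returns {value, unit} as strings, or null if the input does not end in "<number> <unit>/s".
    static String[] parse(String input) {
        Matcher m = RATE.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("value"), m.group("unit") };
    }

    public static void main(String[] args) {
        String[] parts = parse("00777: 50.000 kbit/s");
        System.out.println(parts[0] + " / " + parts[1]); // prints "50.000 / kbit"
    }
}
```

From there, mapping the unit to a multiplier is a separate step that depends on which units actually occur in your input.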
Upvotes: 0