Reputation: 16131
I have gotten a Java regex representing "end of string or space" to work using a capture group ($|\s). However, this seems like a hack because I'm not trying to capture anything. Shouldn't I be using square brackets to indicate a set/character class? Is there something better I should be using?
Extraneous details below:
My actual goal is to grab the http port from this string:
2019-11-14 23:58:12.321 INFO 55572 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 51447/http
This line in the log may also come in the form of:
2019-11-14 23:58:12.321 INFO 55572 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 51447/http 51448/https
So I need to match "http" exactly and not "https": that is, "http" followed by whitespace (so it can't be "https") or "http" followed by the end of the line.
So my Java regex is:
(\\d+)/http($|\\s)
Upvotes: 2
Views: 149
Reputation: 2436
Your pattern also matches, and captures, the end of line ($) or whitespace (\\s) after http. Use a lookahead (?=...) to check for whitespace or end of line without including it in the match:
(\\d+)\\/http(?=$|\\s)
This matches what you are looking for. You can also use the following, which captures the first number that comes after a colon and whitespace:
:\\s+(\\d+)
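As a minimal sketch of the lookahead version in Java (the class name HttpPortLookahead and the helper findHttpPort are hypothetical names chosen for illustration, and the log lines are shortened), the port can be read from group 1. The forward slash needs no escaping in a Java regex, so it is left plain here:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HttpPortLookahead {
    // Digits, then "/http", then a lookahead that requires end of input or
    // whitespace without making it part of the match.
    private static final Pattern HTTP_PORT = Pattern.compile("(\\d+)/http(?=$|\\s)");

    /** Returns the http port as text, or null if the line contains none. */
    static String findHttpPort(String line) {
        Matcher m = HTTP_PORT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(findHttpPort("Tomcat started on port(s): 51447/http"));              // prints 51447
        System.out.println(findHttpPort("Tomcat started on port(s): 51447/http 51448/https"));  // prints 51447
        System.out.println(findHttpPort("Tomcat started on port(s): 51448/https"));             // prints null
    }
}
```

find() is used rather than matches() because the pattern is only expected to match part of the line.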
Upvotes: 0
Reputation: 10739
If you would rather not use a capturing group, you can use a positive lookahead and simply check for a word boundary at the end of "http". Lookahead is used in regular expressions when you want to match a term that is followed by a second term, but you don't want to include the second term in your match. As such, consider trying:
\\d+(?=/http\\b)
Here, only the digits are matched. The (?= term is the positive lookahead. Note that it won't consume "/http" or include it in your match, but the digits will only match if they are followed by "/http". The \\b term ensures that only "http" standing as an independent word is matched. Thus, "https" won't be matched, but "http" followed by a space, a newline, or the end of input will be. Hopefully that helps.
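A short sketch of how this might look in Java, assuming the full log lines from the question. The class name HttpPortDigitsOnly and the helper findHttpPort are made up for the example. Because "/http" sits inside the lookahead, the whole match returned by group() is just the digits and no capturing group is needed:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HttpPortDigitsOnly {
    // Only the digits are matched; "/http" plus a word boundary is checked
    // by the lookahead and is not included in the match.
    private static final Pattern HTTP_PORT = Pattern.compile("\\d+(?=/http\\b)");

    /** Returns the digits that precede "/http", or null if there are none. */
    static String findHttpPort(String line) {
        Matcher m = HTTP_PORT.matcher(line);
        return m.find() ? m.group() : null; // group() is the whole match
    }

    public static void main(String[] args) {
        String prefix = "2019-11-14 23:58:12.321 INFO 55572 --- [ main] "
                + "s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): ";
        System.out.println(findHttpPort(prefix + "51447/http"));              // prints 51447
        System.out.println(findHttpPort(prefix + "51447/http 51448/https"));  // prints 51447
        System.out.println(findHttpPort(prefix + "51448/https"));             // prints null
    }
}
```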
Upvotes: 2
Reputation: 384
You can use this pattern to check whether a string contains a specific word:
.*\\bhttp\\b.*
In Java:
// Line that contains "http" as a standalone word
String withHttp = "2019-11-14 23:58:12.321 INFO 55572 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 51447/http 51448/https";
System.out.println(withHttp.matches(".*\\bhttp\\b.*")); // prints true
// Same line with "/http" removed, so only "https" remains
String withoutHttp = "2019-11-14 23:58:12.321 INFO 55572 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 51447 51448/https";
System.out.println(withoutHttp.matches(".*\\bhttp\\b.*")); // prints false
Upvotes: 1
Reputation: 521239
Use a word boundary:
\b(\d+)/http\b
This prevents https from matching, but still matches http at the very end of the string. In a Java string literal, double each backslash: "\\b(\\d+)/http\\b".
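One possible way to use the word-boundary pattern from Java is sketched below. The class name HttpPortWordBoundary and the helper findHttpPort are hypothetical, and returning an OptionalInt is just one design choice for signalling "no http port on this line":

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HttpPortWordBoundary {
    // Same regex as above, with the backslashes doubled for a Java string literal.
    private static final Pattern HTTP_PORT = Pattern.compile("\\b(\\d+)/http\\b");

    /** Returns the http port as a number, or an empty OptionalInt if none is found. */
    static OptionalInt findHttpPort(String line) {
        Matcher m = HTTP_PORT.matcher(line);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(findHttpPort("Tomcat started on port(s): 51447/http 51448/https").getAsInt()); // prints 51447
        System.out.println(findHttpPort("Tomcat started on port(s): 51448/https").isPresent());           // prints false
    }
}
```

Here the port is still read from capture group 1, while the trailing \b replaces the ($|\s) group from the question.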
Upvotes: 2