nelac123

Reputation: 91

string after forward slash

The regex pattern I have so far:

Pattern regex = Pattern.compile("^.*?//([^:/\\s]+)(.*(?=\\?|#))", Pattern.DOTALL);

On the string https://url.spec.whatwg.org/#url-syntax it correctly grabs just the /, since I am trying to exclude ? and #. The problem arises when I try https://url.spec.whatwg.org/

The whitespace at the end seems to be preventing it from finding / in group 2. I tried including \p{Blank} in the lookahead, but it made no difference.

"https://www.google.com/search?q=Regular+Expressions&num=1000"

Same for the string above: it grabs /search before the ?, but as soon as I try "https://www.google.com/search" it breaks down.

How can I fix this?

Thank you for your time!

Upvotes: 2

Views: 173

Answers (1)

Bagus Tesa

Reputation: 1695

This answer assumes the input is a URL and that you want everything up to, but not including, the query string or fragment. Try this:

(http)s?:\/\/[^#?]+

You could replace (http)s? with (.+) if you want your original catch-all approach, or list the protocols explicitly, like (http|ftp|...)s?.
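For context, the lookahead (?=\?|\#) in the question only succeeds when a ? or # actually follows, so a URL with neither never matches. A negated character class avoids that requirement. Below is a rough Java sketch of both ideas. The class name UrlParts, the method names, and the HOST_AND_PATH variant are made up for illustration and are not part of the question or the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
final class UrlParts {
    // The pattern from this answer: scheme, "://", then everything up to the first '?' or '#'.
    private static final Pattern WITHOUT_QUERY = Pattern.compile("(http)s?://[^#?]+");

    // Assumed variant of the question's pattern: group 1 is the host,
    // group 2 is whatever follows it up to the first '?', '#', or whitespace.
    private static final Pattern HOST_AND_PATH = Pattern.compile("^.*?//([^:/\\s]+)([^?#\\s]*)");

    // Returns the URL without its query string or fragment, or null if nothing matches.
    static String stripQueryAndFragment(String url) {
        Matcher m = WITHOUT_QUERY.matcher(url);
        return m.find() ? m.group() : null;
    }

    // Returns the part after the host (group 2), or null if nothing matches.
    static String pathOf(String url) {
        Matcher m = HOST_AND_PATH.matcher(url);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(stripQueryAndFragment("https://www.google.com/search?q=Regular+Expressions&num=1000"));
        System.out.println(pathOf("https://www.google.com/search"));
        System.out.println(pathOf("https://url.spec.whatwg.org/#url-syntax"));
    }
}
```

Note that in the second pattern a port such as :8080 would end up at the start of group 2, much as it would with the question's original pattern. For anything beyond quick extraction, java.net.URI and its getPath() method may be a safer choice than a hand-written regex.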


Upvotes: 2
