Sravan Kumar

Reputation: 291

Confusion with regex matching "question marks"

I have a few URLs, as given below:

www.xyz.com/search/example?x=123
www.xyz.com/search/example

I want to retrieve the string between the last slash and the question mark (if one exists); for the examples above, that is "example". I used the regex below, but it's not working. Could someone please explain why? I checked the explanation on https://regex101.com/, and it agrees with what I expect, but that isn't how the regex behaves. There seems to be an issue with matching question marks: "\?*" is not working to match one or more question marks.

.*\/(.*?)\?*.*

FYI, I was able to write the following regex, which works for my use case:

.*\/((?:[^?])*)

My question is why the regex below does not work:

.*\/(.*?)\?*.*

Upvotes: 0

Views: 2040

Answers (2)

Gerald Mücke

Reputation: 11132

You may use this regex:

.*\/([^\?]+)

which matches "all non-question-mark characters in a string, with a minimum length of 1". Outside a character class, ? must be escaped (\?) because it is the reserved quantifier for "0 or 1"; inside [...] it is already literal, so the backslash there is optional but harmless. Be aware that this expects at least one character after the last / (e.g., www.xyz.com/search/example/? would yield example/). To avoid that, replace the + with * to match a string of any length, including the empty string, that does not contain a ?:

.*\/([^\?]*)
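
As a rough illustration of the difference between the + and * variants, here is a minimal Python sketch. The helper name last_segment and the third URL (with a trailing slash before the query) are assumptions added for the example; they are not from the question.

```python
import re

urls = [
    "www.xyz.com/search/example?x=123",
    "www.xyz.com/search/example",
    "www.xyz.com/search/example/?x=123",  # hypothetical trailing-slash case
]

# Inside a character class the ? needs no backslash.
plus_pattern = re.compile(r".*/([^?]+)")  # at least one character after the slash
star_pattern = re.compile(r".*/([^?]*)")  # zero or more characters after the slash


def last_segment(pattern, url):
    """Return capture group 1, or None if the pattern does not match."""
    m = pattern.match(url)
    return m.group(1) if m else None


for url in urls:
    print(url, repr(last_segment(plus_pattern, url)), repr(last_segment(star_pattern, url)))
```

With the + variant, the trailing-slash URL backtracks to the previous slash and captures example/; with the * variant, it stays at the last slash and captures the empty string.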

Regarding your question of why .*\/(.*?)\?*.* does not work: the ? after .* in (.*?) makes the quantifier lazy, so the group matches as few characters as possible. Because everything after the group can also match the empty string, the group is never forced to grow and captures the empty string. The trailing part \?*.* matches any string that may or may not start with an arbitrary number of ?, so it matches every string and is equivalent to .*
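
To see this behaviour directly, the following Python sketch runs the original regex and a variant with two extra capture groups. The extra groups are an addition for illustration only, to show which part of the pattern consumes the text.

```python
import re

url = "www.xyz.com/search/example?x=123"

# The regex from the question: the lazy group is never forced to grow.
original = re.match(r".*/(.*?)\?*.*", url)
captured = original.group(1)

# Same regex with the tail split into groups (illustrative only).
split_tail = re.match(r".*/(.*?)(\?*)(.*)", url)
lazy_part, question_marks, rest = split_tail.groups()

print(repr(captured))        # what the question's regex captures
print(repr(lazy_part))       # the lazy group
print(repr(question_marks))  # what \?* matched
print(repr(rest))            # what the final .* matched
```

The lazy group and \?* both match nothing, and the final .* takes the whole remainder, including the text that was supposed to be captured.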

Upvotes: 1

Avinash Raj

Reputation: 174756

Use a regex based on positive lookahead:

\/([^\/?]*)(?=[^\/]*$)

or

(?<=\/)[^\/?]*(?=[^\/]*$)

or

.*\/(.*?)(?:\?|$)

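Your last regex won't work because \?* matches zero or more ? characters, so it can match nothing at all and leaves the lazy group empty. Make the regex either match a ? if one exists or match up to the end of the line: (?:\?|$) requires a ? or the line end immediately after the captured text.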
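
A quick way to compare the three patterns is a short Python sketch like the one below. The function names are made up for the example, and it assumes the URL contains at least one slash and no slash inside the query string.

```python
import re

urls = [
    "www.xyz.com/search/example?x=123",
    "www.xyz.com/search/example",
]

# The three patterns from this answer (slashes need no escaping in Python).
lookahead = re.compile(r"/([^/?]*)(?=[^/]*$)")
lookaround = re.compile(r"(?<=/)[^/?]*(?=[^/]*$)")
lazy_with_end = re.compile(r".*/(.*?)(?:\?|$)")


def via_lookahead(url):
    # Group 1 holds the segment after the last slash.
    return lookahead.search(url).group(1)


def via_lookaround(url):
    # The whole match is the segment, since the slash is only looked behind.
    return lookaround.search(url).group(0)


def via_lazy(url):
    # The lazy group must grow until a ? or the end of the string follows.
    return lazy_with_end.match(url).group(1)


for url in urls:
    print(url, via_lookahead(url), via_lookaround(url), via_lazy(url))
```

All three extract example from both URLs in the question.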

Upvotes: 1
