Merc

Reputation: 4570

Regex: Replace last segment of url

I am trying to figure out the correct regex to replace the last segment of a URL with a modified version of that same segment. (I know there are similar threads out there, but none seemed to help.)

Example:

https://www.test.com/one/two/three/mypost/
--->
one/two/three?id=mypost


https://www.test.com/one/mypost/
--->
one?id=mypost

Now I am stuck here: https://regex101.com/r/9GqYaU/1

I can get the last segment in capturing group 2, but how would I replace it? I think I will have to do something like this:

  const url = 'https://www.test.com/one/two/three/mypost/'
  const regex = /(http[s]?:\/\/)([^\/]+\/)*(?=\/$|$)/
  const path = url.replace(regex, `${myUrlWithoutTheLastSegmentAndWithoutHTTPS}?id=$2`)
  return path

But I have no idea how to get the URL without the last segment. I currently only have access to the whole match, group 1 (which is useless in this case) and group 2, but not to the string without group 2.

I would be very glad for any help here. Sometimes I just lack the knowledge of what is possible with regex and how to achieve it.

Thank you in advance.

Cheers

Upvotes: 1

Views: 1172

Answers (2)

Djave

Reputation: 9349

I came across your question yesterday and agree with going down the route of parsing the URL. Once you have the pathname you could even use JavaScript array methods, which I prefer to string methods, like:

pathname.split("/").filter(p => p.length).pop()

This would split the path into its segments, drop any empty ones (which handles the leading and trailing slash) and return the last one (mypost).
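As a rough sketch of how that idea could produce the output from the question: this assumes the built-in URL class is available (browsers and Node.js), and toIdPathFromSegments is a made-up helper name, not something from the original post.

```javascript
// Hypothetical helper (the name is illustrative, not from the question).
function toIdPathFromSegments(rawUrl) {
  // Parse the URL and keep only the non-empty path segments.
  const segments = new URL(rawUrl).pathname.split("/").filter(p => p.length);
  // Take the last segment off the array; it becomes the id.
  const last = segments.pop();
  // Assumption: a URL with no path segments at all yields an empty string.
  if (last === undefined) return "";
  return `${segments.join("/")}?id=${last}`;
}

console.log(toIdPathFromSegments("https://www.test.com/one/two/three/mypost/")); // one/two/three?id=mypost
console.log(toIdPathFromSegments("https://www.test.com/one/mypost/"));           // one?id=mypost
```

Because the empty segments are filtered out first, this should behave the same with or without the trailing slash.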

Anyway, I am also learning regex so sometimes when I find a question like this I just try to find the answer anyway as the best way of learning is doing. It took 24 hours 😂 I came up with this:

/(https?:\/\/).+?([a-z-]*)\/?$/gm

(https?:\/\/) you know what this does. Small correction, you don't need the square brackets. The question mark matches 0 or 1 of the preceding character, and as we're only matching s this just works. If you wanted to match s or z you would use [sz]?. I think.

.+? this is the cool one I think I will use in future now I found it. The question mark here has a different meaning - it makes .+ (which means one or more of any character) non-greedy. That means it matches as few characters as possible and only grows until the rest of the pattern can match. Which is...

([a-z-]*) any number of lowercase letters or hyphens. You should probably extend this to include digits and upper case.

\/? Optional slash

$ all of this must match at the end of the string (or at the end of each line, because of the m flag).

Here is a demo https://regex101.com/r/mQNkIS/1
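To see the breakdown above in action, here is a small sketch. The first two patterns are the one from this answer (without the g and m flags, so that match() returns the capture groups) and a greedy copy for comparison. The third pattern, withPath, and the helper toIdPathWithRegex are my own assumed adaptation to get the replacement the question asks for, not something from the original pattern.

```javascript
// Pattern from above, minus the g and m flags so match() exposes the groups.
const lazy = /(https?:\/\/).+?([a-z-]*)\/?$/;
// Greedy copy for comparison: .+ swallows the rest of the string,
// so group 2 is left with nothing to capture.
const greedy = /(https?:\/\/).+([a-z-]*)\/?$/;

const url = "https://www.test.com/one/two/three/mypost/";
console.log(url.match(lazy)[2]);   // mypost
console.log(url.match(greedy)[2]); // (empty string)

// Assumed adaptation: skip the host, capture the leading path lazily in
// group 1 and the last segment in group 2.
const withPath = /^https?:\/\/[^/]+\/(.+?)\/([^/]+)\/?$/;

// Hypothetical helper name, for illustration only.
function toIdPathWithRegex(rawUrl) {
  return rawUrl.replace(withPath, "$1?id=$2");
}

console.log(toIdPathWithRegex(url)); // one/two/three?id=mypost
```

Note that withPath needs at least two path segments; a URL with only one segment does not match and comes back unchanged.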

Upvotes: 1

Jean Will

Reputation: 553

You could use the URL class to extract the pathname, and substring to remove the leading '/'.

Then, you could put the last part of the pathname in a group and use it as a reference $1 for the replacement.

const url = new URL('https://www.test.com/one/two/three/mypost/').pathname.substring(1)

console.log(url.replace(/\/([^/]*)\/$/, '?id=$1'))
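The regex above expects a trailing slash. If the URLs may also come without one, a slightly relaxed variant could look like the sketch below. The function name toIdPath is made up for illustration, and the changes to the pattern ([^/]+ instead of [^/]*, and an optional \/?) are assumptions about what is wanted.

```javascript
// Hypothetical wrapper around the approach above.
function toIdPath(rawUrl) {
  // Pathname without its leading slash, e.g. "one/two/three/mypost/".
  const path = new URL(rawUrl).pathname.substring(1);
  // [^/]+ requires a non-empty last segment; \/? makes the trailing slash optional.
  return path.replace(/\/([^/]+)\/?$/, "?id=$1");
}

console.log(toIdPath("https://www.test.com/one/two/three/mypost/")); // one/two/three?id=mypost
console.log(toIdPath("https://www.test.com/one/mypost"));            // one?id=mypost
```

A path with only a single segment has no slash left in front of it after substring(1), so it is returned as is.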

Upvotes: 1
