Reputation: 1689

How do I regex part of url

I need help with a regex (PCRE). I want to extract the hello-world part from each of the URLs below. This is what I have so far:

^/news/(.*?)/$

https://www.example.com/news/2017-08-09/hello-world/topics/

https://www.example.com/news/2017-08-09/hello-world/gallery/

https://www.example.com/news/2017-08-09/hello-world/

But this captures 2017-08-09/hello-world/topics, and I only need hello-world.

Upvotes: 1

Answers (3)

anubhava

Reputation: 785481

You can use this regex in PCRE:

~/news/[^/]*/\K[^/]+~

/news/[^/]*/: Match /news/ followed by zero or more non-/ followed by /
\K: Reset the start of the match, so everything matched before it is left out of the result
[^/]+: Match one or more non-/ characters

RegEx Demo

You may also use a capturing group:

/news/[^/]*/([^/]+)

and extract capturing group #1.

RegEx Demo 2
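
For anyone trying this outside PCRE, here is a minimal sketch of the capturing-group variant in Python. Python is my choice for illustration only (the question is about PCRE), and the name extract_slug is hypothetical. The pattern has a single capturing group, so the wanted segment is group 1. Note that Python's built-in re module does not support \K, which is why only the group version is shown.

```python
import re

# Capturing-group variant of the pattern above; group 1 is the
# path segment that follows /news/<something>/.
PATTERN = re.compile(r"/news/[^/]*/([^/]+)")

urls = [
    "https://www.example.com/news/2017-08-09/hello-world/topics/",
    "https://www.example.com/news/2017-08-09/hello-world/gallery/",
    "https://www.example.com/news/2017-08-09/hello-world/",
]


def extract_slug(url):
    """Return the segment after /news/<segment>/, or None if nothing matches."""
    match = PATTERN.search(url)
    return match.group(1) if match else None


slugs = [extract_slug(u) for u in urls]
print(slugs)
```

Because the pattern is not anchored with ^, search() finds it anywhere in the string, so it works on the full URL as well as on the path alone.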

Upvotes: 2

anvita surapaneni

Reputation: 369

[0-9]{4}-[0-9]{2}-[0-9]{2}/(.*?)/

Group 1 contains hello-world.

https://regex101.com/r/wFM7nc/1
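
A quick sketch of how this date-based pattern behaves, again in Python purely for illustration (slug_after_date is a made-up helper name). One thing it shows: the pattern needs a slash after the segment, so a URL without the trailing slash does not match.

```python
import re

# Date-based pattern from this answer; group 1 is the lazy capture
# between the slash after the date and the next slash.
DATE_SLUG = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}/(.*?)/")


def slug_after_date(url):
    """Return the segment right after a YYYY-MM-DD/ date, or None."""
    match = DATE_SLUG.search(url)
    return match.group(1) if match else None


with_slash = slug_after_date("https://www.example.com/news/2017-08-09/hello-world/")
without_slash = slug_after_date("https://www.example.com/news/2017-08-09/hello-world")
print(with_slash, without_slash)
```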

Upvotes: 0

Killer Death

Reputation: 459

If hello-world represents unknown text and the rest is fixed, try this:

^/news/2017-08-09/(.*?)/.*$

If the date is not fixed, you can specify its format and use that instead, for example \d{4}-\d{2}-\d{2} or whatever you need.
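
A minimal sketch of the non-fixed-date version, assuming Python and assuming the input is a full URL like the ones in the question. Since the pattern starts with ^/news/, it only matches when the string begins with the path, so the sketch pulls the path out first with urllib.parse.urlsplit. The name slug_from_url is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Anchored pattern with the date generalized to \d{4}-\d{2}-\d{2}.
PATH_PATTERN = re.compile(r"^/news/\d{4}-\d{2}-\d{2}/(.*?)/.*$")


def slug_from_url(url):
    """Match the anchored pattern against the URL's path component only."""
    path = urlsplit(url).path  # e.g. "/news/2017-08-09/hello-world/gallery/"
    match = PATH_PATTERN.match(path)
    return match.group(1) if match else None


full_url = "https://www.example.com/news/2017-08-09/hello-world/gallery/"
slug = slug_from_url(full_url)
# Applied to the full URL, the ^ anchor prevents any match.
direct = PATH_PATTERN.search(full_url)
print(slug, direct)
```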

Upvotes: 0
