Reputation: 1689
I need help with a regex (PCRE). I want to extract the hello-world
part from each of the URLs below. This is what I have so far:
^/news/(.*?)/$
https://www.example.com/news/2017-08-09/hello-world/topics/
https://www.example.com/news/2017-08-09/hello-world/gallery/
https://www.example.com/news/2017-08-09/hello-world/
But this captures 2017-08-09/hello-world/topics,
and I only need hello-world.
Upvotes: 1
Views: 58
Reputation: 785481
You can use this regex in PCRE:
~/news/[^/]*/\K[^/]+~
/news/[^/]*/
: Match /news/ followed by zero or more non-/ characters, followed by /
\K
: Reset the start of the reported match, discarding everything matched so far
[^/]+
: Match one or more non-/ characters

You may also use a capturing group:
/news/[^/]*/([^/]+)
and extract capturing group #1.
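
If you are testing this outside PCRE, here is a minimal sketch of the capturing-group variant in Python. Python's built-in re module does not support \K, so only the group form carries over. The names SLUG_RE and extract_slug are made up for illustration, and the URLs are the ones from the question.

```python
import re

# Capturing-group form of the pattern; the built-in re module has no \K.
SLUG_RE = re.compile(r"/news/[^/]*/([^/]+)")

urls = [
    "https://www.example.com/news/2017-08-09/hello-world/topics/",
    "https://www.example.com/news/2017-08-09/hello-world/gallery/",
    "https://www.example.com/news/2017-08-09/hello-world/",
]


def extract_slug(url):
    """Return the path segment that follows /news/<segment>/, or None."""
    match = SLUG_RE.search(url)
    return match.group(1) if match else None


slugs = [extract_slug(url) for url in urls]
print(slugs)
```

search() is used rather than match() because the pattern is unanchored and /news/ appears in the middle of the full URL.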
Upvotes: 2
Reputation: 369
[0-9]{4}-[0-9]{2}-[0-9]{2}/(.*?)/
Group 1 contains hello-world:
https://regex101.com/r/wFM7nc/1
Upvotes: 0
Reputation: 459
If hello-world stands for arbitrary text and the rest is fixed, try this:
^/news/2017-08-09/(.*?)/.*$
If the date is not fixed, you can specify the format it is in and use that instead, for example \d{4}-\d{2}-\d{2} or whatever you need.
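
As a rough sketch of the date-format variant, assuming the date always looks like YYYY-MM-DD and that you match against the URL path rather than the full URL (which is what the ^/news/ anchor implies). DATED_SLUG_RE and slug_from_url are hypothetical names, and Python is used here only to try the pattern out.

```python
import re
from urllib.parse import urlparse

# Same pattern as above, with the fixed date swapped for \d{4}-\d{2}-\d{2}.
DATED_SLUG_RE = re.compile(r"^/news/\d{4}-\d{2}-\d{2}/(.*?)/.*$")


def slug_from_url(url):
    """Return the segment after /news/<date>/, or None if the path differs."""
    path = urlparse(url).path  # e.g. /news/2017-08-09/hello-world/topics/
    match = DATED_SLUG_RE.match(path)
    return match.group(1) if match else None


print(slug_from_url("https://www.example.com/news/2017-08-09/hello-world/gallery/"))
```

The lazy (.*?) stops at the first / after the date, so anything after the slug (topics/, gallery/, or nothing) is left to the trailing .*.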
Upvotes: 0