Sven S

Reputation: 103

Conditional Regexp: return only one group

I want to match two types of URLs:

(1) www.test.de/type1/12345/this-is-a-title.html
(2) www.test.de/category/another-title-oh-yes.html

In the first type, I want to match "12345". In the second type I want to match "category/another-title-oh-yes".

Here is what I came up with:

(?:(?:\.de\/type1\/([\d]*)\/)|\.de\/([\S]+)\.html)

This returns the following:

For type (1):

Match group 1: 12345
Match group 2: 

For type (2):

Match group 1: 
Match group 2: category/another-title-oh-yes

As you can see, it is working pretty well already. For various reasons I need the regex to return only one match-group, though. Is there a way to achieve that?

Upvotes: 10

Views: 31174

Answers (2)

Braj

Reputation: 46841

Java/PHP/Python

Capture the wanted value in group 1 for both URL types by combining a positive lookbehind with a negative lookahead.

((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^\.]+)

The pattern consists of two alternatives joined with | (OR).

The first alternative matches 12345.

The second alternative matches category/another-title-oh-yes.


Note:

  • Each alternative must match exactly once in each URL.
  • Wrap the whole alternation in a single pair of parentheses (...|...) and leave [^\.]+ and \d+ without parentheses of their own, where:

    [^\.]+   matches one or more characters up to the next dot
    \d+      matches one or more digits
    

Here is an online demo on regex101.


Input:

www.test.de/type1/12345/this-is-a-title.html
www.test.de/category/another-title-oh-yes.html

Output:

MATCH 1
1.  [18-23] `12345`
MATCH 2
1.  [57-86] `category/another-title-oh-yes`
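
As a rough sketch, the same lookbehind pattern can also be tried in JavaScript, assuming an ES2018 or newer engine (earlier engines have no lookbehind support). The helper name extractPart is hypothetical and only used for illustration.

```javascript
// Assumption: the engine supports lookbehind (ES2018+).
// extractPart is a hypothetical helper name, not part of any library.
const lookbehindPattern = /((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^.]+)/;

function extractPart(url) {
  const m = lookbehindPattern.exec(url);
  // Both alternatives sit inside the single outer group, so the value is always m[1].
  return m ? m[1] : null;
}

console.log(extractPart("www.test.de/type1/12345/this-is-a-title.html"));   // 12345
console.log(extractPart("www.test.de/category/another-title-oh-yes.html")); // category/another-title-oh-yes
```

Because the lookarounds consume no characters, the prefix never ends up in the captured text, which is what allows a single group to serve both URL types.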

JavaScript

Try this one. Without lookbehind the prefix has to be consumed, so the wanted value ends up in group 2 for type (1) and in group 3 for type (2).

((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^\.]+))

Here is an online demo on regex101.

Input:

www.test.de/type1/12345/this-is-a-title.html
www.test.de/category/another-title-oh-yes.html

Output:

MATCH 1
1.  `.de/type1/12345`
2.  `12345`
MATCH 2
1.  `.de/category/another-title-oh-yes`
3.  `category/another-title-oh-yes`
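
A minimal sketch of how the pattern above might be used from JavaScript code. Each alternative has its own inner capturing group, so the sketch falls back from one group to the other to end up with a single value. The helper name extractCaptured is an assumption made for this example.

```javascript
// Same pattern as above, written as a regex literal.
const noLookbehindPattern = /((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^\.]+))/;

// extractCaptured is a hypothetical helper name.
function extractCaptured(url) {
  const m = noLookbehindPattern.exec(url);
  if (m === null) {
    return null; // no alternative matched
  }
  // Only one inner group takes part in a match; the other one is undefined.
  return m[2] || m[3];
}

console.log(extractCaptured("www.test.de/type1/12345/this-is-a-title.html"));   // 12345
console.log(extractCaptured("www.test.de/category/another-title-oh-yes.html")); // category/another-title-oh-yes
```

The regex itself still reports more than one group, so this only helps when the calling code is free to pick the group that took part in the match.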

Upvotes: 9

Mosho

Reputation: 7078

Maybe this:

^www\.test\.de/(type1/(\d+)/.*\.html|(.*)\.html)$

Debuggex Demo

Then for example:

var str = "www.test.de/type1/12345/this-is-a-title.html";
// Forward slashes must be escaped inside a JavaScript regex literal.
var regex = /^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$/;
console.log(str.match(regex));

This outputs an array: the first element is the whole matched string, the second is everything after the website address, the third is the ID captured for a type1 URL, and the fourth is the path captured for any other URL. Whichever of the last two did not take part in the match is undefined.

You can then do something like var matches = str.match(regex); return matches[2] || matches[3];
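
One caveat with that one-liner: str.match returns null when the URL does not match at all, so indexing the result directly would throw. A hedged sketch of a guarded version follows. The function name extractSegment and the exact anchored pattern are assumptions made for this example.

```javascript
// Illustrative anchored pattern: group 2 holds the type1 ID, group 3 the path of any other URL.
var anchoredPattern = /^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$/;

// extractSegment is a hypothetical helper name.
function extractSegment(str) {
  var matches = str.match(anchoredPattern);
  if (matches === null) {
    return null; // the URL does not have the expected shape
  }
  return matches[2] || matches[3];
}

console.log(extractSegment("www.test.de/type1/12345/this-is-a-title.html"));   // 12345
console.log(extractSegment("www.test.de/category/another-title-oh-yes.html")); // category/another-title-oh-yes
console.log(extractSegment("www.example.com/index.html"));                     // null
```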

Upvotes: 1
