Andrei Gătej

Reputation: 11944

Regex: How to return only one occurrence of a match even if there are multiple same occurrences?

I'm trying to get the links from a web page, but there could be multiple occurrences of the same link.
Because I want all the links, I'm using the match() function, which returns the same link twice (or more, depending on how many identical links are on the page).

Example:

const results = [
    'http://example1.com', 'http://example1.com', 'http://example2.com', 'http://example2.com',
];

One solution would be to pass the array of matches to Set().

const expected = [...new Set(results)];
expected // ["http://example1.com", "http://example2.com"]

Is there another way to get the expected result without using Set(), preferably still using regex?

So the main problem is not removing duplicates from the array, but getting distinct values from regex.

Here is some context:

In the example above, the result is an array of 4 items: two links, each appearing twice.

The expected result would be an array of distinct links. In this case, an array of 2 items.

Upvotes: 1

Views: 393

Answers (1)

vsemozhebuty

Reputation: 13792

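You can try to keep only the last occurrence of each match with a negative lookahead that contains a backreference. The (?!.*\1) part rejects a match if the same captured text appears again later in the input, and the s flag lets . cross line breaks: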

/(https:\/\/\S+\/[a-z-0-9\?=]+-+\d+-+)(?!.*\1)/gs
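To see the idea in isolation, here is a minimal sketch on made-up input. The sample text and the simplified URL pattern (https?:\/\/\S+) are assumptions for illustration, not the pattern for any real page. Two extra guards are added: (?!\S) after the group stops the engine from backtracking to a shorter prefix of a link, and \s...(?!\S) around the backreference makes the lookahead compare whole whitespace-delimited links only.

```javascript
// Hypothetical input: whitespace-separated links, each appearing twice.
const text = `
http://example1.com http://example1.com
http://example2.com http://example2.com
`;

// (https?:\/\/\S+)   capture one link (simplified pattern, an assumption)
// (?!\S)             the link must end at whitespace or end of input
// (?!.*\s\1(?!\S))   reject it if the same whole link occurs again later
const distinctLinks = /(https?:\/\/\S+)(?!\S)(?!.*\s\1(?!\S))/gs;

const links = text.match(distinctLinks);
console.log(links); // ["http://example1.com", "http://example2.com"]
```

Note that each link is reported at its last occurrence, so the order of the result follows the last occurrences rather than the first ones. On large inputs the lookahead rescans the rest of the text for every candidate, so this is likely slower than deduplicating the matches afterwards.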

Upvotes: 3
