Reputation: 11944
I'm trying to get the links from a web page, but there could be multiple occurrences of the same link.
Because I'm interested in getting all the links, I'm using the match() function, which returns the same link twice (or multiple times, depending on how many identical links are on the page).
Example:
const results = [
  'http://example1.com', 'http://example1.com',
  'http://example2.com', 'http://example2.com',
];
One solution would be to pass the array of matches to Set().
const expected = [...new Set(results)];
expected // ["http://example1.com", "http://example2.com"]
Is there another way to get the expected result without making use of Set(), preferably still using regex?
So the main problem is not removing duplicates from the array, but getting distinct values from the regex itself.
Following the example, the result is an array of 4 items, 2 of which are duplicates.
The expected result would be an array of distinct links. In this case, an array of 2 items.
Upvotes: 1
Views: 393
Reputation: 13792
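You can try to match only the last occurrence of each link by using a negative lookahead with a backreference, which rejects a match if the same text appears again later in the input: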
/(https:\/\/\S+\/[a-z-0-9\?=]+-+\d+-+)(?!.*\1)/gs
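To illustrate the idea on the question's example data, here is a minimal sketch. The `text` string and the simplified URL pattern are assumptions made for the demo, not the regex above: it treats any whitespace-delimited run starting with http:// or https:// as a link. The extra `(?!\S)` guards are also an addition, meant to keep the engine from backtracking to a truncated URL or treating a link as a duplicate of a longer one.

```javascript
// Hypothetical page text; the URLs mirror the question's example.
const text = `
http://example1.com http://example1.com
http://example2.com http://example2.com
`;

// Simplified, assumed pattern (not the one from the answer):
//   (https?:\/\/\S+)   capture a whitespace-delimited URL
//   (?!\S)             the URL must end here, so no truncated matches
//   (?!.*\1(?!\S))     reject it if the same URL occurs again later
// The `s` flag lets `.` cross line breaks; `g` collects every match.
const distinctLinkPattern = /(https?:\/\/\S+)(?!\S)(?!.*\1(?!\S))/gs;

// match() returns null when nothing matches, hence the fallback.
const distinctLinks = text.match(distinctLinkPattern) || [];

console.log(distinctLinks);
```

Each link is reported at its last occurrence, so the results follow the order of the last occurrences rather than the first. The lookahead rescans the rest of the input for every candidate, so on large pages this is likely slower than deduplicating with Set().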
Upvotes: 3