Cy's
Cy's

Reputation: 63

how can lookbehind positive work in pgsql regexp_replace?

The most simple test I've done is this one:

SELECT regexp_replace((09,09,41,41,42,42,49,49,49,49,200,200,400,400,500,500), '(?<=,|^)([^,]*)(,\\1)+(?=,|$)', '\\1') AS lignes

I would like to have what regex101 show me : 09,41,42,49,200,400,500. But the entire string come. Any help?

Upvotes: 1

Views: 85

Answers (1)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 627292

First of all, you do not need to double the backslashes unless standard_conforming_strings is on:

standard_conforming_strings (boolean)

This controls whether ordinary string literals ('...') treat backslashes literally, as specified in the SQL standard. Beginning in PostgreSQL 9.1, the default is on (prior releases defaulted to off). Applications can check this parameter to determine how string literals will be processed. The presence of this parameter can also be taken as an indication that the escape string syntax (E'...') is supported. Escape string syntax (Section 4.1.2.2) should be used if an application desires backslashes to be treated as escape characters.

Next, you need to use g global modifier to replace all matches, see the 9.7. Pattern Matching section:

Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.

Use

SELECT regexp_replace('09,09,41,41,42,42,49,49,49,49,200,200,400,400,500,500', '(?<=,|^)([^,]*)(,\1)+(?=,|$)', '\1', 'g') AS lignes

enter image description here

See the online demo.

If you want to make the pattern more efficient, use double negation:

'(?<![^,])([^,]*)(,\1)+(?![^,])'

Upvotes: 1

Related Questions