Regex Postgresql: Remove the last part of string recursively

Question

I have a string that needs to be trimmed through removing words from the end of the string. For instance, if the words ltee., company, co., or any number appear then they should be removed and it should keep removing if the "new" end of the string is still in the list.

Suppose the string PUBLIC SCHOOL 2012 LTEE.. First, the word LTEE should be removed to get PUBLIC SCHOOL 2012 and then the number 2012 should be removed because it is a number at the end.

I attempted to get this behaviour through the following:

 select upper(regexp_replace(lower('PUBLIC SCHOOL 2012 LTEE.'),
                            '\d*$|\s(ltee|ltee.|company|co|co.|liability)$',
                            '','g'))

However, I am just getting PUBLIC SCHOOL 2012 with the number at the end.

Is there a way to remove the number as well? How about if there is another word besides LTEE.?

For instance, the string PUBLIC SCHOOL 2012 LTEE. CO should become PUBLIC SCHOOL removing CO, then LTEE. and finally 2012.

Thanks

Nick · Accepted Answer

You can use a regex with a lookahead to match all the words (or string of digits) you want to replace at once; the lookahead asserts that between the word and the end of the line are only more of those words or strings of digits:

\s(\d*|ltee\.?|company|co\.?|liability)(?=(?:\s(?:\d*|ltee\.?|company|co\.?|liability))*$)

Sample PostgreSQL usage:

select upper(regexp_replace('PUBLIC SCHOOL 2012 LTEE. CO',
                            '\s(\d*|ltee\.?|company|co\.?|liability)(?=(?:\s(?:\d*|ltee\.?|company|co\.?|liability))*$)',
                            '','gi'))

Output:

PUBLIC SCHOOL

Note that you can supply the i flag to regexp_replace to make it case-insensitive, then you don't need to call lower.

Demo on dbfiddle

Regex Postgresql: Remove the last part of string recursively

Answers (1)

Related Questions