Reputation: 33
I am using below which is working fine but I need to add an expression which will replace anything (like special characters, numbers, etc all things should cover currently I have used [A-Z])
select regexp_replace('002 - KLAMPFER - Sy 52-5-5+L', '- [A-Z]+ -', '- XXXXXXX -')
from dual;
output:
002 - XXXXXXX - Sy 52-5-5+L
but in input string we might get special chars,number,spaces,german special chars instead of "KLAMPFER" sting so I need regex which will cover these all.
Upvotes: 0
Views: 130
Reputation: 11032
As I expect you are finding the Redshift regexp engine is not as fully configurable as you would like. I think this has to do with when it was forked from Postgres but the 'why' doesn't matter. In these case I find that simplifying to be the best approach to getting to success. ("Yes", this lack of regexp full functionality is a pain. Maybe someone knows the magic to get Redshift to do all that we want.)
Here's a KISS approach:
select regexp_substr(txt, '^[^-]*') || regexp_replace(regexp_substr(txt, ' - ([^-]*) - '), '[^\\s-]', 'X') || regexp_replace(txt, '^[^-]* - [^-]* - (.*)$', '$1')
from (
select '002 - KLAMPFER - Sy 52-' as txt
union select '002 - DFTF GTFL - Ty 52-5-5+L'
union select '002 - AMPFER TT OT - Vy 5-5+L'
union select '002 - AMP 3>?> - Ty 52-5-5'
);
Now if your are certain that the delimiter next ' - ' will not show up 3 times in the text (the above code doesn't assume this) then this is simpler (using split_part()):
select split_part(txt, ' - ', 1) || ' - ' ||
regexp_replace(split_part(txt, ' - ', 2), '[^\\s]', 'X') || ' - ' ||
split_part(txt, ' - ', 3)
from (
select '002 - KLAMPFER - Sy 52-' as txt
union select '002 - DFTF GTFL - Ty 52-5-5+L'
union select '002 - AMPFER TT OT - Vy 5-5+L'
union select '002 - AMP 3>?> - Ty 52-5-5'
);
Upvotes: 0
Reputation: 626802
With any regex engine that supports non-fixed with lookbehinds, you can use
(?<=^\S+\s+-\s(?:(?!\s+-\s).)*)\S
See the regex demo. Details:
(?<=^\S+\s+-\s(?:(?! - ).)*)
- a positive lookbehind that matches a location that is immediately preceded with
^
- start of string\S+
- one or more non-whitespace chars\s+-\s
- one or more whitespaces, -
and a whitespace(?:(?!\s+-\s).)*
- any char other than a line break char, as many occurrences as possible, not starting the "one or more whitespaces, -
and a whitespace" sequence\S
- any non-whitespace char.Upvotes: 1