Reputation: 19
I have data like below in my hive tables:
FirstName
Mr Tim Cannon
Guest
Miss Kerri Smith
Missra
I'm trying to use
INITCAP(REGEXP_REPLACE('Miss Missra','(Mr.)|^Mr$|^Ms.$|^Ms$|^Mrs.$|^Mrs$|^Dr.$|^Dr$|^Miss$',' '))
but still it is same.
When tried using
INITCAP(REGEXP_REPLACE('Miss Missra','(\\+)|(Mr.)|^Mr$|^Ms.$|^Ms$|^Mrs.$|^Mrs$|^Dr.$|^Dr$|Miss',' '))
then it is replacing 'Missra' as 'Ra'
Output should be:
FirstName
Tim Cannon
Guest
Kerri Smith
Missra
Please suggest me some workaround.
Upvotes: 1
Views: 851
Reputation: 38325
$
means end of line, and after '^Miss'
you have space, not end of line. '^Miss '
should work fine. Also dots (literal .
) should be shielded with \\
because dot in regexp has special meaning - any character, and it seems you mean literally dot(.
).
^
- means the beginning of the line anchor, BTW first regex Mr\\.
has no ^
anchor, fix it if it is not intentionally. If you want to replace these expressions not only in the beginning, remove all ^
REGEXP_REPLACE('Miss Missra','Mr\\.|^Mr |^Ms\\. |^Ms |^Mrs\\. |^Mrs |^Dr\\. |^Dr |^Miss ',' ')
Upvotes: 1