Gudala Sandeep
Gudala Sandeep

Reputation: 19

How to replace string in Hive

I have data like below in my hive tables:

FirstName

Mr Tim Cannon
Guest
Miss Kerri Smith
Missra

I'm trying to use

INITCAP(REGEXP_REPLACE('Miss Missra','(Mr.)|^Mr$|^Ms.$|^Ms$|^Mrs.$|^Mrs$|^Dr.$|^Dr$|^Miss$',' '))

but still it is same.

When tried using

INITCAP(REGEXP_REPLACE('Miss Missra','(\\+)|(Mr.)|^Mr$|^Ms.$|^Ms$|^Mrs.$|^Mrs$|^Dr.$|^Dr$|Miss',' '))

then it is replacing 'Missra' as 'Ra'

Output should be:

FirstName

Tim Cannon
Guest
Kerri Smith
Missra

Please suggest me some workaround.

Upvotes: 1

Views: 851

Answers (1)

leftjoin
leftjoin

Reputation: 38325

$ means end of line, and after '^Miss' you have space, not end of line. '^Miss ' should work fine. Also dots (literal .) should be shielded with \\ because dot in regexp has special meaning - any character, and it seems you mean literally dot(.).

^ - means the beginning of the line anchor, BTW first regex Mr\\. has no ^ anchor, fix it if it is not intentionally. If you want to replace these expressions not only in the beginning, remove all ^

REGEXP_REPLACE('Miss Missra','Mr\\.|^Mr |^Ms\\. |^Ms |^Mrs\\. |^Mrs |^Dr\\. |^Dr |^Miss ',' ')

Upvotes: 1

Related Questions