Reputation: 107
I have multiple IDs for each row and I want to extract the number that comes after certain text in each row. Below could be my values in id column.
test123; tghy876; 8906; TT-1234
best123; tghy8656; 88706; TT-5678
I want output to return only numbers that comes after tt- so my output should be as below.
1234
5678
what hive code should be used to achieve this?
Upvotes: 3
Views: 10087
Reputation: 107
You should be able to do it using multiple split:
split(split(id,"tt-")[1],";")[0]
Upvotes: 0
Reputation: 38290
Using regexp - return first group of any number of digits after '; TT-'
select regexp_extract(src_string,'\; TT-(\\d+)',1);
Demo:
hive> select regexp_extract('best123\; tghy8656\; 88706\; TT-5678','\; TT-(\\d+)',1);
OK
5678
Time taken: 0.056 seconds, Fetched: 1 row(s)
hive>
Upvotes: 3