Observer

Reputation: 651

Handling multiple matches in regex in Hive

I want to parse negative decimal values out of an expression in Hive, and I have written the following regex:

select regexp_extract("abcsdfghj-117.3700631&poikse-118.244&",
'([-][1-9][0-9]*[.][0-9]+)&*') as output

The regex seems to work, but it returns only the first match. Is there a function in Hive that returns all the matches?

I googled this and was not able to find an answer. Any help would be appreciated.

Thanks

Upvotes: 2

Views: 8367

Answers (1)

David דודו Markovitz

Reputation: 44941

  1. Replace every {prefix}{number}& with ,{number}
  2. Cut the result from the 2nd character (removing the leading ,)
  3. Split the result into an array on ,

hive> select split(substr(regexp_replace("abcsdfghj-117.3700631&poikse-118.244&",'.*?(-\\d+\\.\\d+)&',',$1'),2),',') as output;
OK
["-117.3700631","-118.244"]
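
To check the three steps outside Hive, here is a rough Python sketch of the same logic. It is not Hive code: `extract_negative_decimals` is a made-up helper name, and Python's `re.sub` with `\1` stands in for `regexp_replace` with `$1`, so treat it only as a way to reason about the pattern.

```python
import re

def extract_negative_decimals(text):
    """Hypothetical helper mirroring the replace / substr / split steps."""
    # Step 1: replace every {prefix}{number}& with ,{number}
    replaced = re.sub(r'.*?(-\d+\.\d+)&', r',\1', text)
    # Step 2: drop the leading comma (Hive's substr(..., 2) is 1-based)
    trimmed = replaced[1:]
    # Step 3: split the remainder into a list on commas
    return trimmed.split(',')

print(extract_negative_decimals("abcsdfghj-117.3700631&poikse-118.244&"))
```

Note that this relies on the string ending right after the last `&`. Any text after the final match is not consumed by the pattern, so it would stay attached to the last element.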

Upvotes: 8
