Reputation: 3
I used Rubular to verify my regex:
(\d+)\:+(\d+)+\:+(\d+)+\:+(\d+)
It works perfectly fine there for the following string:
1::594::5::838984679
But the same regex doesn't work in Hive:
create external table ratings8 (userid string, movieid string, rating string, timestamp string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "(\d+)\:+(\d+)+\:+(\d+)+\:+(\d+)",
  "output.format.string" = "%1$s %2$s %3$s %4$s"
)
LOCATION '/ratings';
Can somebody help me out here? What am I doing wrong?
Upvotes: 0
Views: 414
Reputation: 191729
You need to escape the backslash in the string (\\). The colon does not require an escape, though:
(\\d+):+(\\d+)+:+(\\d+)+:+(\\d+)
It's also not necessary to write (\\d+)+, since it matches the same text as just (\\d+).
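Since Hive's RegexSerDe is built on Java regular expressions, the simplified pattern can be checked outside Hive with a few lines of Java. This is only a rough sketch: the class name RatingsRegexCheck and the parse helper are made up for illustration, and it assumes the SerDe requires the whole line to match. Note that the Java string literal needs the same doubled backslashes as the Hive string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hive.
class RatingsRegexCheck {
    // Same pattern as the corrected "input.regex". The doubled backslashes are
    // string-literal escapes, so the regex engine itself sees \d+.
    static final Pattern LINE = Pattern.compile("(\\d+):+(\\d+):+(\\d+):+(\\d+)");

    // Returns the four captured fields, or null if the whole line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // capture groups are numbered from 1
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample line taken from the question.
        String[] fields = parse("1::594::5::838984679");
        System.out.println(String.join(" ", fields)); // prints "1 594 5 838984679"
    }
}
```

If this prints the four fields, the same pattern text, written with doubled backslashes, should be usable as the "input.regex" value.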
Upvotes: 1