Shikha Vishwakarma

Reputation: 3

Hive RegexSerDe doesn't recognize my regex

I used Rubular to verify my regex:

(\d+)\:+(\d+)+\:+(\d+)+\:+(\d+)

It works perfectly fine there for the following string:

1::594::5::838984679

But the same regex doesn't work in Hive:

create external table ratings8 (userid string, movieid string, rating string, timestamp string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "(\d+)\:+(\d+)+\:+(\d+)+\:+(\d+)",
  "output.format.string" = "%1$s %2$s %3$s %4$s"
)
LOCATION '/ratings';

Can somebody help me out here? What am I doing wrong?

Upvotes: 0

Views: 414

Answers (1)

Explosion Pills

Reputation: 191729

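You need to escape the backslash in the string (\\). Hive treats a backslash inside a string literal as an escape character, so \d has to be written as \\d for the SerDe to receive \d. The colon does not require an escape, though: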

(\\d+):+(\\d+)+:+(\\d+)+:+(\\d+)

It's also not necessary to have (\\d+)+, since this matches the same text as just (\\d+).
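
Putting both points together, the table definition from the question might look roughly like the following. This is only a sketch: the table name, columns, SerDe class and location are copied from the question, and the only intended change is the "input.regex" value. It assumes the Hive contrib jar that provides this RegexSerDe is available to the session.

```sql
-- Sketch based on the statement in the question; only "input.regex" is changed.
-- Note: on newer Hive versions "timestamp" may be treated as a reserved keyword,
-- in which case the column name would need to be quoted with backticks.
create external table ratings8 (
  userid string,
  movieid string,
  rating string,
  timestamp string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- \\d in the HiveQL string literal reaches the SerDe as \d;
  -- the redundant "+" after each group and the escaped colons are dropped
  "input.regex" = "(\\d+):+(\\d+):+(\\d+):+(\\d+)",
  "output.format.string" = "%1$s %2$s %3$s %4$s"
)
LOCATION '/ratings';
```

For the sample line 1::594::5::838984679, the four groups should capture 1, 594, 5 and 838984679, which map to userid, movieid, rating and timestamp in that order.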

Upvotes: 1
