Reputation: 2093
I'm working on a Pig script that converts a string to a datetime object using ToDate(). Here's a sample of the strings I'm working with: Fri Nov 01 12:30:19 EDT 2013
When I try to convert it to a datetime object using ToDate(userstring, format) I get told that I'm using an invalid format...
B = FOREACH A GENERATE ToDate(date,'EEE MMM dd HH:mm:ss z yyyy') AS datetime;
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2999: Unexpected internal error. Invalid format: "Fri Nov 01 12:30:19 EDT 2013" is malformed at "EDT 2013"
The reason, I strongly suspect, is that Pig uses Joda Time, and EDT is not a valid Joda Time time zone. No problem, according to the documentation, Pig uses Simple Date Format and I can escape strings (see the very first example). Except that I can't seem to do that...
ToDate(date,"EEE MMM dd HH:mm:ss 'EDT' yyyy") <-- unexpected character '"'
ToDate(date,'"EEE MMM dd HH:mm:ss 'EDT' yyyy"') <-- expecting semicolon error
ToDate(date,'EEE MMM dd HH:mm:ss 'EDT' yyyy') <-- expecting semicolon
ToDate(date,'EEE MMM dd HH:mm:ss \'EDT\' yyyy') <-- malformed at " EDT 2013"
ToDate(date,'EEE MMM dd HH:mm:ss "EDT" yyyy') <-- Illegal pattern component: T
And so on. I'm pretty sure I've tried every combination of quotes and escape characters to get Pig to ignore the "EDT" characters, but nothing seems to work (most of the above was just shooting in the dark).
I have two questions before I open a bug report on this:
1) Am I correct that this is failing because EDT isn't a supported time zone, or is my pattern wrong somewhere?
2) If it is failing because of EDT, is there a way to escape those characters, or am I doing something wrong at this step?
Upvotes: 1
Views: 931
Reputation: 7193
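You cannot parse EDT with Joda-Time, but you can with the JDK (java.text.SimpleDateFormat). Joda-Time does not support parsing time zone names with the 'z' pattern letter, because abbreviations such as EDT are ambiguous and can map to different values.
You may be interested in this other StackOverflow question: "Pattern to parse this string to a DateTime".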
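To illustrate the "you can with JDK" point, here is a minimal sketch in plain Java (outside Pig) that parses the sample string with java.text.SimpleDateFormat. The class and method names are made up for this example. It assumes that EDT means US Eastern Daylight Time, so it pins the formatter to America/New_York before parsing rather than leaving the abbreviation lookup to the JVM's default zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, named for illustration only.
class EdtParseSketch {

    // Parses strings like "Fri Nov 01 12:30:19 EDT 2013" and returns the
    // instant in ISO-8601 UTC form.
    static String parseToUtc(String text) {
        // Same pattern as in the question; the JDK accepts 'z' when parsing.
        SimpleDateFormat fmt =
                new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy", Locale.US);
        // Assumption: "EDT" refers to US Eastern time. Pinning the zone makes
        // the abbreviation resolve against America/New_York first.
        fmt.setTimeZone(TimeZone.getTimeZone("America/New_York"));
        try {
            Date parsed = fmt.parse(text);
            return parsed.toInstant().toString();
        } catch (ParseException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // 12:30:19 at UTC-4 is 16:30:19 UTC.
        System.out.println(parseToUtc("Fri Nov 01 12:30:19 EDT 2013"));
    }
}
```

Inside Pig, the same logic would have to live in a user-defined function, since ToDate() goes through Joda-Time.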
Upvotes: 0