Pattern for parsing date with two timezone format in Joda-Time

Question

I'm receiving date strings from a third-party email server in formats such as the following:

Mon, 13 Mar 2017 19:00:10 +0530 (IST)
Tue, 21 Mar 2017 09:23:00 -0700 (PDT)
Sun, 12 Mar 2017 14:31:13 +0000 (UTC)

Only the time zones differ. I can easily parse these using Java's SimpleDateFormat, for example:

String pattern = "EEE, dd MMM yyyy HH:mm:ss Z '('z')'";
SimpleDateFormat df = new SimpleDateFormat(pattern);
df.parse("Fri, 31 Mar 2017 13:31:14 +0530 (IST)");

But when using DateTimeFormat from the Joda-Time library, I'm not able to use the same pattern.

String pattern = "EEE, dd MMM yyyy HH:mm:ss Z '('z')'";
DateTimeFormatter parser = DateTimeFormat.forPattern(pattern);
parser.parseDateTime("Fri, 31 Mar 2017 13:31:14 +0530 (IST)");

What am I missing here?

Basil Bourque · Accepted Answer

tl;dr

String input = "Mon, 13 Mar 2017 19:00:10 +0530 (IST)";
int index = input.indexOf ( " (" ); // Searching for SPACE + LEFT PARENTHESIS.
String inputModified = input.substring ( 0 , index ); // "Mon, 13 Mar 2017 19:00:10 +0530"

Instant instant = 
    OffsetDateTime.parse ( 
        inputModified , 
        DateTimeFormatter.ofPattern( "EEE, d MMM uuuu HH:mm:ss Z" ) 
    ).toInstant() 
;

See similar code run live at IdeOne.com.

Using java.time

FYI: The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

two timezone format in Joda-Time

Mon, 13 Mar 2017 19:00:10 +0530 (IST)

No, that format contains zero true time zones.

The +0530 is an offset-from-UTC, a number of hours and minutes away from UTC.

Specify a proper time zone name in the format of continent/region, such as America/Montreal, Africa/Casablanca, or Pacific/Auckland. Never use the 3-4 letter abbreviation such as EST or IST as they are not true time zones, not standardized, and not even unique(!).

Since the 3-4 letter abbreviations cannot be reliably parsed, Joda-Time has a policy of refusing to try (as noted in a comment by Hugo above). I suspect this is a wise policy, given what we see next.

The java.time classes will attempt to guess at such pseudo-time-zone names, but the result may not be the value you intended. Indeed, java.time misinterprets your first example, apparently reading IST as Israel Standard Time out of candidates that include India Standard Time, Irish Standard Time, and possibly more.

String input = "Mon, 13 Mar 2017 19:00:10 +0530 (IST)";
DateTimeFormatter f = DateTimeFormatter.ofPattern( "EEE, d MMM uuuu HH:mm:ss Z '('z')'") ;
ZonedDateTime zdt = ZonedDateTime.parse ( input , f );

zdt.toString(): 2017-03-13T19:00:10+02:00[Asia/Jerusalem]

So I suggest you lop off the bogus abbreviation chunk at the end. Parse the remaining text as an OffsetDateTime which at least gives you an exact moment on the timeline. Adjust into UTC as an Instant, as most of your work should generally be done in UTC including your logging.

Lop off the abbreviation using String::substring. Note we are including the SPACE before the LEFT PARENTHESIS in our substring search as we want to delete both characters and everything after that.

String input = "Mon, 13 Mar 2017 19:00:10 +0530 (IST)";
int index = input.indexOf ( " (" ); // Searching for SPACE + LEFT PARENTHESIS.
String inputModified = input.substring ( 0 , index );

inputModified: Mon, 13 Mar 2017 19:00:10 +0530
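
One caveat with the substring approach: String::indexOf returns -1 when the text is not found, and substring ( 0 , -1 ) throws an exception. If some of your inputs might arrive without the parenthesized abbreviation, a small guard helps. This is only a sketch; the class and method names here are my own invention, not part of any library.

```java
final class CommentStripper {

    // Hypothetical helper: returns the input without a trailing " (XYZ)" chunk.
    // If no SPACE + LEFT PARENTHESIS is present, the input is returned as-is.
    static String stripTrailingComment(String input) {
        int index = input.indexOf(" (");
        // indexOf yields -1 when the marker is absent, so guard before substring.
        return (index < 0) ? input : input.substring(0, index);
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingComment("Mon, 13 Mar 2017 19:00:10 +0530 (IST)"));
        System.out.println(stripTrailingComment("Mon, 13 Mar 2017 19:00:10 +0530"));
    }
}
```

Both calls in main should print the same text, ending in +0530.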

Parse as an OffsetDateTime object using the numerical offset at the end to guide us as to the exact moment of this value.

DateTimeFormatter f = DateTimeFormatter.ofPattern( "EEE, d MMM uuuu HH:mm:ss Z" );
OffsetDateTime odt = OffsetDateTime.parse ( inputModified , f );

odt.toString(): 2017-03-13T19:00:10+05:30
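
A side note on the formatter: the pattern contains English text for the day-of-week and month (Mon, Mar), and DateTimeFormatter.ofPattern with a single argument uses the JVM's current default locale. On a machine whose default is not English, the parse may fail. Passing a Locale explicitly avoids that. The following is a sketch only, with a made-up class name, run against the already-trimmed forms of the three example inputs from the Question.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class MailDateParser {

    // Same pattern as above, but pinned to English so that "Mon" and "Mar"
    // are recognized regardless of the JVM's default locale.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE, d MMM uuuu HH:mm:ss Z", Locale.ENGLISH);

    // Expects input with the parenthesized abbreviation already removed.
    static OffsetDateTime parse(String trimmedInput) {
        return OffsetDateTime.parse(trimmedInput, FORMAT);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Mon, 13 Mar 2017 19:00:10 +0530",
            "Tue, 21 Mar 2017 09:23:00 -0700",
            "Sun, 12 Mar 2017 14:31:13 +0000"
        };
        for (String input : inputs) {
            // Print each value adjusted to UTC.
            System.out.println(parse(input).toInstant());
        }
    }
}
```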

Extract an Instant object to give us the same moment in UTC.

Instant instant = odt.toInstant ();

instant.toString(): 2017-03-13T13:30:10Z

You can adjust into your own particular time zone if you insist. But I advise learning to think in UTC when wearing your Programmer hat. Think of UTC as “The One True Time” and all other zones are mere variations on that theme.

ZoneId z = ZoneId.of( "America/Montreal" );
ZonedDateTime zdt = instant.atZone( z );
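
To make that adjustment concrete, here is a small self-contained sketch (the class and method names are hypothetical) that starts from the Instant obtained above and views the same moment through the wall-clock time of America/Montreal. Daylight Saving Time was already in effect there on that date, so I would expect an offset of -04:00.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ZoneAdjustDemo {

    // Same moment, different wall-clock time: the Instant is not altered.
    static ZonedDateTime inZone(Instant instant, String zoneName) {
        return instant.atZone(ZoneId.of(zoneName));
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2017-03-13T13:30:10Z");
        ZonedDateTime zdt = inZone(instant, "America/Montreal");
        System.out.println(zdt); // 2017-03-13T09:30:10-04:00[America/Montreal]
    }
}
```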

ISO 8601

The kind of pattern shown in your examples was common in protocols of yesteryear such as RFC 1123 / RFC 822.

Nowadays, the approach is to always use ISO 8601. In this modern standard, the formats are easy to read across various human cultures, have less reliance on the English language, are easy for machines to parse, and are designed to be unambiguous.

The java.time classes use ISO 8601 by default when generating/parsing strings. You can see their generated output in my examples above. Note that ZonedDateTime extends the standard by appending the name of the time zone in square brackets.
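
As a brief illustration of that default, strings in standard ISO 8601 format can be parsed with no formatter at all. This sketch (class name invented for the example) feeds the generated outputs shown earlier back into the parse methods and confirms that both represent the same moment.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

final class IsoDefaultDemo {

    // No DateTimeFormatter needed: parse() expects ISO 8601 by default.
    static boolean sameMoment(String isoOffsetDateTime, String isoInstant) {
        OffsetDateTime odt = OffsetDateTime.parse(isoOffsetDateTime);
        Instant instant = Instant.parse(isoInstant);
        return odt.toInstant().equals(instant);
    }

    public static void main(String[] args) {
        System.out.println(sameMoment("2017-03-13T19:00:10+05:30", "2017-03-13T13:30:10Z")); // true
    }
}
```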

By the way, if you have similar inputs that comply exactly with RFC 1123, know that java.time provides a predefined formatter object, DateTimeFormatter.RFC_1123_DATE_TIME.
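
For example, once the parenthesized abbreviation has been lopped off, the remaining text of your first input appears to fit that predefined formatter, so no custom pattern is needed. A sketch, with a hypothetical class name; note that the trailing (IST) chunk still has to be removed first, since parse requires the entire text to match.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

final class Rfc1123Demo {

    // Uses the formatter predefined in java.time rather than a custom pattern.
    static OffsetDateTime parse(String trimmedInput) {
        return OffsetDateTime.parse(trimmedInput, DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    public static void main(String[] args) {
        OffsetDateTime odt = parse("Mon, 13 Mar 2017 19:00:10 +0530");
        System.out.println(odt);             // 2017-03-13T19:00:10+05:30
        System.out.println(odt.toInstant()); // 2017-03-13T13:30:10Z
    }
}
```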
