Niklas

Reputation: 25391

Java Time parse Dates with short day names

I've got the following German date: So, 18 Jul 2021 15:24:00 +0200

I'm unable to parse it using Java Time:

DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss Z", Locale.GERMANY)
  .parse("So, 18 Jul 2021 15:24:00 +0200", Instant::from)

as it throws: Text 'So, 18 Jul 2021 15:24:00 +0200' could not be parsed at index 0

If I were to change the string to be properly formatted it works:

-So, 18 Jul 2021 15:24:00 +0200
+So., 18 Juli 2021 15:24:00 +0200

Is there any magic pattern to parse the above date?


I have the same problem with other dates.

Upvotes: 3

Views: 826

Answers (2)

Arvind Kumar Avinash

Reputation: 79015

The modern Date-Time API is very particular about the pattern, so it is almost impossible to create a single pattern that parses every kind of string. However, one of the great features of DateTimeFormatter is its support for optional sections, written in square brackets. For example, the following demo uses E, d [MMMM][MMM][M] u H:m:s Z, which has three optional sections for the month.

Demo:

import java.time.DateTimeException;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.stream.Stream;


public class Main {
    public static void main(String[] args) {
        Stream.of(
                "So., 18 Juli 2021 15:24:00 +0200",
                "ven., 16 avr. 2021 15:24:00 +0200",
                "vr, 16 apr. 2021 15:24:00 +0200",
                "vr, 16 07 2021 15:24:00 +0200"
        ).forEach(s -> {
            Stream.of(
                    Locale.GERMANY,
                    Locale.FRANCE,
                    new Locale("nl", "NL")
            ).forEach( locale -> {
                try {
                    System.out.println("Parsed '" + s + "' using the locale, " + locale + " => " + parseToInstant(s, locale));
                } catch (DateTimeException e) {
                    // Ignore: the string does not match this locale's names; try the next locale.
                }
            });
        });
    }

    static Instant parseToInstant(String strDateTime, Locale locale) {
        return DateTimeFormatter.ofPattern("E, d [MMMM][MMM][M] u H:m:s Z").withLocale(locale).parse(strDateTime,
                Instant::from);
    }
}

Output:

Parsed 'So., 18 Juli 2021 15:24:00 +0200' using the locale, de_DE => 2021-07-18T13:24:00Z
Parsed 'ven., 16 avr. 2021 15:24:00 +0200' using the locale, fr_FR => 2021-04-16T13:24:00Z
Parsed 'vr, 16 apr. 2021 15:24:00 +0200' using the locale, nl_NL => 2021-04-16T13:24:00Z
Parsed 'vr, 16 07 2021 15:24:00 +0200' using the locale, nl_NL => 2021-07-16T13:24:00Z


Learn more about the Date-Time patterns from DateTimeFormatterBuilder.

Upvotes: 3

Anonymous

Reputation: 86223

Specify your own abbreviations for the days of the week

According to CLDR, German day-of-week abbreviations are written with a dot. To have Java parse a string where the abbreviation lacks the dot, there are two obvious solutions:

  1. Don’t use CLDR. Java’s own abbreviations from Java 8 and before did not have the dots and are still available in newer Java versions.
  2. Specify your own abbreviations.
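
Before choosing between the two, it can help to see which abbreviations the running JVM actually uses, since that is what the formatter expects at index 0. The following is only a small diagnostic sketch; the class and method names are made up for illustration, and the printed names depend on the Java version and its locale data.

```java
import java.time.DayOfWeek;
import java.time.format.TextStyle;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class ShortDayNames {

    // Returns the short day-of-week names, Monday first, as this JVM knows them.
    static List<String> shortDayNames(Locale loc) {
        return Arrays.stream(DayOfWeek.values())
                .map(dow -> dow.getDisplayName(TextStyle.SHORT, loc))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // With CLDR data the German and French names are expected to end in a dot.
        System.out.println(shortDayNames(Locale.GERMANY));
        System.out.println(shortDayNames(Locale.FRANCE));
    }
}
```

If the printed names end in dots while your input has none, the mismatch explains the parse failure.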

Since you had similar problems with French, where Java’s own abbreviations have dots too, I suspect that solution 1 would be insufficient for you. So let’s delve into solution 2. My code below takes CLDR’s abbreviations, e.g., So., and removes the trailing dots from them, so you get, for example, So as in your string.

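    import java.time.DayOfWeek;
    import java.time.Instant;
    import java.time.Month;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.format.TextStyle;
    import java.time.temporal.ChronoField;
    import java.util.Arrays;
    import java.util.Locale;
    import java.util.Map;
    import java.util.stream.Collectors;

    Locale loc = Locale.GERMANY;
    // Day-of-week abbreviations with the trailing dot removed, e.g. "So." becomes "So"
    Map<Long,String> dowsWithoutDots = Arrays.stream(DayOfWeek.values())
            .collect(Collectors.toMap(dow -> Long.valueOf(dow.getValue()),
                    dow -> dow.getDisplayName(TextStyle.SHORT, loc).replaceFirst("\\.$", "")));
    // Month abbreviations cut to three characters, e.g. "Juli" becomes "Jul"
    Map<Long,String> monthsWithoutDots = Arrays.stream(Month.values())
            .collect(Collectors.toMap(m -> Long.valueOf(m.getValue()),
                    m -> m.getDisplayName(TextStyle.SHORT, loc).substring(0, 3)));
    DateTimeFormatter germanWithoutDots = new DateTimeFormatterBuilder()
            .appendText(ChronoField.DAY_OF_WEEK, dowsWithoutDots)
            .appendPattern(", dd ")
            .appendText(ChronoField.MONTH_OF_YEAR, monthsWithoutDots)
            .appendPattern(" yyyy HH:mm:ss Z")
            .toFormatter(loc);

    System.out.println(germanWithoutDots.parse("So, 18 Jul 2021 15:24:00 +0200", Instant::from));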

Output from the snippet is:

2021-07-18T13:24:00Z

For the month abbreviations removing the final dot did not work since, as you have observed, CLDR’s abbreviation is Juli where you have got Jul. So instead of removing the dot I abbreviate to three characters. You should test that it works for all months (including Mai).

I have not tried the same for French and Dutch, but it should work.
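
One way to do that testing is a round trip: format a date in each month with the custom formatter and parse the text back. The sketch below builds the formatter the same way as the snippet above, with the locale as a parameter. The class and method names are invented for illustration, and the guard against names shorter than three characters is my own addition. Treat it as a starting point, not a finished utility. Truncating to three letters can make two months collide in some locales (the French short names for June and July appear to share their first three letters), and the check should report such months as failing.

```java
import java.time.DateTimeException;
import java.time.DayOfWeek;
import java.time.Month;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.TextStyle;
import java.time.temporal.ChronoField;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class AbbreviationRoundTrip {

    // Same construction as in the snippet above, with the locale as a parameter.
    static DateTimeFormatter withoutDots(Locale loc) {
        Map<Long, String> dows = Arrays.stream(DayOfWeek.values())
                .collect(Collectors.toMap(dow -> Long.valueOf(dow.getValue()),
                        dow -> dow.getDisplayName(TextStyle.SHORT, loc).replaceFirst("\\.$", "")));
        Map<Long, String> months = Arrays.stream(Month.values())
                .collect(Collectors.toMap(m -> Long.valueOf(m.getValue()),
                        m -> firstThree(m.getDisplayName(TextStyle.SHORT, loc))));
        return new DateTimeFormatterBuilder()
                .appendText(ChronoField.DAY_OF_WEEK, dows)
                .appendPattern(", dd ")
                .appendText(ChronoField.MONTH_OF_YEAR, months)
                .appendPattern(" yyyy HH:mm:ss Z")
                .toFormatter(loc);
    }

    // Assumption: names shorter than three characters are kept as they are.
    static String firstThree(String s) {
        return s.substring(0, Math.min(3, s.length()));
    }

    // Formats the 18th of every month in 2021 and parses it back;
    // returns the months for which the result differs or parsing fails.
    static List<Month> monthsFailingRoundTrip(Locale loc) {
        DateTimeFormatter formatter = withoutDots(loc);
        List<Month> failing = new ArrayList<>();
        for (Month m : Month.values()) {
            OffsetDateTime original =
                    OffsetDateTime.of(2021, m.getValue(), 18, 15, 24, 0, 0, ZoneOffset.ofHours(2));
            String text = formatter.format(original);
            try {
                if (!OffsetDateTime.parse(text, formatter).equals(original)) {
                    failing.add(m);
                }
            } catch (DateTimeException e) {
                failing.add(m);
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        for (Locale loc : List.of(Locale.GERMANY, Locale.FRANCE, Locale.forLanguageTag("nl-NL"))) {
            System.out.println(loc + " -> months failing the round trip: " + monthsFailingRoundTrip(loc));
        }
    }
}
```

An empty list for a locale means every month survived the round trip there. A non-empty list means the three-letter truncation needs a different rule for that locale.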

In case you want to try your luck with solution 1, circumventing CLDR completely, see JDK dateformatter parsing DayOfWeek in German locale, java8 vs java9.

Upvotes: 2
