jim

Reputation: 9138

Parse this date time string with regex

I have the following string containing a date and time, which I need to parse:

« by username on September 13, 2015, 08:34:02 am »

I have the following expression, which seems to work on rubular.com, but in Java it only matches September.

I would also like to have two groups, the date and the time. How can I do this?

January|February|March|April|May|June|July|August|September|October|November|December| [0-9]{2}, [0-9]{4}, [0-9]{2}:[0-9]{2}:[0-9]{2} am|pm

Thanks

Upvotes: 0

Views: 436

Answers (3)

Kennet

Reputation: 5796

One could try something like this

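import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

String in = "by username on September 13, 2015, 08:34:02 am";
// date parsing pattern; hh is the 12-hour field that pairs with the am/pm marker (HH is the 24-hour field)
String s = "MMM d, yyyy, hh:mm:ss aaa";
SimpleDateFormat sdf = new SimpleDateFormat(s, Locale.US);
try {
    // pattern to get rid of the leading three words, 'by username on '
    String p = "\\w+\\s\\w+\\s\\w+\\s";
    Date d = sdf.parse(in.replaceFirst(p, ""));
    System.out.println(d);
} catch (ParseException e) {
    e.printStackTrace();
}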

Upvotes: 3

jCoder

Reputation: 2319

If the date always appears in exactly this format, you could use a function like the following. If there may be more than one space between the parts, replace each literal space in the pattern with \s+ (written as \\s+ in a Java string literal).

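import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static Date findAndParseDate(String s) {
    Date parsedDate = null;
    // (?:am|pm) keeps the alternation local; a bare am|pm would make "pm" an alternative to everything before it
    String patternStr = "((?:January|February|March|April|May|June|July|August|September|October|November|December) [0-9]{2}, [0-9]{4}, [0-9]{2}:[0-9]{2}:[0-9]{2} (?:am|pm))";
    Pattern p = Pattern.compile(patternStr);
    Matcher m = p.matcher(s);
    if (m.find()) {
        String extractedDateTimePart = m.group(1);
        // Locale.US so the English month names parse regardless of the default locale
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat("MMM dd, yyyy, hh:mm:ss aa", Locale.US);
        try {
            parsedDate = simpleDateFormat.parse(extractedDateTimePart);
        } catch (ParseException ex) {
            // leave parsedDate as null if the extracted text cannot be parsed
        }
    }
    return parsedDate;
}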

Upvotes: 0

greenfrvr

Reputation: 643

Try this one.

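((?:January|February|March|April|May|June|July|August|September|October|November|December)\s[0-9]{2},\s[0-9]{4}),\s([0-9]{2}:[0-9]{2}:[0-9]{2}\s(?:am|pm))

Tested on your string; it captures the date and the time into separate groups. The am/pm alternation sits in its own non-capturing group so that it applies only to am and pm rather than to the whole time expression.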
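
For what it's worth, here is a minimal Java sketch of how the two groups could be read with java.util.regex. The class name DateTimeGroups and the helper extract are made up for illustration. The pattern wraps both the month list and am|pm in non-capturing groups, because an unparenthesized | splits the whole expression into separate alternatives, which is likely why only September was matched in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class DateTimeGroups {
    // Group 1 is the date, group 2 is the time including am/pm.
    // The (?:...) groups limit each alternation to the month names and to am/pm.
    private static final Pattern DATE_TIME = Pattern.compile(
        "((?:January|February|March|April|May|June|July|August|September"
        + "|October|November|December)\\s[0-9]{2},\\s[0-9]{4}),\\s"
        + "([0-9]{2}:[0-9]{2}:[0-9]{2}\\s(?:am|pm))");

    // Returns {date, time}, or null when the input contains no match.
    static String[] extract(String input) {
        Matcher m = DATE_TIME.matcher(input);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String[] parts = extract("by username on September 13, 2015, 08:34:02 pm");
        System.out.println(parts[0]); // September 13, 2015
        System.out.println(parts[1]); // 08:34:02 pm
    }
}
```

Since find() is used rather than matches(), the surrounding text (including the « » quotes) does not need to be stripped first.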

Upvotes: 0
