WebWanderer

Reputation: 10867

Split a Date/Time String into Tokens with Regex

I am trying to break a date format string (and an input date string) into parts (tokens).

A date format can use any separator characters between tokens, as long as the tokens themselves are valid. The date/time will be parsed according to the format, and the non-alphanumeric characters will carry over into the output string. I would like to extract the tokens from both the date/time format string and the input date/time string.

In other words, I need to split a string on every run of non-alphanumeric characters.

For example, let's say my date format string is:

'dddd, MMMM D, YYYY h:mm A'

I would like to have an array of all of the date/time tokens in the string. My output should be:

['dddd', 'MMMM', 'D', 'YYYY', 'h', 'mm', 'A']

I have made a couple of attempts, but I still cannot wrap my head around how to do this. I was able to come up with the regex:

[^\w]|[_]

which should match any character not present in [a-zA-Z0-9_] as the first alternative or match _ literally as the second alternative.
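As a side note, here is a minimal sketch of what that pattern matches. The sample string is made up for illustration (it includes an underscore so that both alternatives fire), and as far as I can tell the two alternatives could be folded into the single character class [\W_]:

```javascript
// Hypothetical sample string, chosen so that both alternatives match something.
const sample = 'h:mm_A';

// Original pattern: a non-word character, or a literal underscore.
const withAlternation = sample.match(/[^\w]|[_]/g);

// Assumed equivalent: one character class covering both cases.
const withSingleClass = sample.match(/[\W_]/g);

console.log(withAlternation); // [':', '_']
console.log(withSingleClass); // [':', '_']
```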

I have tested this regex and it does work, but how do I use it properly?

I tried to use:

var format_tokens = display_format.match(/[^\w]|[_]/g);

which, of course, returned:

format_tokens = [",", " ", " ", ",", " ", " ", ":", " "];

which should have been obvious.

So how do I get the inverse? How can I get the desired result of:

format_tokens = ['dddd', 'MMMM', 'D', 'YYYY', 'h', 'mm', 'A']

?

Thanks everybody!

Upvotes: 1

Views: 1462

Answers (1)

Leah Zorychta

Reputation: 13409

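You want to match the alphanumeric characters and exclude everything else, right? The pattern for that is ([a-zA-Z0-9]+), which matches each run of one or more letters or digits. With the g flag, match returns every such run, and the capturing parentheses are optional.

This: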

'dddd, MMMM D, YYYY h:mm A'.match(/([a-zA-Z0-9]+)/g)

returns:

["dddd", "MMMM", "D", "YYYY", "h", "mm", "A"]
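Since the question asks about splitting, here is a minimal sketch of the inverse approach: split on the separators instead of matching the tokens. The helper name tokenize and the sample input date are assumptions for illustration, not part of the question. The filter(Boolean) step drops the empty strings that split produces when a string starts or ends with a separator.

```javascript
const displayFormat = 'dddd, MMMM D, YYYY h:mm A';

// Hypothetical input date, assumed to follow the format above.
const inputDate = 'Friday, March 4, 2016 3:07 PM';

// Split on runs of non-alphanumeric characters (underscore included)
// and discard any empty strings left at the edges.
function tokenize(str) {
  return str.split(/[\W_]+/).filter(Boolean);
}

const formatTokens = tokenize(displayFormat);
const inputTokens = tokenize(inputDate);

// Pair each format token with the input token at the same position.
const pairs = formatTokens.map((token, i) => [token, inputTokens[i]]);

console.log(formatTokens); // ['dddd', 'MMMM', 'D', 'YYYY', 'h', 'mm', 'A']
console.log(inputTokens);  // ['Friday', 'March', '4', '2016', '3', '07', 'PM']
```

Both approaches give the same tokens for this format string. Note that \w and \W only cover ASCII letters, digits, and the underscore, so a month name with accented letters would be split in the middle.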

Upvotes: 6
