LA_

Reputation: 20409

How to parse a non-English text date in JavaScript?

I have dates in a text format like:

6 weeks ago, 2012 April 18 15:08:18
13 weeks ago, 2012 March 01 17:33:52

The main problem is that these texts are actually in Russian, so instead of "weeks ago" there is the equivalent phrase in Russian. The same goes for the month names (it looks like I should create a dictionary of possible values).

I don't know how to start. Should I use regular expressions? Something else?

Upvotes: 3

Views: 446

Answers (1)

Tomasz Nurkiewicz

Reputation: 340763

Not Russian, but Polish:

var dateStr = "6 tygodni temu, 2012 kwiecień 18 15:08:18"

Firefox has no problem matching the Unicode characters with a quick & dirty regular expression:

var regex = /(\d+) ty.* temu, (\d+) (.*) (\d+) (\d{2}):(\d{2}):(\d{2})/

Parsing:

var result = dateStr.match(regex);

The result is:

[
  "6 tygodni temu, 2012 kwiecień 18 15:08:18",
  "6",
  "2012",
  "kwiecień",
  "18", 
  "15",
  "08",
  "18"
]
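
To get from that array to an actual Date, the month name has to go through a lookup table, which is the "dictionary" idea from the question. Below is a minimal sketch, assuming Polish nominative month names; `MONTHS` and `toDate` are hypothetical names made up for illustration, and the resulting Date is in the browser's local time zone. A Russian version would need its own table.

```javascript
// Hypothetical lookup table: Polish month names (nominative) -> zero-based month index.
var MONTHS = {
  "styczeń": 0, "luty": 1, "marzec": 2, "kwiecień": 3,
  "maj": 4, "czerwiec": 5, "lipiec": 6, "sierpień": 7,
  "wrzesień": 8, "październik": 9, "listopad": 10, "grudzień": 11
};

// Hypothetical helper: turns the match array shown above into a Date (local time).
// Returns null if there was no match or the month name is not in the table.
function toDate(match) {
  if (!match) {
    return null;
  }
  var month = MONTHS[match[3].toLowerCase()];
  if (month === undefined) {
    return null;
  }
  return new Date(
    Number(match[2]), // year
    month,            // zero-based month
    Number(match[4]), // day of month
    Number(match[5]), // hours
    Number(match[6]), // minutes
    Number(match[7])  // seconds
  );
}

var dateStr = "6 tygodni temu, 2012 kwiecień 18 15:08:18";
var regex = /(\d+) ty.* temu, (\d+) (.*) (\d+) (\d{2}):(\d{2}):(\d{2})/;
var date = toDate(dateStr.match(regex));
```

The "6 weeks ago" part (index 1 of the match) is redundant once the absolute date is parsed, so the sketch ignores it.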

I don't know Russian, but you might need to do some extra linguistic work. For example, in Polish it is "1 tydzień" but "2 tygodnie" and even "5 tygodni" (note the different forms of the same word).
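
One way to handle the different forms, instead of the loose `ty.*`, is to list them explicitly in an alternation. This is only a sketch for the three Polish forms mentioned above; `WEEKS_AGO` and `weeksAgo` are hypothetical names, and for Russian the corresponding word forms would have to be listed the same way.

```javascript
// Matches "<number> <any listed form of 'week'> temu" at the start of the text.
// Only the three Polish forms mentioned above are listed.
var WEEKS_AGO = /^(\d+) (?:tydzień|tygodnie|tygodni) temu/;

// Hypothetical helper: returns the number of weeks, or null if the text does not match.
function weeksAgo(text) {
  var m = text.match(WEEKS_AGO);
  return m ? Number(m[1]) : null;
}

var one = weeksAgo("1 tydzień temu, 2012 kwiecień 18 15:08:18");
var two = weeksAgo("2 tygodnie temu, 2012 kwiecień 18 15:08:18");
var five = weeksAgo("5 tygodni temu, 2012 kwiecień 18 15:08:18");
```

Listing the forms explicitly makes the pattern reject text it was not written for, rather than matching anything that happens to start with "ty".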

Upvotes: 2
