Reputation: 85
I wish convert some dates in Norwegian to actual dates in R. I'm using readr
, and it kind of works - but I stumbled upon an issue which really annoys me, and I don't really know how to get around it.
Here is an illustration of my problem:
> parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = locale("nn"))
Warning: 1 parsing failure.
row # A tibble: 1 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 2 NA date like %d. %b %Y 29. sep 2017
[1] "2017-05-29" NA
So it catches the date in May but not the one in September. It turns out that this is because the abbreviation for September in Norwegian needs a "." (sep.
instead of sep
), whereas the May abbreviations does not (probably because it's actually not an abbreviation ;-)):
locale("nb")
<locale>
Numbers: 123,456.78
Formats: %AD / %AT
Timezone: UTC
Encoding: UTF-8
<date_names>
Days: søndag (søn.), mandag (man.), tirsdag (tir.), onsdag (ons.), torsdag(tor.), fredag (fre.), lørdag (lør.)
Months: januar (jan.), februar (feb.), mars (mar.), april (apr.), mai (mai), juni (jun.), juli (jul.), august (aug.), september (sep.), oktober (okt.), november (nov.), desember (des.)
AM/PM: a.m./p.m.
However it seems inconsistent that it will not require the same number of charterers for all months. I also noticed that these annoying "." are not a part of the abbreviations in English:
> locale("en")
<locale>
Numbers: 123,456.78
Formats: %AD / %AT
Timezone: UTC
Encoding: UTF-8
<date_names>
Days: Sunday (Sun), Monday (Mon), Tuesday (Tue), Wednesday (Wed), Thursday (Thu), Friday (Fri), Saturday (Sat)
Months: January (Jan), February (Feb), March (Mar), April (Apr), May (May), June (Jun), July (Jul), August (Aug),
September (Sep), October (Oct), November (Nov), December (Dec)
AM/PM: AM/PM
It really is terrible inconvenient also because I believe it is somewhat rare to actually include the "." at all when registration dates with abbreviations (but that is really just based on personal preferences and experience). Any input is much appreciated.
Upvotes: 4
Views: 263
Reputation: 85
A solution similar to and inspired @Andrew Gustar is creating your own date_names object:
loc <- locale("nb")
myNo <- date_names(mon = loc$date_names$mon,
mon_ab = substr(loc$date_names$mon_ab, 1, 3),
day = loc$date_names$day,
day_ab = substr(loc$date_names$day, 1, 3))
parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = locale(date_names = myNo))
[1] "2017-05-29" "2017-09-29"
Upvotes: 0
Reputation: 18435
You can edit the locale manually like this...
loc <- locale("nb")
loc$date_names$mon_ab <- substr(loc$date_names$mon_ab, 1, 3) #just take first 3 characters
parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = loc)
[1] "2017-05-29" "2017-09-29"
Upvotes: 1