Reputation: 93
(I actually came up with a solution but that didn't satisfy my desire for simplicity and intuitiveness, therefore here I state my question and solution while waiting for a nice and neat solution.)
I have a data with one column being Year
and the other being Month
, while the month is in the format of string:
Country Month Year Type
<fct> <chr> <dbl> <fct>
1 Argentina June 1975 Currency
2 Argentina February 1981 Currency
3 Argentina July 1982 Currency
I am trying to combine the Month and Year column to a single column Date
, which is in the format of date
.
My first try was to use mapply
, with the help of lubridate
and a little function of my that transforms month
from string to int.
months = c("January", "February", "March", "April", 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December')
month_num = c(1:12)
names(month_num) = months
crisis$Date = mapply(function(y, m){
m = month_num[m]
d = make_date(y,m)
return(d)
},crisis$Year, crisis$Month)
However this didn't turn out to be what I want:
Country Month Year Type Date
<fct> <chr> <dbl> <fct> <list>
1 Argentina June 1975 Currency <date [1]>
2 Argentina February 1981 Currency <date [1]>
3 Argentina July 1982 Currency <date [1]>
4 Argentina September 1986 Currency <date [1]>
, as the Date
column is list format.
With some help from this post and some manipulation on unlisting it and turning it back to date object, I managed to get the result I want:
crisis$Date = as_date(unlist(mapply(function(y, m){
m = month_num[m]
d = make_date(y,m)
return(d)
},crisis$Year, crisis$Month, SIMPLIFY = FALSE)))
The result is
Country Month Year Type Date
<fct> <chr> <dbl> <fct> <date>
1 Argentina June 1975 Currency 1975-06-01
2 Argentina February 1981 Currency 1981-02-01
3 Argentina July 1982 Currency 1982-07-01
4 Argentina September 1986 Currency 1986-09-01
This is so far fine to deal with, but I believe there are better solutions.
Upvotes: 2
Views: 1667
Reputation: 93
So after the help from @Gram and @det, I came up with my solution.
I am a new learner in R so I didn't realize some of the R-ish style of handling datas, therefore tried to make every thing done in one single line of code. Thanks to some tips from Gram's answer, I somehow learned to clear my code by adding auxilary columns instead (which is similar to excel).
Consider that there might be situations in the future where the correspondence might not simply be from 1:12 to months, and to make things more general for future utilization, I create a new data.frame
just to store all the information about months:
month_ref = data.frame(num = 1:12, Month = c("January", "February", "March", "April", 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'))
num Month
1 1 January
2 2 February
3 3 March
4 4 April
Now the idea is to "combine" the two dataframes, matching the Month
column to numerical numbers. This is exactly like the VLOOKUP
function in excel, and with help from this post, I now have a dataframe with a column of numbers
crisis = crisis %>%
inner_join(month_ref, by=c("Month"))
Country Month Year Type num
<fct> <chr> <dbl> <fct> <int>
1 Argentina June 1975 Currency 6
2 Argentina February 1981 Currency 2
3 Argentina July 1982 Currency 7
4 Argentina September 1986 Currency 9
I can then handle my dataframe with a neat column of month in number, which is much more easier and readable than handling the parsing in a custom function in mutate()
.
crisis = crisis %>%
inner_join(month_ref, by="Month") %>%
mutate(
Date = lubridate::ymd(paste(Year, num, "01", sep="-"))
) %>%
select(-c(num, Month, Year))
Country Type Date
<fct> <fct> <date>
1 Argentina Currency 1975-06-01
2 Argentina Currency 1981-02-01
3 Argentina Currency 1982-07-01
Upvotes: 1
Reputation: 683
You can convert month to a number, and then from there to a date:
df %>%
mutate(
Month = base::match(Month, base::month.name),
Date = as.Date(paste(Year, '-', Month, '-01', sep=''))
) %>%
select(-c(Month, Year))
# A tibble: 3 x 3
# Country Type Date
# <chr> <chr> <date>
# 1 Argentina Currency 1975-06-01
# 2 Argentina Currency 1981-02-01
# 3 Argentina Currency 1982-07-01
Does this help?
I provided the dataframe below:
library(tibble)
df <- tibble(
Country = 'Argentina',
Month = c('June', 'February', 'July'),
Year = c(1975, 1981, 1982),
Type = 'Currency'
)
Upvotes: 2