Reputation: 53
I'm new to R, please help. I have a data frame with 5 columns named Seasondate, V1, V2, V3, and V4. The Seasondate column has around 1000 observations in different date formats, like:
January to March
August to October
05/01/2013 to 10/30/2013
NA
February to June
02/15/2013 to 06/19/2013
I would like to bring all of them into one format, for example Month to Month.
A solution that parses them with string functions would be highly appreciated.
Edit 1:
All of them are in the same year, 2013. Thanks.
Upvotes: 1
Views: 76
Reputation: 99331
Here's another idea that doesn't use date coercion, but uses the month.name vector from base R.
## change the column to character
df$V1 <- as.character(df$V1)
## find the numeric values
g <- grepl("\\d", df$V1)
## split them, get the months, then apply 'month.name' and paste
df$V1[g] <- vapply(strsplit(df$V1[g], " to "), function(x) {
    paste(month.name[as.integer(sub("/.*", "", x))], collapse = " to ")
}, "")
Resulting in
df
                 V1
1  January to March
2 August to October
3    May to October
4              <NA>
5  February to June
6  February to June
Original Data:
df <- structure(list(V1 = structure(c(5L, 3L, 2L, NA, 4L, 1L),
    .Label = c("02/15/2013 to 06/19/2013", "05/01/2013 to 10/30/2013",
               "August to October", "February to June", "January to March"),
    class = "factor")), .Names = "V1", class = "data.frame",
    row.names = c(NA, -6L))
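The same month.name lookup can be wrapped in a small function so it can be applied to any column, such as the question's Seasondate. This is only a sketch: the function name to_month_range and the sample vector season are made up for illustration, and it assumes every numeric entry has the form mm/dd/yyyy to mm/dd/yyyy.

```r
# Hypothetical helper (name is an assumption, not from the answer above):
# turns "mm/dd/yyyy to mm/dd/yyyy" into "Month to Month" via month.name.
to_month_range <- function(x) {
  x <- as.character(x)
  # entries containing digits are the numeric date ranges; NA stays untouched
  numeric_rows <- !is.na(x) & grepl("\\d", x)
  x[numeric_rows] <- vapply(
    strsplit(x[numeric_rows], " to ", fixed = TRUE),
    function(parts) {
      # keep only the text before the first "/", i.e. the month number
      month_num <- as.integer(sub("/.*", "", parts))
      paste(month.name[month_num], collapse = " to ")
    },
    character(1)
  )
  x
}

# Made-up sample mirroring the formats in the question
season <- c("January to March", "05/01/2013 to 10/30/2013", NA,
            "02/15/2013 to 06/19/2013")
result <- to_month_range(season)
print(result)
```

With a data frame, the call would look like dat$Seasondate <- to_month_range(dat$Seasondate), assuming the data frame is called dat.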
Upvotes: 1
Reputation: 93813
Do some formatting back and forth using as.Date and format, then paste it all together again:
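## format mm/dd/yyyy strings as month names; leave other strings unchanged
datext <- function(x) {
  dates <- as.Date(x, format="%m/%d/%Y")
  if(all(is.na(dates))) x else format(dates, "%B")
}
## split each range on " to ", convert both parts, then paste them back together
vapply(
  lapply(strsplit(as.character(dat$Seasondate), " to "), datext),
  paste, collapse=" to ", FUN.VALUE=character(1)
)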
#[1] "January to March"  "August to October" "May to October"
#[4] "NA"                "February to June"  "February to June"
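Note that the missing value comes back as the string "NA" in the output above, because paste converts NA to text, and that format(dates, "%B") returns month names in the current locale. If a real NA and English month names are wanted, one possible variation is sketched below. The name datext_na and the sample vector season are assumptions for illustration only.

```r
# Hypothetical variant of datext (the name datext_na is made up):
# returns NA_character_ for missing input and uses month.name, so the
# month names do not depend on the session locale.
datext_na <- function(x) {
  # a missing range stays missing instead of becoming the string "NA"
  if (all(is.na(x))) return(NA_character_)
  dates <- as.Date(x, format = "%m/%d/%Y")
  # not parseable as dates: assume it is already "Month to Month"
  if (all(is.na(dates))) return(paste(x, collapse = " to "))
  # "%m" gives the month number, which indexes into month.name
  paste(month.name[as.integer(format(dates, "%m"))], collapse = " to ")
}

# Made-up sample mirroring the formats in the question
season <- c("January to March", "05/01/2013 to 10/30/2013", NA,
            "02/15/2013 to 06/19/2013")
result <- vapply(strsplit(season, " to ", fixed = TRUE), datext_na, character(1))
print(result)
```

Doing the paste inside the helper keeps the NA check in one place, at the cost of no longer reusing the original two-step lapply and paste structure.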
Upvotes: 2