Reputation: 41
I have a data frame with a column named perioden. This column contains dates, but they are written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00, etc. The column is also of type character when I look at the structure. I've tried multiple solutions but am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that only the year is shown?
Upvotes: 1
Views: 68
Reputation: 886938
We can use ymd() with the truncated option:
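library(lubridate)
library(stringr)
# Example data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
# Drop the 'JJ' suffix and its digits, then parse the bare year;
# truncated = 2 lets ymd() accept input with month and day missing
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"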
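If only the year is wanted rather than a full date, one possible follow-up is to pass the parsed dates to year() from lubridate. This is only a sketch under the same assumptions as above (a character column called Date whose values all start with a four-digit year); the names dates and years are made up for illustration.

```r
library(lubridate)
library(stringr)

# Example data, same shape as in the answer above
df <- data.frame(Date = c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00"),
                 stringsAsFactors = FALSE)

# Parse the leading year into a Date (1 January of that year)
dates <- ymd(str_remove(df$Date, "JJ\\d+"), truncated = 2)

# Extract just the year component as a number (hypothetical variable name)
years <- year(dates)
```

Keeping both dates and years around means the column can be used for date arithmetic while still being displayed as a plain year.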
Upvotes: 0
Reputation: 39585
You can try this approach: use gsub() to remove the unwanted text (as suggested by @AllanCameron), then paste0() to append a month and day, and as.Date() to convert the result to a date:
#Data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
#Remove the 'JJ00' suffix, leaving only the year (still a character column)
df$Date <- gsub('JJ00', '', df$Date)
#Convert to date; as.Date() needs a month and day, so append '-01-01'
df$Date2 <- as.Date(paste0(df$Date, '-01-01'))
Output:
  Date      Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
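For the column name used in the question, a base R variant might look like the sketch below. It assumes the column is called perioden and that every value is a four-digit year followed by JJ and some digits; the new column names year and date are hypothetical.

```r
# Hypothetical data frame mirroring the question's column name
df <- data.frame(perioden = c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00"),
                 stringsAsFactors = FALSE)

# Strip the trailing "JJ" plus digits and store the year as an integer
df$year <- as.integer(sub("JJ[0-9]+$", "", df$perioden))

# Build a true Date column (1 January of each year) if one is needed
df$date <- as.Date(paste0(df$year, "-01-01"))

# Show only the year of a Date column, as character strings
year_labels <- format(df$date, "%Y")
```

Using sub() with an anchored pattern rather than a fixed 'JJ00' string also handles suffixes such as JJ01, should they occur in the data.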
Upvotes: 1