Erdem Akca

Reputation: 41

How can I remove the extra numbers and characters from my date and convert a character column to a date column?

I have a dataframe with a column named perioden. This column contains dates, but they are written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00, etc. When I look at the structure, the column is also of type character. I've tried multiple solutions but am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that the column shows only the year?

Upvotes: 1

Views: 68

Answers (2)

akrun

Reputation: 886938

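We can use ymd() from lubridate with the truncated option. Setting truncated = 2 lets the parser accept input that is missing up to two trailing components, here the month and the day, so a bare year is parsed as January 1 of that year: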

library(lubridate)
library(stringr)
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"

data

df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
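As a rough sketch of how this could be applied to the column from the question, assuming it is named perioden as described there. The new column names date and year are made up for illustration; year() is lubridate's accessor for the year component of a date.

```r
library(lubridate)
library(stringr)

# Example data using the column name from the question
df <- data.frame(perioden = c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00"),
                 stringsAsFactors = FALSE)

# Strip "JJ" plus any trailing digits, then parse the bare year;
# truncated = 2 allows the month and day to be missing
df$date <- ymd(str_remove(df$perioden, "JJ\\d+"), truncated = 2)

# Hypothetical extra column: keep only the year as a number
df$year <- year(df$date)

str(df)
```

A Date always carries a month and a day, so if only the year should be visible, a separate numeric year column like the one above is probably the simplest option.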

Upvotes: 0

Duck

Reputation: 39585

You can try this approach: use gsub() to remove the unwanted text (as suggested by @AllanCameron), then use paste0() to append a month and day, and as.Date() to convert the result to a date:

# Data
df <- data.frame(Date = c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
# Remove the 'JJ00' suffix so only the year remains (still a character column)
df$Date <- gsub('JJ00', '', df$Date)
# Convert to Date; as.Date() needs a month and a day, so append '-01-01'
df$Date2 <- as.Date(paste0(df$Date, '-01-01'))

Output:

  Date      Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
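If only the year should be shown, a base R sketch along these lines may help. It assumes every value ends in the literal suffix JJ00, as in the question; the variable names (year_chr, year_int, date_col) are invented for this example.

```r
# Example values in the format from the question
perioden <- c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00")

# Drop the trailing "JJ00" (anchored at the end of the string)
year_chr <- sub("JJ00$", "", perioden)

# Option 1: store the year as an integer
year_int <- as.integer(year_chr)

# Option 2: store a real Date (January 1 of each year) ...
date_col <- as.Date(paste0(year_chr, "-01-01"))

# ... and format it as a year-only string when displaying it
format(date_col, "%Y")
```

Which option fits depends on what happens next: an integer year is convenient for grouping and plotting by year, while a Date column is needed for date arithmetic.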

Upvotes: 1
