user3077008

Reputation: 847

Extracting and Splitting numbers and characters from string in R

I am trying to extract and split numbers and characters from strings. I also want to drop the trailing characters and numbers at the end of each string. For example, I have the following strings:

dm <- c("2December2005MOMENT55", "3December2005ROYALS56", "1July2012ANGELS57")

I want to turn them into the following:

Day Month    Year
2   December 2005
3   December 2005
1   July     2012

That is, I want to split out the day, month and year and store each in its own variable.

I tried this with the strsplit command but could not get very far, so I am sorry that I have no code to show.

Any command or code suggestions would be appreciated. Thank you!

Upvotes: 2

Views: 153

Answers (2)

johannes

Reputation: 14433

Here is a regex solution:

library(stringr)
# [A-Za-z] rather than [A-z]: the latter also matches [ \ ] ^ _ and the backtick
str_match(dm, "^([0-9]{1,2})([A-Za-z]+)([0-9]{4})")[, 2:4]
##      [,1] [,2]       [,3]  
## [1,] "2"  "December" "2005"
## [2,] "3"  "December" "2005"
## [3,] "1"  "July"     "2012"
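If the stringr dependency is unwanted, the same pattern can be applied with base R. The following is only a sketch, assuming every string matches the pattern; the names m, parts and res are illustrative and not part of the answer above. It also names the columns Day, Month and Year as the question asks.

```r
dm <- c("2December2005MOMENT55", "3December2005ROYALS56", "1July2012ANGELS57")

# regexec() returns the positions of the full match and of each capture group;
# regmatches() turns those positions into character vectors of length 4.
m <- regmatches(dm, regexec("^([0-9]{1,2})([A-Za-z]+)([0-9]{4})", dm))

# Stack the per-string results into a matrix and drop column 1 (the full match).
# Assumption: every element of dm matches; a non-matching string would yield
# character(0) and break the rbind step.
parts <- do.call(rbind, m)[, 2:4, drop = FALSE]

res <- data.frame(
  Day   = as.integer(parts[, 1]),
  Month = parts[, 2],
  Year  = as.integer(parts[, 3]),
  stringsAsFactors = FALSE
)
```

Converting Day and Year to integers is a design choice here; leave them as character if they are only used for display.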

Upvotes: 1

mnel

Reputation: 115382

  1. Convert the strings to Date objects with the format '%d%B%Y' (this fits the provided example; as.Date ignores the trailing characters).
  2. Use year, mday and month to build the data.frame you want.

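# mday(), month() and year() are not base R; data.table is assumed here
# (lubridate exports functions with the same names).
library(data.table)
df <- data.frame(string = dm, date = as.Date(dm, format = '%d%B%Y'))
df[c('Day','Month','Year')] <- with(df, list(mday(date), 
                                             month.name[month(date)],
                                             year(date)))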
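The same two steps can also be sketched without any extra package by using format() on the parsed dates. This is an illustrative variant, not the answer's own code; it assumes an English locale, because '%B' parses month names in the current locale. The names dates and res are hypothetical.

```r
dm <- c("2December2005MOMENT55", "3December2005ROYALS56", "1July2012ANGELS57")

# Step 1: parse day, full month name and 4-digit year; the rest of each
# string is ignored by as.Date(). Assumes English month names in the locale.
dates <- as.Date(dm, format = "%d%B%Y")

# Step 2: pull the components back out with format(); month.name is indexed
# by the numeric month so the output does not depend on the locale.
res <- data.frame(
  Day   = as.integer(format(dates, "%d")),
  Month = month.name[as.integer(format(dates, "%m"))],
  Year  = as.integer(format(dates, "%Y")),
  stringsAsFactors = FALSE
)
```

Any string that does not start with a valid date parses to NA, so checking anyNA(dates) before building the data.frame is a cheap sanity test.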

Upvotes: 4
