Dinho

Reputation: 724

Use REGEX in R to extract specific string in value as a new column?

I have a column containing strings that look like this:

Current

111111~24-JUL-17 10:43:36~6.14

Desired Output

24-JUL-17 10:43:36

I want to extract everything between the two '~' characters (the date/time) and discard everything else.

I have this code right now, but it only seems to return part of the value:

df$Last <- gsub(".+\\s(.+)$", "\\1", df$col1)

Upvotes: 1

Views: 44

Answers (2)

akrun

Reputation: 886948

We can use sub in base R, capturing the run of non-'~' characters that sits between the two '~' delimiters:

df$c1 <- sub(".*~([^~]+)~.*", "\\1", df$c1)
df$c1
#[1] "24-JUL-17 10:43:36" "24-JUL-21 10:34:36"

data

df <- data.frame(c1 = c('111111~24-JUL-17 10:43:36~6.14',
       '111111~24-JUL-21 10:34:36~6.14'))
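
To make the difference between the two patterns concrete, here is a minimal sketch. It assumes every valid value has exactly two '~' delimiters; the second sample string and the variable names are hypothetical, added only for illustration.

```r
# Sample values (the second one has no '~' and is purely illustrative)
x <- c("111111~24-JUL-17 10:43:36~6.14", "no delimiters here")

# Pattern from the question: ".+" is greedy, so it runs up to the only
# whitespace character and the capture starts after it, in the time part.
attempt <- gsub(".+\\s(.+)$", "\\1", x[1])

# Pattern from this answer: capture the run of non-'~' characters
# that sits between two '~'.
extracted <- sub(".*~([^~]+)~.*", "\\1", x)

# sub() leaves a string untouched when the pattern does not match,
# so flag such rows explicitly instead of keeping the raw value.
extracted[!grepl("~[^~]+~", x)] <- NA_character_

attempt    # "10:43:36~6.14"
extracted  # "24-JUL-17 10:43:36" NA
```

Whether unmatched rows should become NA or keep the original value depends on the data; NA is only one reasonable choice.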

Upvotes: 1

Karthik S

Reputation: 11584

We can use tidyr's separate to get the result below:

library(dplyr)
library(tidyr)
df <- data.frame(c1 = c('111111~24-JUL-17 10:43:36~6.14','111111~24-JUL-21 10:34:36~6.14'))
df
                              c1
1 111111~24-JUL-17 10:43:36~6.14
2 111111~24-JUL-21 10:34:36~6.14
df %>% separate(col = c1, into = c('x','Date','y'), sep = '~') %>% select(2)
                Date
1 24-JUL-17 10:43:36
2 24-JUL-21 10:34:36
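
If adding tidyr is not an option, the same split-and-pick idea can be sketched in base R. This assumes the date/time is always the second '~'-separated field; the names x, parts and Date are just illustrative.

```r
x <- c("111111~24-JUL-17 10:43:36~6.14",
       "111111~24-JUL-21 10:34:36~6.14")

# Split each string on the literal '~' (fixed = TRUE skips regex parsing)
parts <- strsplit(x, "~", fixed = TRUE)

# Take the second piece of every split; vapply guarantees a character vector
Date <- vapply(parts, `[`, character(1), 2)
Date
# [1] "24-JUL-17 10:43:36" "24-JUL-21 10:34:36"
```

A string with fewer than two fields would give NA here rather than an error, since indexing past the end of a character vector returns NA.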
 

Using the stringr package:

library(dplyr)
library(stringr)
df %>% mutate(c1 = str_extract(c1, '(?<=~).*(?=~)'))
                  c1
1 24-JUL-17 10:43:36
2 24-JUL-21 10:34:36
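
One caveat, sketched below in base R so it runs without extra packages: the '.*' in the lookaround pattern is greedy, so a value with more than two '~' would match from the first '~' to the last. Using '[^~]+' instead stops at the next '~'. The second sample string is hypothetical and only there to show the difference.

```r
x <- c("111111~24-JUL-17 10:43:36~6.14", "a~b~c~d")

# Greedy version: spans from the first '~' to the last one
greedy <- regmatches(x, regexpr("(?<=~).*(?=~)", x, perl = TRUE))

# Negated class: stops at the first closing '~'
strict <- regmatches(x, regexpr("(?<=~)[^~]+(?=~)", x, perl = TRUE))

greedy  # "24-JUL-17 10:43:36" "b~c"
strict  # "24-JUL-17 10:43:36" "b"
```

For the data in the question both patterns give the same result, since each value has exactly two '~'.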

Upvotes: 2
