Vitor Quix

Reputation: 174

Extract values from a string

If I have these strings:

df$value[1] = "3d 4H 59M"
df$value[2] = "7d 10H 46M"
df$value[3] = "12d 2H 4M"

d = days
H = Hours
M = Minutes

As you can see, the number of digits varies: days may have two digits while hours have only one. Each field (d, H, M) normally has 1 to 2 digits. How can I extract the value of each of d, H, and M in this situation?

Data

x <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

Upvotes: 1

Views: 586

Answers (3)

akrun

Reputation: 886948

We can do this easily with base R:

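# Replace each run of non-digits with a comma, read the numbers back,
# and drop the empty 4th column left by the trailing "M"
m1 <- matrix(scan(text = gsub("[^0-9]+", ",", x), what = numeric(),
        sep = ",", quiet = TRUE), nrow = length(x), byrow = TRUE)[, -4]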
colnames(m1) <- c("Days", "Hours", "Minutes")
m1
#     Days Hours Minutes
#[1,]    3     4      59
#[2,]    7    10      46
#[3,]   12     2       4
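
To see why the fourth column is dropped, it can help to look at the intermediate string. The following is only a sketch of the first step of the code above; the variable name `csv` is introduced here for illustration and is not part of the original answer.

```r
x <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

# Every run of non-digit characters becomes a single comma.
# The final "M" is also a run of non-digits, so each string ends in ","
csv <- gsub("[^0-9]+", ",", x)
csv
# [1] "3,4,59,"  "7,10,46," "12,2,4,"
```

The trailing comma is what leaves an empty field at the end of each line when the text is read back with `sep = ","`, which is why the answer removes column 4.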

Another option is the tidyverse: convert the vector into a tibble, separate it into three columns, and extract the number from each with parse_number. (data_frame() and mutate_all() are deprecated or superseded in current versions, so tibble() and across() are used here.)

library(tidyverse)
tibble(x = x) %>% 
     separate(x, into = c("Days", "Hours", "Minutes")) %>% 
     mutate(across(everything(), readr::parse_number))
# A tibble: 3 x 3
#   Days Hours Minutes
#  <dbl> <dbl>   <dbl>
#1     3     4      59
#2     7    10      46
#3    12     2       4

data

x <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

Upvotes: 0

h3rm4n

Reputation: 4187

With base R:

v <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

l <- lapply(strsplit(v, " "), function(v) as.numeric(sub("([0-9]+).*", "\\1", v)))

df <- setNames(do.call(rbind.data.frame, l), c("days","hours","minutes"))

This gives:

> df
  days hours minutes
1    3     4      59
2    7    10      46
3   12     2       4
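
For readers who want to follow the steps, here is a sketch of the intermediate results for the first string, followed by a matrix variant. The names `parts`, `first`, and `m` are illustrative, and the `vapply` version is an alternative written for this sketch, not the answerer's code.

```r
v <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

# Split each string on spaces: one character vector per input string
parts <- strsplit(v, " ")
parts[[1]]
# [1] "3d"  "4H"  "59M"

# Keep only the leading digits of each piece
first <- sub("([0-9]+).*", "\\1", parts[[1]])
first
# [1] "3"  "4"  "59"

# Alternative: vapply checks that every string yields exactly 3 numbers
m <- t(vapply(parts,
              function(p) as.numeric(sub("([0-9]+).*", "\\1", p)),
              numeric(3)))
colnames(m) <- c("days", "hours", "minutes")
```

With `vapply`, a string that does not split into exactly three pieces raises an error instead of silently producing a misaligned result.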

Upvotes: 1

akuiper

Reputation: 214927

You can use stringr::str_match:

library(stringr)

values = c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

dhm <- str_match(values, "([0-9]{1,2})d ([0-9]{1,2})H ([0-9]{1,2})M")[,-1]
storage.mode(dhm) <- "integer"
colnames(dhm) <- c("Days", "Hours", "Minutes")

dhm
#     Days Hours Minutes
#[1,]    3     4      59
#[2,]    7    10      46
#[3,]   12     2       4
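
If stringr is not available, the same capture-group idea can be sketched in base R with `regexec` and `regmatches`. This is a swapped-in technique, not part of the answer above; the names `pattern`, `hits`, and `dhm_base` are chosen here for illustration.

```r
values <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")
pattern <- "([0-9]{1,2})d ([0-9]{1,2})H ([0-9]{1,2})M"

# regexec() locates the full match and each capture group;
# regmatches() returns them as one character vector per input string
hits <- regmatches(values, regexec(pattern, values))

# Drop the full match (first element) and keep the three groups as integers
dhm_base <- t(vapply(hits, function(h) as.integer(h[-1]), integer(3)))
colnames(dhm_base) <- c("Days", "Hours", "Minutes")
dhm_base
```

Note that a string that does not match the pattern yields an empty vector from `regmatches`, so `vapply` stops with an error in this sketch rather than returning a row of `NA`s.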

Upvotes: 5
