Reputation: 174
If I have these strings:
df$value[1] = "3d 4H 59M"
df$value[2] = "7d 10H 46M"
df$value[3] = "12d 2H 4M"
d = days
H = Hours
M = Minutes
As you can see, a record may give the days with two digits and the hours with one. Normally each of the d, H, and M fields has one or two digits. How can I extract the d, H, and M values in this situation?
Data
x <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")
Upvotes: 1
Views: 586
Reputation: 886948
We can do this more easily with base R:
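# Replace every run of non-digits with a comma, read the numbers with scan(),
# then drop the 4th column, which comes from the trailing comma on each string
m1 <- matrix(scan(text = gsub("[^0-9]+", ",", x), what = numeric(),
                  sep = ",", quiet = TRUE), nrow = length(x), byrow = TRUE)[, -4]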
colnames(m1) <- c("Days", "Hours", "Minutes")
m1
# Days Hours Minutes
#[1,] 3 4 59
#[2,] 7 10 46
#[3,] 12 2 4
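To see why a fourth column has to be dropped, it may help to look at what the gsub step produces on its own. This is only a small sketch; the name cleaned is introduced here for illustration and is not part of the answer.

```r
x <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")

# Every run of non-digit characters ("d ", "H ", and the final "M")
# is replaced by a single comma
cleaned <- gsub("[^0-9]+", ",", x)
cleaned
# [1] "3,4,59,"  "7,10,46," "12,2,4,"
```

Each string ends in a comma because of the trailing "M", so reading it with sep = "," leaves an extra empty field at the end of every record.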
Another option is the tidyverse: first convert the vector into a data_frame, separate it into three columns, and extract the numbers with parse_number.
library(tidyverse)
data_frame(x = x) %>%
separate(x, into = c("Days", "Hours", "Minutes")) %>%
mutate_all(readr::parse_number)
# A tibble: 3 x 3
# Days Hours Minutes
# <dbl> <dbl> <dbl>
#1 3 4 59
#2 7 10 46
#3 12 2 4
Upvotes: 0
Reputation: 4187
With base R:
v <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")
l <- lapply(strsplit(v, " "), function(v) as.numeric(sub("([0-9]+).*", "\\1", v)))
df <- setNames(do.call(rbind.data.frame, l), c("days","hours","minutes"))
You get:
> df
days hours minutes
1 3 4 59
2 7 10 46
3 12 2 4
Upvotes: 1
Reputation: 214927
You can use stringr::str_match:
library(stringr)
values = c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")
dhm <- str_match(values, "([0-9]{1,2})d ([0-9]{1,2})H ([0-9]{1,2})M")[,-1]
storage.mode(dhm) <- "integer"
colnames(dhm) <- c("Days", "Hours", "Minutes")
dhm
# Days Hours Minutes
#[1,] 3 4 59
#[2,] 7 10 46
#[3,] 12 2 4
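If stringr is not available, the same capture-group idea can probably be reproduced in base R with regexec and regmatches. This is a sketch under that assumption, reusing the pattern from the answer; the names pattern and parts are introduced here for illustration.

```r
values <- c("3d 4H 59M", "7d 10H 46M", "12d 2H 4M")
pattern <- "([0-9]{1,2})d ([0-9]{1,2})H ([0-9]{1,2})M"

# regexec() locates the full match and each capture group;
# regmatches() returns them as one character vector per input string
parts <- regmatches(values, regexec(pattern, values))

# drop the full match (first element) and keep the three groups as integers
dhm <- do.call(rbind, lapply(parts, function(p) as.integer(p[-1])))
colnames(dhm) <- c("Days", "Hours", "Minutes")
dhm
```

As with the str_match version, the {1,2} quantifier means a value with three or more digits would not match as a whole field, so widen it if longer values can occur.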
Upvotes: 5