Rebel_47
Rebel_47

Reputation: 79

How do i convert my date values into year in r

another day with new complex faced

Below are the columns and rows that I have as input:

ID  Age
123 23 Years 1 Month 2 Days
125 28 Years 9 Month 14 Days
126 28 years
127 34 YEAR
128 35 Years 8 Month 21 Days
129 38 Years 5 Month 25 Days
130 32.8

I need them as yearly calculated in new columns like:

ID  Age                      Age_new
123 23 Years 1 Month 2 Days     23.1
125 28 Years 9 Month 14 Days    28.9
126 28 years                    28
127 34 YEAR                     34
128 35 Years 8 Month 21 Days    35.8
129 38 Years 5 Month 25 Days    38.5
130 32.8                        32.8

I have tried the by stringr package but I get only first character string which doesn't provide like the above.

Upvotes: 0

Views: 667

Answers (2)

Joe Christian
Joe Christian

Reputation: 11

This is my approach. I always try to avoid regex since it's too scary for me. If your data is exactly separated like your example, I think my code will work. I completely understand this is not the most efficient way. but heyy it works

dat %>% 
  mutate(space_counter = stringr::str_count(Age," ")) %>% 
  tidyr::separate(Age,into = paste0("tmp_col_",1:(max(.$space_counter)+1)),sep = " ") %>% 
  select(ID, tmp_col_1,tmp_col_3,tmp_col_5) %>% 
  setNames(c("ID","year","month","day")) %>% 
  mutate(across(everything(), ~replace_na(.x, 0))) %>% 
  mutate_if(is.character,as.integer) %>% 
  mutate(asdur = as.duration(years(year) + months(month) + days(day))) %>% 
  mutate(age_new = as.numeric(asdur)/3.154e+7)

output: enter image description here

Upvotes: 0

r2evans
r2evans

Reputation: 160437

Here's a gross approximation:

func <- function(x, ptn) {
  out <- gsub(paste0(".*?\\b([0-9.]+)\\s*", ptn, ".*"), "\\1", x, ignore.case = TRUE)
  ifelse(out == x, NA, out)
}

library(dplyr)
dat %>%
  mutate(
    data.frame(
      lapply(c(yr = "year", mon = "month", day = "day"),
             function(ptn) as.numeric(func(Age, ptn)))
    ),
    yr = if_else(is.na(yr), suppressWarnings(as.numeric(Age)), yr),
    across(c(yr, mon, day), ~ coalesce(., 0)), New_Age = yr + mon/12 + day/365
  )
#    ID                      Age   yr mon day  New_Age
# 1 123  23 Years 1 Month 2 Days 23.0   1   2 23.08881
# 2 125 28 Years 9 Month 14 Days 28.0   9  14 28.78836
# 3 126                 28 years 28.0   0   0 28.00000
# 4 127                  34 YEAR 34.0   0   0 34.00000
# 5 128 35 Years 8 Month 21 Days 35.0   8  21 35.72420
# 6 129 38 Years 5 Month 25 Days 38.0   5  25 38.48516
# 7 130                     32.8 32.8   0   0 32.80000

(I offer no warranty on true accuracy.)


Data

dat <- structure(list(ID = c(123L, 125L, 126L, 127L, 128L, 129L, 130L), Age = c("23 Years 1 Month 2 Days", "28 Years 9 Month 14 Days", "28 years", "34 YEAR", "35 Years 8 Month 21 Days", "38 Years 5 Month 25 Days", "32.8")), class = "data.frame", row.names = c(NA, -7L))

Upvotes: 1

Related Questions