Reputation: 131

How to compare the increase or decrease of multiple measurements two-by-two?

I have a data where I would like to compare the measurements two-by-two by taking the difference and say if the difference is more than 0.2 add another column as Yes, otherwise No. Like the example data in the image. And, at the end for each individual if there is at least one Yes, the final is Yes. I have filled the first row manually, but I would like to do the same for all my individuals(1000 individuals).enter image description here

library(readxl)
dd <- read_excel("dd.xlsx")
dd <- dput(dd)
structure(list(ID = c(1, 2, 3, 4, 5, 6), m1 = c("2.1", "1.4","NA", "4.0", "2.5", "NA"), 
               m2 = c("2.8", "1.5", "NA", "4.0", "3.8", "1.1"),
               m3 = c("3.5", "1.5", "NA", "4.0", "NA", "1.3"), 
               m4 = c("NA", "1.8", "1.8", "4.0", "NA", "1.3"), 
               m5 = c("NA","1.5", "2.9", "NA", "NA", "1.5"), 
               m6 = c("NA", "NA", "3.5", "NA", "NA", "1.2")), 
          class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))

Upvotes: 2

Answers (2)

TarJae

Reputation: 78937

Here is another tidyverse approach:

library(dplyr)
library(tibble)

dd %>%
  type.convert(as.is =TRUE) %>% 
  select(-1) %>% 
  mutate(across(-1, .names = "{.col}-{names(.)[match(.col, 
         names(.))-1]}") - across(-last_col())) %>% 
  mutate(across(contains("-"), ~ifelse(is.na(.), NA, "YES"), .names = "r({.col})")) %>% 
  mutate(Final = ifelse(rowSums(dd == "YES") > 0, "No", "Yes")) %>% 
  add_column(ID = dd$ID, .before = "m1")

     ID    m1    m2    m3    m4    m5    m6 `m2-m1` `m3-m2` `m4-m3` `m5-m4` `m6-m5` `r(m2-m1)` `r(m3-m2)` `r(m4-m3)` `r(m5-m4)` `r(m6-m5)` Final
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>      <chr>      <chr>      <chr>      <chr>      <chr>
1     1   2.1   2.8   3.5  NA    NA    NA     0.7       0.7    NA      NA      NA   YES        YES        NA         NA         NA         Yes  
2     2   1.4   1.5   1.5   1.8   1.5  NA     0.100     0       0.3    -0.3    NA   YES        YES        YES        YES        NA         Yes  
3     3  NA    NA    NA     1.8   2.9   3.5  NA        NA      NA       1.1     0.6 NA         NA         NA         YES        YES        Yes  
4     4   4     4     4     4    NA    NA     0         0       0      NA      NA   YES        YES        YES        NA         NA         Yes  
5     5   2.5   3.8  NA    NA    NA    NA     1.3      NA      NA      NA      NA   YES        NA         NA         NA         NA         Yes  
6     6  NA     1.1   1.3   1.3   1.5   1.2  NA         0.2     0       0.2    -0.3 NA         YES        YES        YES        YES        Yes

Upvotes: 2

langtang

Reputation: 24722

This approach uses a pivot longer to estimate your columns, and then pivot wider with rename will give you what you want (See second solution). However, if you just want the Final column, by ID, just do this:

Simple approach to just getting Final by ID

dd %>% 
  pivot_longer(cols = !ID) %>%
  mutate(value=as.numeric(value)) %>% 
  group_by(ID) %>% 
  summarize(Final = any(value-lag(value)>0.2))

Extended approach if you really want all those intermediate variables

dd %>% 
  pivot_longer(cols = !ID) %>%
  mutate(value=as.numeric(value)) %>% 
  group_by(ID) %>% 
  mutate(diff=value-lag(value),
         gt=if_else(diff>0.2, "Yes", "No")) %>% 
  pivot_wider(id_cols = ID,names_from = name,values_from = value:gt) %>% 
  rowwise() %>% 
  mutate(Final = any(c_across(starts_with("gt"))=="Yes")) %>% 
  select(!c(diff_m1,gt_m1)) %>% 
  rename_with(~c("ID", "m1","m2","m3","m4","m5","m6",
                 "m2-m1", "m3-m2", "m4-m3", "m5-m4", "m6-m5",
                 "r(m2-m1)", "r(m3-m2)", "r(m4-m3)", "r(m5-m4)", "r(m6-m5)",
                 "Final"), .cols = everything()
  )

Output:

     ID    m1    m2    m3    m4    m5    m6 `m2-m1` `m3-m2` `m4-m3` `m5-m4` `m6-m5` `r(m2-m1)` `r(m3-m2)` `r(m4-m3)`
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>      <chr>      <chr>     
1     1   2.1   2.8   3.5  NA    NA    NA     0.7       0.7    NA      NA      NA   Yes        Yes        NA        
2     2   1.4   1.5   1.5   1.8   1.5  NA     0.100     0       0.3    -0.3    NA   No         No         Yes       
3     3  NA    NA    NA     1.8   2.9   3.5  NA        NA      NA       1.1     0.6 NA         NA         NA        
4     4   4     4     4     4    NA    NA     0         0       0      NA      NA   No         No         No        
5     5   2.5   3.8  NA    NA    NA    NA     1.3      NA      NA      NA      NA   Yes        NA         NA        
6     6  NA     1.1   1.3   1.3   1.5   1.2  NA         0.2     0       0.2    -0.3 NA         No         No        
# … with 3 more variables: `r(m5-m4)` <chr>, `r(m6-m5)` <chr>, Final <lgl>

Upvotes: 2

How to compare the increase or decrease of multiple measurements two-by-two?

Answers (2)

Related Questions