Ways to full_join with alternating "non-joined duplicate variables" instead of .x,.x,.x,.y,.y,.y format, and corresponding calculation

Question

library(dplyr)
x1 <- tibble(ID = rep(1,4),
             Measure = c("Height","Weight","Width","Length"),
             AD1= rpois(4,10),
             AD2= rpois(4,9))
x2 <- tibble(ID = rep(2,4),
             Measure = c("Height","Weight","Width","Length"),
             AD1= rpois(4,20),
             AD2= rpois(4,19))

xdiff <- x1 %>%
  full_join(x2,by="Measure",
            suffix=c("_1","_2")) %>%
  mutate(diff1 = AD1_1-AD1_2,
         diff2 = AD2_1-AD2_2)

   ID_1 Measure AD1_1 AD2_1  ID_2 AD1_2 AD2_2 diff1 diff2
            
1     1 Height     10    12     2    20    15   -10    -3
2     1 Weight      7    10     2    11    16    -4    -6
3     1 Width      12    11     2    22    17   -10    -6
4     1 Length     11     9     2    24    16   -13    -7

xdesire <- xdiff %>%
  select(Measure,AD1_1,AD1_2,diff1,AD2_1,AD2_2,diff2)

  Measure AD1_1 AD1_2 diff1 AD2_1 AD2_2 diff2
          
1 Height     10    20   -10    12    15    -3
2 Weight      7    11    -4    10    16    -6
3 Width      12    22   -10    11    17    -6
4 Length     11    24   -13     9    16    -7

full_join(), with default arguments, will add suffixes to the duplicate variables in sequential order of x,x,x,y,y,y. I could technically use select() to order the columns as desired, but if the data have large amount of columns, e.g., AD1:AD200, it would be very inefficient.

dply: order columns alphabetically in R shows method of order the columns alphabetically, but in my case, there is "difference" column that is needed between every variables (AD1,diff1,AD2,diff).

I was wondering if full_join() allows duplicate variables to be alternating, and if I could directly add the calculations with them through join() function.

Of course, other methods, such as more advanced select() argument is welcomed too.

akrun · Accepted Answer

An option is to order on the first numeric substring extracted with parse_number on the column names

library(dplyr)
xdiff %>%
  select(Measure, order(readr::parse_number(names(.))),  -starts_with("ID"))

-output

# A tibble: 4 x 7
#  Measure AD1_1 AD1_2 diff1 AD2_1 AD2_2 diff2
#          
#1 Height     10    25   -15     5    20   -15
#2 Weight     16    28   -12    13    15    -2
#3 Width      12    19    -7     9    25   -16
#4 Length     10    29   -19     4    23   -19

Ways to full_join with alternating "non-joined duplicate variables" instead of .x,.x,.x,.y,.y,.y format, and corresponding calculation

Answers (2)

Related Questions

Ways to full_join with alternating &quot;non-joined duplicate variables&quot; instead of .x,.x,.x,.y,.y,.y format, and corresponding calculation

Answers (2)

Related Questions

Ways to full_join with alternating "non-joined duplicate variables" instead of .x,.x,.x,.y,.y,.y format, and corresponding calculation