Reputation: 4889

conditionally mutating column values using `dplyr`

I am using WRS2 to carry out robust pairwise comparisons. One problem is that it drops the group level names from the comparison output and stores them in a separate element of the result.

# setup
set.seed(123)
library(WRS2)
library(tidyverse)

# robust pairwise comparisons
x <- lincon(libido ~ dose, data = viagra, tr = 0.1)

# comparisons
x$comp
#>      Group Group psihat  ci.lower    ci.upper    p.value
#> [1,]     1     2   -1.0 -3.440879  1.44087853 0.25984505
#> [2,]     1     3   -2.8 -5.536161 -0.06383861 0.04914871
#> [3,]     2     3   -1.8 -4.536161  0.93616139 0.17288911

# vector with group level names
x$fnames
#> [1] "placebo" "low"     "high"

I can convert it to a tibble:

# converting to tibble
suppressMessages(as_tibble(x$comp, .name_repair = "unique")) %>%
  dplyr::rename(group1 = Group...1, group2 = Group...2) 
#> # A tibble: 3 x 6
#>   group1 group2 psihat ci.lower ci.upper p.value
#>    <dbl>  <dbl>  <dbl>    <dbl>    <dbl>   <dbl>
#> 1      1      2   -1      -3.44   1.44    0.260 
#> 2      1      3   -2.8    -5.54  -0.0638  0.0491
#> 3      2      3   -1.8    -4.54   0.936   0.173

I would then like to replace the numeric values in the group columns with the actual names stored in fnames (so map 1 -> fnames[1], 2 -> fnames[2], and so on).

So the final dataframe should look something like the following:

#> # A tibble: 3 x 6
#>   group1  group2 psihat ci.lower ci.upper p.value
#>   <chr>   <chr>   <dbl>    <dbl>    <dbl>   <dbl>
#> 1 placebo low      -1      -3.44   1.44    0.260 
#> 2 placebo high     -2.8    -5.54  -0.0638  0.0491
#> 3 low     high     -1.8    -4.54   0.936   0.173

In this case, it was easy to just copy-paste the three values, but I want to have a generalizable approach where no matter the number of levels, it works. How can I do this using dplyr?

Upvotes: 3

Answers (4)

akrun

Reputation: 887711

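Here is a tidyverse option that uses a named vector as a lookup table. The lookup is by name (i.e. by value) rather than by positional index, so it would still work if the values in the 'Group' columns were not consecutive or were stored as character, as long as the names of the lookup vector are the actual codes.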

library(dplyr)
as_tibble(x$comp, .name_repair = 'unique') %>%
   mutate(across(starts_with("Group"), 
         ~ setNames(x$fnames, seq_along(x$fnames))[as.character(.)]))
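
To illustrate the point about matching by value, here is a minimal sketch with made-up data. The codes 10, 20, 40 and the objects `labels`, `codes`, `lookup`, and `comp` are assumptions for illustration only; they are not produced by WRS2. Positional indexing would fail here, while the named lookup should still resolve each code:

```r
library(dplyr)

# Hypothetical labels and deliberately non-consecutive codes (not WRS2 output)
labels <- c("placebo", "low", "high")
codes  <- c(10, 20, 40)

# Named vector: the names are the codes as strings ("10", "20", "40")
lookup <- setNames(labels, codes)

# Stand-in for the comparison table, using the repaired column names
comp <- tibble(
  Group...1 = c(10, 10, 20),
  Group...2 = c(20, 40, 40),
  psihat    = c(-1, -2.8, -1.8)
)

# Look each code up by name; unname() drops the leftover names on the result
res <- comp %>%
  mutate(across(starts_with("Group"), ~ unname(lookup[as.character(.x)])))

res
```

The `as.character()` call is what makes this a lookup by name; without it, R would treat the numbers as positions and return `NA` for 10, 20, and 40.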

Upvotes: 1

eipi10

Reputation: 93871

Here's an approach using the recode function, with the recoding vector built programmatically from the data:

# Setup
set.seed(123)
library(WRS2)
library(tidyverse)

x <- lincon(libido ~ dose, data = viagra, tr = 0.1)

# Create recoding vector
recode.vec = x$fnames %>% set_names(1:length(x$fnames))

# Recode columns
x.comp = x$comp %>% 
  as_tibble(.name_repair=make.unique) %>% 
  mutate(across(starts_with("Group"), ~recode(., !!!recode.vec)))

Output:

x.comp

#> # A tibble: 3 x 6
#>   Group   Group.1 psihat ci.lower ci.upper p.value
#>   <chr>   <chr>    <dbl>    <dbl>    <dbl>   <dbl>
#> 1 placebo low       -1      -3.44   1.44    0.260 
#> 2 placebo high      -2.8    -5.54  -0.0638  0.0491
#> 3 low     high      -1.8    -4.54   0.936   0.173

Upvotes: 2

Duck

Reputation: 39613

Try this tidyverse approach: extract the objects as tibbles, reshape the comparisons to long format, and use left_join() to attach the group names. Here is the code to get something close to what you want:

# setup
set.seed(123)
library(WRS2)
library(tidyverse)
# robust pairwise comparisons
x <- lincon(libido ~ dose, data = viagra, tr = 0.1)
#Transform to tibble
df1 <- suppressMessages(as_tibble(x$comp, .name_repair = "unique")) %>%
  dplyr::rename(group1 = Group...1, group2 = Group...2) 
#Extract labels
df2 <- tibble(treat=x$fnames) %>% mutate(value=1:n())
#Format to long df1
df1 <- df1 %>% 
  mutate(id=1:n()) %>%
  pivot_longer(cols = c(group1,group2)) %>%
  rename(group=name) %>% left_join(df2, by = "value") %>% select(-value) %>%
  pivot_wider(names_from = group,values_from=treat) %>% select(-id)

Output:

# A tibble: 3 x 6
  psihat ci.lower ci.upper p.value group1  group2
   <dbl>    <dbl>    <dbl>   <dbl> <chr>   <chr> 
1   -1      -3.44   1.44    0.260  placebo low   
2   -2.8    -5.54  -0.0638  0.0491 placebo high  
3   -1.8    -4.54   0.936   0.173  low     high

Upvotes: 0

Waldi

Reputation: 41260

Does this fulfill your needs?

names <- c("A","B","C")

df = data.frame(group=c(1,2,3))
library(dplyr)
df %>% mutate(group = names[group])

  group
1     A
2     B
3     C
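
The same positional indexing might be applied to both group columns at once with across(). This is only a sketch: `comp` and `fnames` below are hand-built stand-ins for `x$comp` and `x$fnames` (values copied from the question) so that it runs without WRS2, and it assumes the group codes are exactly the positions 1..n in `fnames`:

```r
library(dplyr)

# Stand-ins for x$comp and x$fnames (assumed shapes, values from the question)
comp <- matrix(
  c(1, 2, -1.0,
    1, 3, -2.8,
    2, 3, -1.8),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Group", "Group", "psihat"))
)
fnames <- c("placebo", "low", "high")

# Repair the duplicated column names, then index fnames by the group codes
out <- suppressMessages(as_tibble(comp, .name_repair = "unique")) %>%
  rename(group1 = Group...1, group2 = Group...2) %>%
  mutate(across(c(group1, group2), ~ fnames[.x]))

out
```

Because nothing here hard-codes the number of levels, it should generalize to any length of `fnames`, provided the codes really are positions.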

Upvotes: 1
