dkone

Reputation: 183

dplyr::select - using column more than once?

select(mtcars,foo=mpg,bar=mpg)

This will return a data frame with just one column - bar. It appears dplyr discards previous occurrences of a column, making multiple aliases for the same column impossible. Bug? Design? Workaround?

Upvotes: 3

Views: 1184

Answers (5)

akrun

Reputation: 887118

We can use map2 to select the column once per alias and then bind the results:

library(tidyverse)
library(rlang)
map2(c('mpg', 'mpg'), c('foo', 'bar'), ~ mtcars %>% 
          select(!! .y := !! rlang::sym(.x))) %>% 
  bind_cols

Another option is to replicate the selected column and set the names to the desired ones:

replicate(2, mtcars %>%
                   select(mpg))  %>%
      set_names(c('foo', 'bar')) %>%
      bind_cols

Upvotes: 0

5th

Reputation: 2375

I don't see why everybody is using dplyr for the workaround. Base R is much faster:

UPDATED: I wrote myfun6 and myfun5 in base R. The former is scalable; the latter isn't. The other four functions (myfun1 to myfun4) are the dplyr solutions. The benchmark shows the dplyr versions are slower by more than a factor of ten:

microbenchmark::microbenchmark(myfun1(),myfun2(),myfun3(),myfun4(),myfun5(),myfun6())
Unit: microseconds
     expr    min      lq      mean  median       uq     max neval
 myfun1() 5356.6 5739.90  6320.338 5967.45  6327.75 11177.7   100
 myfun2() 6208.1 6676.55  7220.770 6941.10  7172.55 10936.3   100
 myfun3() 8645.3 9299.30 10287.908 9676.30 10312.85 15837.1   100
 myfun4() 4426.1 4712.40  5405.235 4866.65  5245.20 12573.2   100
 myfun5()  168.6  250.05   292.472  270.70   303.15  2119.3   100
 myfun6()  141.7  203.15   341.079  237.00   256.45  6278.0   100

The code:

myfun6<-function(){
  n=2
  # repeat the mpg column n times
  res_l<-lapply(1:n,function(j) mtcars$mpg)
  res<-data.frame(do.call(cbind,res_l))
  rownames(res)=rownames(mtcars)
  colnames(res)=c('foo','bar')
  res  # return the data frame, not the value of the last assignment
}

myfun5<-function(){
res<-data.frame(foo=mtcars$mpg,bar=mtcars$mpg)  
}

myfun4<-function(){
  mtcars %>% 
  select(foo=mpg) %>% 
  bind_cols(bar=.$foo)
}

myfun3<-function(){
res<-map2(c('mpg', 'mpg'), c('foo', 'bar'), ~ mtcars %>% 
          select(!! .y := !! rlang::sym(.x))) %>% 
  bind_cols
}

myfun2<-function(){
  res<-transmute(mtcars, foo = mpg, bar = mpg)
}

myfun1<-function(){
  res<-mtcars %>% 
  select(foo = mpg) %>% 
  mutate(bar = foo)
}
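
For a base R version that scales to any number of aliases and also keeps the row names, one option is to index the same column repeatedly and then rename the result. This is only a sketch: the `aliases` vector is an illustrative name, it is not one of the benchmarked functions above, and it has not been timed.

```r
# Hypothetical mapping: names are the new aliases, values are the source columns.
aliases <- c(foo = "mpg", bar = "mpg")

# `[` on a data frame accepts repeated column names; the duplicated
# names are made unique, so we overwrite them with the aliases.
res <- setNames(mtcars[unname(aliases)], names(aliases))

head(res, 3)
```

Adding another alias only means adding another element to `aliases`.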

Upvotes: 0

Roman

Reputation: 17648

You can also do:

mtcars %>% 
  select(foo=mpg) %>% 
  bind_cols(bar=.$foo)

or

mtcars %>% 
  bind_cols(foo=.$mpg, bar=.$mpg) %>% 
  select(foo, bar)

Upvotes: 0

Weihuang Wong

Reputation: 13118

You could do transmute(mtcars, foo = mpg, bar = mpg) (with the caveat that this drops the row names).
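
If the row names matter, one way around that caveat is to move them into a regular column before calling transmute. A minimal sketch, assuming the tibble package is installed; the column name `car` is just an illustrative choice:

```r
library(dplyr)

res <- mtcars %>%
  tibble::rownames_to_column(var = "car") %>%  # row names become a column
  transmute(car, foo = mpg, bar = mpg)         # keep `car`, alias mpg twice

head(res, 3)
```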

Upvotes: 1

phiver

Reputation: 23598

Workaround: add a mutate() that creates bar from foo.

mtcars %>% 
  select(foo = mpg) %>% 
  mutate(bar = foo)

Upvotes: 1
