Reputation: 2252
How can I use the pipe operator to pipe into a replacement function like colnames()<-?
Here's what I'm trying to do:
library(dplyr)
averages_df <-
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp))
colnames(averages_df) <- c("cyl", "disp_mean", "hp_mean")
averages_df
# Source: local data frame [3 x 3]
#
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
But ideally it would be something like:
averages_df <-
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  add_colnames(c("cyl", "disp_mean", "hp_mean"))
Is there a way to do this without writing a specialty function each time?
The answers here are a start, but not exactly my question: Chaining arithmetic operators in dplyr
Upvotes: 102
Views: 85111
Reputation: 67828
You could use `colnames<-` or setNames (thanks to @David Arenburg):
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
  # or
  # `names<-`(c("cyl", "disp_mean", "hp_mean"))
  # setNames(c("cyl", "disp_mean", "hp_mean"))
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
Or pick an alias (set_colnames) from magrittr:
library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))
dplyr::rename may be more convenient if you are only (re)naming a few out of many columns (it requires writing both the old and the new name; see @Richard Scriven's answer).
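The backtick-quoted calls above work because a replacement function such as `colnames<-` is an ordinary function: it takes the object as its first argument and returns a modified copy, which is the shape the pipe expects. A minimal base R sketch, using a small made-up data frame (df and its column names are illustrative, not from the question):

```r
# Hypothetical toy data frame, used only for illustration
df <- data.frame(a = 1:2, b = 3:4)

# `colnames<-` is a regular function: object first, new names second
renamed <- `colnames<-`(df, c("x", "y"))

# setNames() from the stats package does the same job for a data frame
renamed2 <- setNames(df, c("x", "y"))

names(renamed)   # "x" "y"
names(df)        # the original is untouched: "a" "b"
```

Because the object comes first, the same call can sit on the right-hand side of %>% with the first argument left out.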
Upvotes: 156
Reputation: 47350
This would also work:
# Look up the replacement form of a function, e.g. colnames -> `colnames<-`
set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}
library(dplyr, warn.conflicts = FALSE)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set(colnames)(c("cyl", "disp_mean", "hp_mean"))
#> # A tibble: 3 × 3
#>     cyl disp_mean hp_mean
#>   <dbl>     <dbl>   <dbl>
#> 1     4      105.    82.6
#> 2     6      183.   122.
#> 3     8      353.   209.
Created on 2022-11-23 with reprex v2.0.2
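To see what the helper returns, the sketch below repeats its definition and checks the result in base R, without dplyr. The toy data frame df is made up for illustration.

```r
# Same helper as in the answer: build the name "<fun><-" and look it up
set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}

# set(colnames) resolves to the base replacement function `colnames<-`
identical(set(colnames), `colnames<-`)   # TRUE

# Hypothetical toy data frame
df <- data.frame(a = 1:2, b = 3:4)
out <- set(colnames)(df, c("x", "y"))
names(out)                               # "x" "y"
```

Note that defining a top-level function called set may mask functions of the same name from other packages, so a more specific name could be safer in real code.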
Upvotes: 1
Reputation: 8823
We can add a suffix to the summarised variables through the .funs argument of dplyr's summarise_at(): naming the function (here mean = "mean") appends that name to each output column.
library(dplyr)
# summarise_at with dplyr
mtcars %>%
  group_by(cyl) %>%
  summarise_at(
    .vars = c("disp", "hp"),
    .funs = c(mean = "mean")
  )
# A tibble: 3 × 3
#     cyl disp_mean   hp_mean
#   <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
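summarise_at() is marked as superseded in recent dplyr releases in favour of across(). A rough equivalent, assuming dplyr 1.0.0 or later is installed, might look like the sketch below; the .names glue pattern is what adds the suffix, and res is a name introduced here for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Assumes dplyr >= 1.0.0, where across() is available
res <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(disp, hp), mean, .names = "{.col}_mean"))

names(res)   # "cyl" "disp_mean" "hp_mean"
```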
Also, we can set column names in several ways.
# set_names with magrittr
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  magrittr::set_names(c("cyl", "disp_mean", "hp_mean"))
# set_names with purrr
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  purrr::set_names(c("cyl", "disp_mean", "hp_mean"))
# setNames with stats
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  stats::setNames(c("cyl", "disp_mean", "hp_mean"))
# A tibble: 3 × 3
#     cyl disp_mean   hp_mean
#   <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
Upvotes: 13
Reputation: 99381
In dplyr, there are a couple of different ways to rename the columns.
One is to use the rename() function. In this example you'd need to back-tick the names created by summarise(), since they are expressions rather than syntactic names.
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
You could also use select(). This is a bit easier because we can use the column number, eliminating the need to mess around with back-ticks.
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  select(1, disp_mean = 2, hp_mean = 3)
But for this example, the best way would be to do what @thelatemail mentioned in the comments, and that is to go back one step and name the columns in summarise().
group_by(mtcars, cyl) %>%
  summarise(disp_mean = mean(disp), hp_mean = mean(hp))
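As a quick check, the three approaches in this answer can be run side by side to confirm they end up with the same column names. The intermediate object names (summ, via_rename, via_select, via_summarise) are introduced here for illustration only.

```r
library(dplyr, warn.conflicts = FALSE)

# Unnamed summary: columns come out as cyl, `mean(disp)`, `mean(hp)`
summ <- group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp))

# 1. rename(): new_name = old_name, with back-ticks for the expression names
via_rename <- rename(summ, disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)

# 2. select(): pick columns by position and rename them in the same call
via_select <- select(summ, 1, disp_mean = 2, hp_mean = 3)

# 3. Name the columns directly in summarise()
via_summarise <- group_by(mtcars, cyl) %>%
  summarise(disp_mean = mean(disp), hp_mean = mean(hp))

names(via_rename)      # "cyl" "disp_mean" "hp_mean"
names(via_select)      # "cyl" "disp_mean" "hp_mean"
names(via_summarise)   # "cyl" "disp_mean" "hp_mean"
```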
Upvotes: 29