Reputation: 915
Creating a new column with mutate that is a function of a specified set of columns, computed for each row of a data frame.
This seems like it should be a simple task, but I've been struggling to find the right syntax. I'm after something like:
df <- data.frame("annotations"=c("some","information","in","columns"),
"X001"=c(124,435,324,123),
"X002"=c(486,375,156,375))
df %>% mutate(median=median(select(.,starts_with("X"))))
The result I want is the original data frame with a new column 'median' that holds, for each row, the median across all columns starting with 'X'. I think I might need a rowwise() in there somewhere.
I'm trying to fit this into a larger dplyr pipeline, so I'm looking for solutions within the 'tidyverse'.
Upvotes: 0
Views: 338
Reputation: 188
Another way, which doesn't use dplyr (data.table is loaded only for its %like% operator):
library(data.table)
# select the columns whose names start with X
df[, names(df) %like% "^X"]
# output
  X001 X002
1  124  486
2  435  375
3  324  156
4  123  375
# get the median of each row with apply (MARGIN = 1 means row-wise)
apply(df[, names(df) %like% "^X"], 1, median)
#output - median of each row
305 405 240 249
# store the result in a new column
df$median <- apply(df[, names(df) %like% "^X"], 1, median)
# output
  annotations X001 X002 median
1        some  124  486    305
2 information  435  375    405
3          in  324  156    240
4     columns  123  375    249
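If you would rather not load a package just for the column match, the same idea can be sketched in base R alone. This is a minimal sketch, not a drop-in replacement: the names x_cols and df_base are made up for illustration, and drop = FALSE is there so the subset stays a data frame even if only one column matches.

```r
# Example data from the question
df <- data.frame(
  annotations = c("some", "information", "in", "columns"),
  X001 = c(124, 435, 324, 123),
  X002 = c(486, 375, 156, 375)
)

# Logical vector marking the columns whose names start with "X"
x_cols <- startsWith(names(df), "X")

# Row-wise median over those columns, stored in a copy of the data frame
df_base <- df
df_base$median <- apply(df[, x_cols, drop = FALSE], 1, median)
```

Note that apply converts its input to a matrix first, so this only behaves sensibly when all selected columns are numeric.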
Upvotes: 0
Reputation: 28675
You can pmap over the X columns:
library(tidyverse)
df %>%
  mutate(median = pmap_dbl(select(., starts_with("X")),
                           ~ median(c(...))))
Or use apply:
df %>%
  mutate(median = apply(select(., starts_with("X")), 1, median))
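Since the question mentions rowwise(), here is a rough sketch of that route as well. It assumes dplyr 1.0.0 or later, where c_across() is available; the name result is just for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  annotations = c("some", "information", "in", "columns"),
  X001 = c(124, 435, 324, 123),
  X002 = c(486, 375, 156, 375)
)

result <- df %>%
  rowwise() %>%                                           # treat each row as its own group
  mutate(median = median(c_across(starts_with("X")))) %>% # combine the X columns of that row
  ungroup()                                               # drop the row-wise grouping again
```

The ungroup() at the end is there so later steps in a longer pipeline are not silently evaluated row by row. On large data frames the rowwise() approach can be noticeably slower than the apply version above.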
Upvotes: 1