Reputation: 63
Imagine you have this data frame:
x <- c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4")
y <- c("red", "yellow", "blue", "green", "black", "pink", "purple",
"orange")
df <- data.frame(x, y, stringsAsFactors = FALSE)
I cannot think of a way, preferably using dplyr, to extract the y column after grouping the data frame. Essentially I'd like to know what colors are in a1, in a2, in a3, and in a4, and store those results as separate vectors, ideally in a list.
I could do
colors.in.a1 <- df %>% filter(x == "a1") %>% pull(y)
for each of a1, a2, a3, a4, but that would take a while with my real data. I was hoping that pull() would respect the grouping the way tally() does, perhaps returning a list of vectors named after the grouping variable, but it doesn't.
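To make that concrete, here is a minimal sketch of what I see, assuming dplyr is attached and using the same data as above (`flat` is just an illustrative name). As far as I can tell, pull() ignores the grouping and hands back the whole column as one vector:

```r
library(dplyr)

x <- c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4")
y <- c("red", "yellow", "blue", "green", "black", "pink", "purple", "orange")
df <- data.frame(x, y, stringsAsFactors = FALSE)

# pull() on a grouped data frame still returns the full y column,
# not one vector per group
flat <- df %>% group_by(x) %>% pull(y)

is.list(flat)  # FALSE: a plain character vector
length(flat)   # 8: one entry per row, not 4 per-group vectors
```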
Upvotes: 2
Views: 1485
Reputation: 18681
With Base R only (thanks to @thelatemail's comment):
split(df$y, df$x)
Or we can use nest():
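library(tidyverse)

df %>%
  group_by(x) %>%
  nest() %>%                             # one row per group; `data` is a list-column
  mutate(data = map(data, pull, y)) %>%  # reduce each nested tibble to its y vector
  ungroup() %>%
  deframe()                              # name the list by x instead of relying on a global `x`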
Result:
$a1
[1] "red" "black"
$a2
[1] "yellow" "pink"
$a3
[1] "blue" "purple"
$a4
[1] "green" "orange"
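As a rough usage sketch (the name `colors` is introduced here only for illustration), the named list can then be indexed by group. This uses the base R version, but the nested version should give a list of the same shape:

```r
df <- data.frame(
  x = c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4"),
  y = c("red", "yellow", "blue", "green", "black", "pink", "purple", "orange"),
  stringsAsFactors = FALSE
)

# Named list with one character vector per distinct value of x
colors <- split(df$y, df$x)

colors$a1        # "red" "black"
colors[["a3"]]   # "blue" "purple"
lengths(colors)  # number of colors in each group
```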
Upvotes: 2
Reputation: 4768
Another solution using dplyr and purrr:
library(dplyr)
library(purrr)

df %>%
  split(.$x) %>%  # list of data frames, one per value of x
  map(pull, y)    # keep only the y column of each piece
Result:
$a1
[1] "red" "black"
$a2
[1] "yellow" "pink"
$a3
[1] "blue" "purple"
$a4
[1] "green" "orange"
Data:
df <- structure(list(x = c("a1", "a2", "a3", "a4", "a1", "a2", "a3",
"a4"), y = c("red", "yellow", "blue", "green", "black", "pink",
"purple", "orange")), class = "data.frame", row.names = c(NA,
-8L))
Upvotes: 2