Reputation: 57
Let's say I have one dataframe that represents a subset brief. Let's call this dataframe brief
. Here, the values are the subset rules.
library(tidyverse)
brief <- data.frame(apple = 1, orange = 2, pear = 3)
In a second dataframe, called types
, this is the data I'd like to subset.
types <- data.frame(apple_cake = rnorm(5,0,1),
apple_pie = rnorm(5,0, 1),
apple_ice = rnorm(5,0, 1),
orange_cake = rnorm(5,0,1),
orange_pie = rnorm(5, 0, 1),
orange_ice = rnorm(5,0, 1),
pear_cake = rnorm(5,0,1),
pear_pie = rnorm(5, 0, 1),
pear_ice = rnorm(5,0, 1)
)
In my example, I'd like to subset specific apple, orange, and pear food items using dataframe one against dataframe two. I'd like to use the values from the first dataframe brief
to subset the columns from the second dataframe types
.
I would like to end up with a final dataframe as follows:
final <- types %>%
select(apple_cake, orange_pie, pear_ice)
In the final solution, apple item 1, orange item 2, and pear item 3 remain.
I've tried various dplyr functions but to no avail. I have also looked for similar solutions but do not think they address my problem because these examples might have similar column names across two dataframes.
Upvotes: 5
Views: 88
Reputation: 887501
In base R
, we split the data columns into a list
of data.frame having similar prefix by splitting on the suffix removed column names, Extract ([
) the columns from the corresponding index in brief
data with Map
and cbind
the columns in the list
with do.call
do.call(cbind, Map(`[`, split.default(types, trimws(names(types),
whitespace = "_.*")), brief))
-output
apple_cake orange_pie pear_ice
1 0.9065347 0.661994002 0.1988529
2 1.0429968 0.736754956 0.9714787
3 0.5517950 -0.003860467 0.4486231
4 -0.4304989 1.160817987 -0.3410640
5 -0.1701651 0.489398131 -1.1846220
Or another option is to loop over the data.frame brief
with imap
, select
the columns with a select-helper (starts_with
) on the names of the brief
data column (.y
) and then select
based on the value index of the brief
column (.x
)
library(dplyr)
library(purrr)
imap_dfc(brief, ~ types %>%
select(starts_with(.y)) %>%
select(all_of(.x)))
apple_cake orange_pie pear_ice
1 0.9065347 0.661994002 0.1988529
2 1.0429968 0.736754956 0.9714787
3 0.5517950 -0.003860467 0.4486231
4 -0.4304989 1.160817987 -0.3410640
5 -0.1701651 0.489398131 -1.1846220
Upvotes: 2