Reputation: 864
Let's say I have this list of data frames:
DF1_A <- data.frame(first_column = c("A", "B", "C"),
                    second_column = c(5, 5, 5),
                    third_column = c(1, 1, 1))
DF1_B <- data.frame(first_column = c("A", "B", "E"),
                    second_column = c(1, 1, 5),
                    third_column = c(1, 1, 1))
DF2_A <- data.frame(first_column = c("E", "F", "G"),
                    second_column = c(1, 1, 5),
                    third_column = c(1, 1, 1))
DF2_B <- data.frame(first_column = c("K", "L", "B"),
                    second_column = c(1, 1, 5),
                    third_column = c(1, 1, 1))
mylist <- list(DF1_A, DF1_B, DF2_A, DF2_B)
names(mylist) <- c("DF1_A", "DF1_B", "DF2_A", "DF2_B")
# make sure first_column is character in every data frame
mylist <- lapply(mylist, function(x) {
  x[, "first_column"] <- as.character(x[, "first_column"])
  x
})
I want to bind them by name (all DF1 together, all DF2 together, etc.) or, put another way, two by two in the order of this named list. Keeping the named-list structure is important so I can keep track of the result (for example, DF1_A and DF1_B become DF1, or something similar, in names(mylist)).
Some rows have duplicated values, and I want to keep them (which means some values in first_column, such as A, will appear more than once).
I have looked for clues here on Stack Overflow, but most people want to bind data frames irrespective of their names or order.
Final result would look something like this:
mylist
  DF1
  DF2

DF1
first_column second_column third_column
A            1             1
A            5             1
B            1             1
B            5             1
C            5             1
E            5             1
Upvotes: 2
Views: 70
Here is one of the many obligatory tidyverse solutions.
library(dplyr)   # provides bind_rows()
library(purrr)
library(stringr)
# find the unique DF name prefixes (the part before the first "_")
unique_df <- set_names(unique(str_split_fixed(names(mylist), "_", 2)[, 1]))
# for each unique prefix, keep the matching list elements and bind their rows
purrr::map(unique_df, ~ keep(mylist, str_starts(names(mylist), .x))) %>%
  map(bind_rows)
Also, for things like this, bind_rows() from dplyr has a .id argument, which adds a column holding the list element name while stacking the rows. That can be a helpful alternative: you can bind, manipulate the name however you'd like, and then split().
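As a rough sketch of that bind, rename, split idea (not necessarily what the answer's author had in mind): the column names `source` and `group` below are hypothetical helpers chosen for illustration, and `mylist` is rebuilt in compact form so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly so the sketch is self-contained
mk <- function(first, second) {
  data.frame(first_column = first, second_column = second, third_column = c(1, 1, 1))
}
mylist <- list(
  DF1_A = mk(c("A", "B", "C"), c(5, 5, 5)),
  DF1_B = mk(c("A", "B", "E"), c(1, 1, 5)),
  DF2_A = mk(c("E", "F", "G"), c(1, 1, 5)),
  DF2_B = mk(c("K", "L", "B"), c(1, 1, 5))
)

# Stack all rows; .id stores each row's list element name in "source" (hypothetical name)
stacked <- bind_rows(mylist, .id = "source")

# Derive the group from the name: "DF1_A" -> "DF1" ("group" is also a hypothetical name)
stacked$group <- sub("_.*$", "", stacked$source)

# Split back into a named list and drop the two helper columns
result <- lapply(
  split(stacked, stacked$group),
  function(d) d[, setdiff(names(d), c("source", "group")), drop = FALSE]
)
```

Duplicated values are kept because nothing is merged or de-duplicated; the rows stay in their original order within each group, so sort afterwards if the ordering in the question's expected output matters.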
Upvotes: 0
Reputation: 102181
Do you mean something like this?
lapply(
split(mylist, gsub("_.*", "", names(mylist))),
function(v) `row.names<-`((out <- do.call(rbind, v))[do.call(order, out), ], NULL)
)
which gives
$DF1
first_column second_column third_column
1 A 1 1
2 A 5 1
3 B 1 1
4 B 5 1
5 C 5 1
6 E 5 1
$DF2
first_column second_column third_column
1 B 5 1
2 E 1 1
3 F 1 1
4 G 5 1
5 K 1 1
6 L 1 1
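For readability, the one-liner above can be unpacked into named steps. This is only a sketch of the same idea in base R; `bind_sorted` and `mk` are hypothetical helper names, and `mylist` is rebuilt compactly so the snippet runs on its own.

```r
# Same data as in the question, rebuilt compactly so the sketch is self-contained
mk <- function(first, second) {
  data.frame(first_column = first, second_column = second, third_column = c(1, 1, 1))
}
mylist <- list(
  DF1_A = mk(c("A", "B", "C"), c(5, 5, 5)),
  DF1_B = mk(c("A", "B", "E"), c(1, 1, 5)),
  DF2_A = mk(c("E", "F", "G"), c(1, 1, 5)),
  DF2_B = mk(c("K", "L", "B"), c(1, 1, 5))
)

# Group the list elements by the part of the name before the first "_"
groups <- split(mylist, sub("_.*$", "", names(mylist)))

# Hypothetical helper: stack a group's data frames, sort by all columns, reset row names
bind_sorted <- function(dfs) {
  out <- do.call(rbind, dfs)
  # unname() so column names are never mistaken for arguments of order()
  out <- out[do.call(order, unname(as.list(out))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

result <- lapply(groups, bind_sorted)
```

Sorting by every column in turn is what places A 1 1 ahead of A 5 1; if only first_column should drive the order, `order(out$first_column)` would be enough.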
Upvotes: 3
Reputation: 76495
Here is a solution with Map, but it only works for two suffixes. If you want to merge, use the first Map instruction; if you want to keep duplicates, use the second one, the rbind solution.
sp <- split(mylist, sub("^DF.*_", "", names(mylist)))
res1 <- Map(function(x, y)merge(x, y, all = TRUE), sp[["A"]], sp[["B"]])
res2 <- Map(function(x, y)rbind(x, y), sp[["A"]], sp[["B"]])
names(res1) <- sub("_.*$", "", names(res1))
names(res2) <- sub("_.*$", "", names(res2))
Upvotes: 1