Reputation: 19
I have a list of data frames in which each row carries a unique number as its row name. Every data frame should have the same number of rows, with the same row names. However, some data frames are missing rows. I want to find the data frame that contains all row names, then add the missing rows to the other data frames by row name. It is fine if these appended rows contain NA, so long as they have the right row name. An example of the desired final output is at the bottom of this post.
Below is an example of the list of data frames, with missing rows in my_data2 and my_data3:
my_data <- data.frame(
M1 = c(148.2, 149.5, 148.4, 154.5, 151.1),
M2 = c(148.4, 150.1, 154.2, NA, 156.9),
M3 = c(155.6, 150.1, NA, NA, 157.1),
M4 = c(155.7, 153.9, NA, NA, NA)
)
my_data2 <- data.frame(
M1 = c(148.2, 149.5, 148.4, 154.5),
M2 = c(148.4, 150.1, 154.2, NA),
M3 = c(155.6, 150.1, NA, NA),
M4 = c(155.7, 153.9, NA, NA)
)
my_data3 <- data.frame(
M1 = c(148.2, 149.5, 157.1),
M2 = c(148.4, 150.1, 156.9),
M3 = c(155.6, 150.1, 157.1),
M4 = c(155.7, 153.9, NA)
)
Rownames1 = c("1", "2", "3", "4", "5")
Rownames2 = c("1", "2", "3", "4")
Rownames3 = c("1", "2", "5") # Skipping to #5 was intentional here
rownames(my_data) <- Rownames1
rownames(my_data2) <- Rownames2
rownames(my_data3) <- Rownames3
my_list <- list(my_data,my_data2,my_data3)
I am unsure how to add the missing rows, with their row names, to the data frames. Ideally, this operation would be performed over the whole list of data frames. As long as the original row names in each data frame are not changed, the order of the newly added rows does not matter. An example of the final output is below:
print(my_list[[1]]) # Contains all row names / numbers
# M1 M2 M3 M4
# 1 148.2 148.4 155.6 155.7
# 2 149.5 150.1 150.1 153.9
# 3 148.4 154.2 NA NA
# 4 154.5 NA NA NA
# 5 151.1 156.9 157.1 NA
print(my_list[[2]]) # Initially missing row name / number 5
# M1 M2 M3 M4
# 1 148.2 148.4 155.6 155.7
# 2 149.5 150.1 150.1 153.9
# 3 148.4 154.2 NA NA
# 4 154.5 NA NA NA
# 5 NA NA NA NA
print(my_list[[3]]) # Initially missing row names / numbers 3 and 4
# M1 M2 M3 M4
# 1 148.2 148.4 155.6 155.7
# 2 149.5 150.1 150.1 153.9
# 5 157.1 156.9 157.1 NA
# 3 NA NA NA NA
# 4 NA NA NA NA
Upvotes: 0
Views: 400
Reputation: 389175
You can use Reduce with union to collect the unique row names across all data frames in the list. Then use setdiff to find the row names missing from each data frame, and add those rows filled with NA.
all_rownames <- Reduce(union, lapply(my_list, rownames))
lapply(my_list, function(x) {x[setdiff(all_rownames, rownames(x)), ] <- NA;x})
#[[1]]
# M1 M2 M3 M4
#1 148.2 148.4 155.6 155.7
#2 149.5 150.1 150.1 153.9
#3 148.4 154.2 NA NA
#4 154.5 NA NA NA
#5 151.1 156.9 157.1 NA
#[[2]]
# M1 M2 M3 M4
#1 148.2 148.4 155.6 155.7
#2 149.5 150.1 150.1 153.9
#3 148.4 154.2 NA NA
#4 154.5 NA NA NA
#5 NA NA NA NA
#[[3]]
# M1 M2 M3 M4
#1 148.2 148.4 155.6 155.7
#2 149.5 150.1 150.1 153.9
#5 157.1 156.9 157.1 NA
#3 NA NA NA NA
#4 NA NA NA NA
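If you would rather have every data frame come back with its rows in the same order, one possible variant is to index each data frame by position with match() and then restore the row names. This is only a sketch: fill_rows is a hypothetical helper name, and the small list below is a cut-down stand-in for the data in the question. It places the added rows in the order of the union rather than appending them at the end.

```r
# Cut-down stand-in for the list in the question (assumed data).
my_list <- list(
  data.frame(M1 = c(148.2, 149.5, 148.4), row.names = c("1", "2", "3")),
  data.frame(M1 = c(148.2, 148.4), row.names = c("1", "3"))
)

# Every row name that appears in at least one data frame.
all_rownames <- Reduce(union, lapply(my_list, rownames))

# Hypothetical helper: reorder x to follow `ids`, inserting NA rows
# for any id that x does not contain.
fill_rows <- function(x, ids) {
  # match() returns NA for absent ids; indexing by NA yields an all-NA row.
  out <- x[match(ids, rownames(x)), , drop = FALSE]
  # The NA rows get placeholder names, so restore the intended names.
  rownames(out) <- ids
  out
}

filled <- lapply(my_list, fill_rows, ids = all_rownames)
```

Here match() compares the row names exactly, and drop = FALSE keeps the result a data frame even when there is only one column.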
Upvotes: 1