Merging list of dataframes by name of list object

Question

I have a data frame, df:

structure(list(X = 1:12, id = structure(c(2L, 7L, 5L, 4L, 3L, 
1L, 6L, 8L, 9L, 10L, 11L, 12L), .Label = c("B12", "B7", "C2", 
"C9", "D3", "E2", "E6", "R4", "T2", "T3", "T7", "U9"), class = "factor"), 
    age = c(42L, 45L, 83L, 59L, 49L, 46L, 52L, 23L, 24L, 25L, 
    26L, 27L)), class = "data.frame", row.names = c(NA, -12L))

I have split the people in the above data frame into a list of 3 matrices called list_mat:

list(Blue_Banana = structure(c("B7", "E6", "D3", "C9"), .Dim = c(2L, 
2L), .Dimnames = list(NULL, c("target", "partner"))), Gold_Apple = structure(c("C2", 
"B12", "E2", "R4"), .Dim = c(2L, 2L), .Dimnames = list(NULL, 
    c("target", "partner"))), Blue_Orange = structure(c("T2", 
"T3", "T7", "U9"), .Dim = c(2L, 2L), .Dimnames = list(NULL, c("target", 
"partner"))))

I would like to group the matrices based on keywords in the names of their matrix objects which can be found with

names(list_mat)

I then use a function from the igraph package to calculate in-degree

library(igraph)
list_graph <- lapply(list_mat, graph_from_edgelist, directed = TRUE)
cent_list <- lapply(list_graph, centr_degree, mode = "in")

To get the actual in-degree scores, I'll use the first matrix inside list_mat as an example:

cent_list[[1]]$res

To get the IDs that the in-degree scores refer to, I'll again use the first matrix inside list_mat as an example:

V(list_graph[[1]])$name

For every matrix in list_mat whose name contains the string "Blue", I want to match the in-degree scores to the corresponding IDs in the original data frame df. This should create a column called "Blue" that holds the in-degree scores of the IDs in those matrices. I then want to do the same for the matrices with "Gold" in their name (there is only one). The final output will look something like the picture at the bottom, but the numbers may be different.

Julius Vainora · Accepted Answer

Clearly there are multiple ways to achieve this; here's one. First,

(blues <- grep("Blue", names(list_graph)))
# [1] 1 3

determines which graphs are about "Blue". Then

(db <- degree(Reduce(`+`, list_graph[blues]), mode = "in"))
# B7 D3 E6 C9 T2 T7 T3 U9 
#  0  1  0  1  0  1  0  1 
(do <- degree(Reduce(`+`, list_graph[-blues]), mode = "in"))
#  C2  E2 B12  R4 
#   0   1   0   1

are the in-degrees of the two groups. To insert this into df we may use base R's merge as in

merge(merge(df, data.frame(Blue = db), by.x = "id", by.y = "row.names", all.x = TRUE),
      data.frame(Gold = do), by.x = "id", by.y = "row.names", all.x = TRUE)
#     id  X age Blue   Gold
# 1  B12  6  46   NA      0
# 2   B7  1  42    0     NA
# 3   C2  5  49   NA      0
# 4   C9  4  59    1     NA
# 5   D3  3  83    1     NA
# 6   E2  7  52   NA      1
# 7   E6  2  45    0     NA
# 8   R4  8  23   NA      1
# 9   T2  9  24    0     NA
# 10  T3 10  25    0     NA
# 11  T7 11  26    1     NA
# 12  U9 12  27    1     NA

which gives a result with NAs. That may actually be preferable, because it makes clear which group each row belongs to. If the NAs were replaced by 0, rows 9 and 10, for example, would show 0 in both columns and their group would no longer be identifiable.
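To see the left join on a small scale, here is a minimal sketch using a hypothetical three-row data frame (df_small) and a hand-written in-degree vector (db_small); both are made up for illustration and are not part of the original data. It shows the NA that all.x = TRUE leaves for an ID outside the group, and one possible way to fill it with 0 when the group distinction is not needed.

```r
# Hypothetical miniature of df: C2 is not part of the "Blue" group.
df_small <- data.frame(id = c("B7", "D3", "C2"), age = c(42L, 83L, 49L),
                       stringsAsFactors = FALSE)

# Hand-written stand-in for degree(..., mode = "in") on the Blue graphs.
db_small <- c(B7 = 0, D3 = 1)

# Left join of df_small$id against the row names of the one-column data frame.
# all.x = TRUE keeps C2 and leaves its Blue value as NA.
out <- merge(df_small, data.frame(Blue = db_small),
             by.x = "id", by.y = "row.names", all.x = TRUE)

# Optional: collapse NA to 0, at the cost of losing group membership.
out$Blue_filled <- ifelse(is.na(out$Blue), 0, out$Blue)
print(out)
```

Keeping both columns side by side makes it easy to check which zeros are real in-degrees and which were filled in.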

More generally, we may do

keywords <- c("Blue", "Gold", "Red", "Purple") # Assuming all those are present
for(k in keywords) {
  idx <- grep(k, names(list_graph))
  deg <- degree(Reduce(`+`, list_graph[idx]), mode = "in")
  df <- merge(df, data.frame(deg), by.x = "id", by.y = "row.names", all.x = TRUE)
  names(df)[ncol(df)] <- k
}
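The loop assumes that every keyword matches at least one graph, as its comment notes. A guarded variant might look like the sketch below. Note that it swaps in a different technique: instead of igraph, it counts how often each ID appears in the second column of the edge lists, which should equal the in-degree as long as no edge is repeated across matrices. The helper name in_degree_for is hypothetical, and the "Red" keyword is included only to show the no-match case.

```r
# Data from the question, rebuilt with plain constructors.
edge_cols <- list(NULL, c("target", "partner"))
list_mat <- list(
  Blue_Banana = matrix(c("B7", "E6", "D3", "C9"), ncol = 2, dimnames = edge_cols),
  Gold_Apple  = matrix(c("C2", "B12", "E2", "R4"), ncol = 2, dimnames = edge_cols),
  Blue_Orange = matrix(c("T2", "T3", "T7", "U9"), ncol = 2, dimnames = edge_cols)
)
df <- data.frame(
  X   = 1:12,
  id  = c("B7", "E6", "D3", "C9", "C2", "B12", "E2", "R4", "T2", "T3", "T7", "U9"),
  age = c(42L, 45L, 83L, 59L, 49L, 46L, 52L, 23L, 24L, 25L, 26L, 27L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: in-degree per ID for all matrices matching `keyword`.
# Returns NULL when no matrix name contains the keyword.
in_degree_for <- function(mats, keyword) {
  idx <- grep(keyword, names(mats), fixed = TRUE)
  if (length(idx) == 0) return(NULL)
  edges <- do.call(rbind, unname(mats[idx]))   # stack the matching edge lists
  nodes <- unique(as.vector(edges))            # every ID in the group
  counts <- table(factor(edges[, 2], levels = nodes))  # receivers, zeros kept
  setNames(as.integer(counts), names(counts))
}

for (k in c("Blue", "Gold", "Red")) {
  deg <- in_degree_for(list_mat, k)
  # Look up each id by name; IDs outside the group come back as NA.
  df[[k]] <- if (is.null(deg)) NA_integer_ else unname(deg[df$id])
}
print(df)
```

Indexing by name instead of calling merge keeps the original row order of df, which may or may not matter for your use case.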

One part worth explaining is

Reduce(`+`, list_graph[idx])

Instead of combining the "Blue" degrees from different graphs, I first combine the graphs themselves, as in g1 + g2 (yes, it works). Since the graphs here share no vertices, the resulting graph has two components, g1 and g2, and I then compute the degrees of this super-graph. Reduce lets us add up any number of graphs this way, i.e., it computes g1 + g2 + ... + gk for all the graphs in list_graph[idx].
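A small sketch of that folding step, using three made-up one-edge graphs (g1, g2, g3 and their vertex names are hypothetical). As far as I understand igraph's `+` for two named graphs, it takes a union by vertex name, so IDs shared between graphs would be merged rather than duplicated; treat that as an assumption to verify against the igraph documentation.

```r
library(igraph)

# Three hypothetical one-edge directed graphs with distinct vertex names.
g1 <- graph_from_edgelist(matrix(c("a", "b"), ncol = 2), directed = TRUE)
g2 <- graph_from_edgelist(matrix(c("c", "d"), ncol = 2), directed = TRUE)
g3 <- graph_from_edgelist(matrix(c("e", "f"), ncol = 2), directed = TRUE)

# Reduce folds the list from the left: (g1 + g2) + g3.
combined <- Reduce(`+`, list(g1, g2, g3))

# In-degree of every vertex in the combined graph, named by vertex.
deg <- degree(combined, mode = "in")
print(deg)
```

Each receiving vertex (b, d, f) ends up with in-degree 1 and each sending vertex with 0, the same pattern as in the "Blue" output earlier.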
