Reputation: 555
I have a data frame df:
structure(list(X = 1:12, id = structure(c(2L, 7L, 5L, 4L, 3L,
1L, 6L, 8L, 9L, 10L, 11L, 12L), .Label = c("B12", "B7", "C2",
"C9", "D3", "E2", "E6", "R4", "T2", "T3", "T7", "U9"), class = "factor"),
age = c(42L, 45L, 83L, 59L, 49L, 46L, 52L, 23L, 24L, 25L,
26L, 27L)), class = "data.frame", row.names = c(NA, -12L))
I have split the people in the above dataframe into a list of 3 matrices called list_mat
list(Blue_Banana = structure(c("B7", "E6", "D3", "C9"), .Dim = c(2L,
2L), .Dimnames = list(NULL, c("target", "partner"))), Gold_Apple = structure(c("C2",
"B12", "E2", "R4"), .Dim = c(2L, 2L), .Dimnames = list(NULL,
c("target", "partner"))), Blue_Orange = structure(c("T2",
"T3", "T7", "U9"), .Dim = c(2L, 2L), .Dimnames = list(NULL, c("target",
"partner"))))
I would like to group the matrices based on keywords in the names of their matrix objects which can be found with
names(list_mat)
I then use functions from the igraph package to calculate in-degree:
library(igraph)
list_graph <- lapply(list_mat, graph_from_edgelist, directed = TRUE)
cent_list <- lapply(list_graph, centr_degree, mode = "in")
To get the actual in-degree scores, I'll use the first matrix inside list_mat as an example:
cent_list[[1]]$res
To get the IDs that the in-degree scores refer to, I'll again use the first matrix inside list_mat as an example:
V(list_graph[[1]])$name
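The two pieces above can be paired into a single named vector. This is only a minimal sketch: it rebuilds the first matrix (Blue_Banana) on its own so that it runs in isolation, and the names el, g and scores are made up for the illustration.

```r
library(igraph)

# First edge list from list_mat, rebuilt here so the snippet is self-contained
el <- matrix(c("B7", "E6", "D3", "C9"), ncol = 2,
             dimnames = list(NULL, c("target", "partner")))
g <- graph_from_edgelist(el, directed = TRUE)

# centr_degree()$res holds the scores; V(g)$name holds the matching IDs
scores <- setNames(centr_degree(g, mode = "in")$res, V(g)$name)
scores
```

Indexing by ID then works directly, e.g. scores[["D3"]].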
I want the in-degree scores of all the IDs in the matrices inside list_mat whose names contain the string "Blue" to be matched with their ID in the original data frame df, creating a column called "Blue" that holds the in-degree scores of the IDs in those matrices. I then want to do the same for all the matrices that have "Gold" in their name (there is only one such matrix). The final output will look something like the picture at the bottom, but the numbers may be different.
Upvotes: 1
Views: 71
Reputation: 48241
Clearly there are multiple ways to achieve this; here's one. First,
(blues <- grep("Blue", names(list_graph)))
# [1] 1 3
determines which graphs are about "Blue". Then
(db <- degree(Reduce(`+`, list_graph[blues]), mode = "in"))
# B7 D3 E6 C9 T2 T7 T3 U9
# 0 1 0 1 0 1 0 1
(do <- degree(Reduce(`+`, list_graph[-blues]), mode = "in"))
# C2 E2 B12 R4
# 0 1 0 1
are the in-degrees of the two groups. To insert these into df, we may use base R's merge, as in
merge(merge(df, data.frame(Blue = db), by.x = "id", by.y = "row.names", all.x = TRUE),
      data.frame(Gold = do), by.x = "id", by.y = "row.names", all.x = TRUE)
#     id  X age Blue Gold
# 1  B12  6  46   NA    0
# 2   B7  1  42    0   NA
# 3   C2  5  49   NA    0
# 4   C9  4  59    1   NA
# 5   D3  3  83    1   NA
# 6   E2  7  52   NA    1
# 7   E6  2  45    0   NA
# 8   R4  8  23   NA    1
# 9   T2  9  24    0   NA
# 10  T3 10  25    0   NA
# 11  T7 11  26    1   NA
# 12  U9 12  27    1   NA
which gives a result containing NAs. That may actually be preferable, because the NAs make it clear which group each row belongs to. Otherwise (for example, with zeros in place of the NAs) rows 9 and 10 would be ambiguous.
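The matching step can be seen in isolation on a toy example. This is a minimal sketch with made-up objects (people, blue_scores), not the data from the question: data.frame() turns the names of a named vector into row names, and merge() can join on them via by.y = "row.names".

```r
# Toy data: three IDs, one of which ("C2") has no Blue score
people <- data.frame(id = c("B7", "D3", "C2"), age = c(42L, 83L, 49L))
blue_scores <- c(B7 = 0, D3 = 1)   # named vector, shaped like the output of degree()

# all.x = TRUE keeps every row of people; unmatched IDs get NA
merged <- merge(people, data.frame(Blue = blue_scores),
                by.x = "id", by.y = "row.names", all.x = TRUE)
merged
# Rows come back sorted by id (B7, C2, D3); Blue is 0, NA, 1
```

Note that merge() sorts by the key by default, which is why the row order in the result above differs from the original df.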
More generally, we may do
keywords <- c("Blue", "Gold", "Red", "Purple") # Assuming all those are present
for (k in keywords) {
  idx <- grep(k, names(list_graph))
  deg <- degree(Reduce(`+`, list_graph[idx]), mode = "in")
  df <- merge(df, data.frame(deg), by.x = "id", by.y = "row.names", all.x = TRUE)
  names(df)[ncol(df)] <- k
}
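If some keywords might not match any graph, the loop needs a guard, since Reduce() over an empty list returns NULL and degree() cannot work with that. Below is one possible sketch, not the only way to do it. Two choices here are my own assumptions: a keyword with no match gets an all-NA column, and the scores are looked up by name instead of merged, which keeps the row order of df. The helper edge_mat is hypothetical and only rebuilds the question's data compactly (the X column is left out).

```r
library(igraph)

# Data from the question, rebuilt compactly so the snippet is self-contained
df <- data.frame(
  id  = c("B7", "E6", "D3", "C9", "C2", "B12", "E2", "R4", "T2", "T3", "T7", "U9"),
  age = c(42L, 45L, 83L, 59L, 49L, 46L, 52L, 23L, 24L, 25L, 26L, 27L)
)
edge_mat <- function(ids) {   # hypothetical helper: 2-column edge list
  matrix(ids, ncol = 2, dimnames = list(NULL, c("target", "partner")))
}
list_mat <- list(
  Blue_Banana = edge_mat(c("B7", "E6", "D3", "C9")),
  Gold_Apple  = edge_mat(c("C2", "B12", "E2", "R4")),
  Blue_Orange = edge_mat(c("T2", "T3", "T7", "U9"))
)
list_graph <- lapply(list_mat, graph_from_edgelist, directed = TRUE)

keywords <- c("Blue", "Gold", "Red")   # "Red" matches nothing in this data
for (k in keywords) {
  idx <- grep(k, names(list_graph), fixed = TRUE)
  if (length(idx) == 0L) {
    df[[k]] <- NA_real_                # no graph for this keyword: all-NA column
    next
  }
  deg <- degree(Reduce(`+`, list_graph[idx]), mode = "in")
  # Look up each id by name; ids absent from these graphs come back as NA
  df[[k]] <- unname(deg[as.character(df$id)])
}
df
```

Whether a missing keyword should produce an NA column, be skipped silently, or raise an error depends on how the result is used downstream.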
One part worth explaining is
Reduce(`+`, list_graph[idx])
Instead of combining the "Blue" degrees from the different graphs, I first combine the graphs themselves, as in g1 + g2 (yes, it works), where the resulting graph has the two components g1 and g2, and then compute the degrees of this super-graph. Reduce lets us add up any number of graphs in this way, i.e., it computes g1 + g2 + ... + gk for all the graphs in list_graph[idx].
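As a sanity check on the "combine first, then count" idea, the same in-degrees can be reproduced in base R without igraph. This is not the method used above, just a cross-check under one assumption: edges run from the first column to the second, so an ID's in-degree is the number of times it appears in the second column. The helpers edge_mat and in_degree_for are hypothetical names.

```r
# Edge lists as in the question (edges run from column 1 to column 2)
edge_mat <- function(ids) {
  matrix(ids, ncol = 2, dimnames = list(NULL, c("target", "partner")))
}
list_mat <- list(
  Blue_Banana = edge_mat(c("B7", "E6", "D3", "C9")),
  Gold_Apple  = edge_mat(c("C2", "B12", "E2", "R4")),
  Blue_Orange = edge_mat(c("T2", "T3", "T7", "U9"))
)

# Hypothetical helper: stack the matching edge lists, then count arrivals once
in_degree_for <- function(keyword, mats) {
  idx <- grep(keyword, names(mats), fixed = TRUE)
  el  <- do.call(rbind, unname(mats[idx]))   # combine the edge lists first
  ids <- unique(as.vector(t(el)))            # every id, in order of appearance
  counts <- tabulate(match(el[, "partner"], ids), nbins = length(ids))
  setNames(counts, ids)
}

blue_deg <- in_degree_for("Blue", list_mat)
gold_deg <- in_degree_for("Gold", list_mat)
blue_deg
# B7 D3 E6 C9 T2 T7 T3 U9
#  0  1  0  1  0  1  0  1
```

These values agree with db and do above, which is what one would expect if the summed graph simply contains all edges of its parts.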
Upvotes: 2