Reputation: 157
I'm struggling to figure out how to collapse 2 edges between the same 2 nodes into 1 and then calculate the sum of these edges.
I believe there's a way of doing it in igraph
:
simplify(gcon, edge.attr.comb = list(weight = "sum", function(x)length(x)))
but I'd like to do it with tidygraph
if possible as I've had success in implementing up to this point with tidygraph
and I'm much more familiar with the tidyverse
way of working.
My data looks like this:
from to Strength Dataframe Question Topic
1 0 32 4 weekly 1 Connection Frequency
2 0 19 5 weekly 1 Connection Frequency
3 0 8 3 weekly 1 Connection Frequency
4 0 6 5 weekly 1 Connection Frequency
5 0 2 4 weekly 1 Connection Frequency
6 0 14 5 weekly 1 Connection Frequency
With both 'from' and 'to' containing the same id's (e.g. from-to; 0-1 & 1-0). I'd like to condense so that only one iteration of the 0-1 relationship exists, with a summed Strength
calculated.
Here's my code thus far:
graph <- data %>%
filter(Dataframe == "weekly" & Question == 1) %>%
as_tbl_graph(directed = FALSE) %>%
activate(edges) %>% # first manipulate edges
filter(!edge_is_loop()) %>% # remove any loops
activate(nodes) %>% # now manipulate nodes
left_join(node.group, by = "name") %>%
mutate(
Popularity = centrality_degree(mode = 'in'),
Centre = node_is_center(),
Keyplayer = node_is_keyplayer(k = 5))
Is it possible to merge the two corresponding edges into a single edge? I've searched the forum but have only come across references where the same nodes are repeated in the same columns (i.e. 0-1's across multiple rows).
Upvotes: 4
Views: 2555
Reputation: 1474
library(tidygraph) # v1.2.0
library(dplyr) # v0.8.5
library(purrr) # v0.3.4
dat <- data.frame(
from = c("a", "a", "b", "c"),
to = c("b", "b", "a", "b"),
n = 1:4
)
Call to_simple()
within convert()
to collapse parallel edges. The corresponding edges and weights are stored in .orig_data
as a list of tibbles. Then, extract the sum of the weight of collapsed edges from .orig_data
.
dat %>%
as_tbl_graph() %>%
convert(to_simple) %>%
activate(edges) %>%
mutate(n_sum = map_dbl(.orig_data, ~ sum(.x$n)))
# A tbl_graph: 3 nodes and 3 edges
#
# A directed simple graph with 1 component
#
# Edge Data: 3 x 5 (active)
from to .tidygraph_edge_index .orig_data n_sum
<int> <int> <list> <list> <dbl>
1 1 2 <int [2]> <tibble [2 x 3]> 3
2 2 1 <int [1]> <tibble [1 x 3]> 3
3 3 2 <int [1]> <tibble [1 x 3]> 4
#
# Node Data: 3 x 2
name .tidygraph_node_index
<chr> <int>
1 a 1
2 b 2
3 c 3
Upvotes: 4
Reputation: 455
tidygraph
can simplify graphs when in a morph
ed state with the simplify_to
call, but it reverts to the original when unmorph
ing.
This is a tidy workaround:
data <- read.table(header=TRUE, text="
from to weight
0 14 5
0 1 1
1 0 1
")
original <- as_tbl_graph(data)
Input:
> original
# A tbl_graph: 3 nodes and 3 edges
#
# A directed simple graph with 1 component
#
# Node Data: 3 x 1 (active)
name
<chr>
1 0
2 1
3 14
#
# Edge Data: 3 x 3
from to weight
<int> <int> <int>
1 1 3 5
2 1 2 1
3 2 1 1
Solution:
modified <- original %>% activate(edges) %>%
# create a temporary grouping & filtering variable by sorting from/to IDs
mutate(temp = ifelse(from > to, paste0(to, from), paste0(from, to))) %>%
group_by(temp) %>%
mutate(weight = sum(weight)) %>%
ungroup() %>%
dplyr::distinct(temp, .keep_all = TRUE) %>%
select(-temp)
Output:
> modified
# A tbl_graph: 3 nodes and 2 edges
#
# A rooted tree
#
# Edge Data: 2 x 3 (active)
from to weight
<int> <int> <int>
1 1 3 5
2 1 2 2
#
# Node Data: 3 x 1
name
<chr>
1 0
2 1
3 14
Upvotes: 1
Reputation: 1954
Here's one approach. It uses tidygraph
, which uses igraph
under the hood.
library(tidygraph)
#>
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#>
#> filter
library(igraph)
#>
#> Attaching package: 'igraph'
#> The following object is masked from 'package:tidygraph':
#>
#> groups
#> The following objects are masked from 'package:stats':
#>
#> decompose, spectrum
#> The following object is masked from 'package:base':
#>
#> union
library(ggraph)
#> Loading required package: ggplot2
library(tidyverse)
g <- tibble(from = sample(letters[1:5], 25, T),
to = sample(letters[1:5], 25, T)) %>%
as_tbl_graph()
ggraph(g)+
geom_edge_parallel(arrow = arrow(type = 'closed'),
start_cap = circle(7.5, 'mm'),
end_cap = circle(7.5, 'mm'))+
geom_node_label(aes(label = name))+
labs(title = 'Multiple edges shown between node pairs')
#> Using `stress` as default layout
# Add the weigths as counts in the original dataframe
g_weights <- g %>%
activate(edges) %>%
as_tibble() %>%
mutate(link = glue::glue('{from}_{to}')) %>%
add_count(link) %>%
distinct(link, n, .keep_all = T) %>%
select(from, to, n) %>%
as_tbl_graph()
ggraph(g_weights)+
geom_edge_parallel(arrow = arrow(type = 'closed'),
start_cap = circle(7.5, 'mm'),
end_cap = circle(7.5, 'mm'),
aes(width = n))+
geom_node_label(aes(label = name))+
labs(title = 'Single edges shown between node pairs',
subtitle = 'Weights used as edge width')+
scale_edge_width(range = c(.5, 2), name = 'Weight')
#> Using `stress` as default layout
Created on 2019-09-03 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 61
I'm struggling on this problem, too. My solution so far is to collapse pairs of each nodes then sum the weights up. Something like this:
require(dplyr)
require(tidyr)
pasteCols = function(x, y, sep = ":"){
stopifnot(length(x) == length(y))
return(lapply(1:length(x), function(i){paste0(sort(c(x[i], y[i])), collapse = ":")}) %>% unlist())
}
data = data %>%
mutate(col_pairs = pasteCols(from, to, sep = ":")) %>%
group_by(col_pairs) %>% summarise(sum_weight = sum(weight)) %>%
tidyr::separate(col = col_pairs, c("from", "to"), sep = ":")
Upvotes: 1
Reputation: 864
You can collapse multiple edges in a graph g by jumping to a weighted adjacency metrix and back into an igraph graph like this:
gg <- graph.adjacency(get.adjacency(g), mode="undirected", weighted=TRUE)
Now gg
will contain the edge-attribute $weight
corresponding to the number of edges that occurred between each vertex-pair in g
.
I'm not so familiar with tidygraph, but I made this pedagogical code to ease your path.
# A graph from sample data
sample_el <- cbind(c(1,1,1,2,2,2,3,3,3,4,4,5,5,6,6,6,7,7,7,7,8,8),
c(2,2,3,6,6,4,4,6,8,5,5,6,8,7,7,2,6,8,3,6,4,4))
g <- graph_from_edgelist(sample_el, directed=F)
# Always plot graphs with this same layout
mylaoyt <- layout_(g, as_star())
plot(g, layout = mylaoyt)
# Merge all duplicate edges using a weighted adjacency matric that
# is converted back to a graph
gg <- graph.adjacency(get.adjacency(g), mode="undirected", weighted=TRUE)
# function to return a weighted edgelist from a graph
get.weighted.edgelist <- function(graph){cbind(get.edgelist(gg), E(gg)$weight)}
# compare your two edge-lists. el has duplicates, wel is weighted
el <- get.edgelist(g)
wel<- get.weighted.edgelist(gg)
el
wel
# Plot the two graphs to see what el and wel would look like:
par(mfrow=c(1,2))
plot(g, layout=mylaoyt, vertex.label=NA, vertex.size=10)
plot(gg, layout=mylaoyt, vertex.label=NA, vertex.size=10, edge.width=E(gg)$weight * 3)
The output in el
and wel
is visualise like so:
Hope that you can carve out what you'll need.
Upvotes: 1