Georg Heiler

Reputation: 17676

igraph compute metrics for each node and its network

I am curious how to compute some metrics for each node.

For each node compute percentage of fraudulent connections for

I am just getting started with igraph and am not sure how to move on to writing my own graph processing functions (i.e. not only applying degree, pagerank, ...). I am looking for suggestions on how to solve this task with only one pass over the graph.

Minimal sample is here

library(igraph)
id = c("a", "b", "c", "d", "e", "f", "g") 
name = c("Alice", "Bob", "Charlie", "David", "Esther", "Fanny", "Gaby") 
fraud = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE) 
verticeData <- data.frame(id, name, fraud) 
verticeData

src <- c("a", "b", "c", "f", "e", "e", "d", "a")
dst <- c("b", "c", "b", "c", "f", "d", "a", "e")
relationship <-c("A", "B", "B", "B", "B", "A", "A", "A")
edgeData <- data.frame(src, dst, relationship)
edgeData
g <- graph_from_data_frame(edgeData, directed = TRUE, vertices = verticeData)
plot(g, vertex.color=V(g)$fraud)
# TODO compute metrics

I do not have the privileges to migrate the question, so I am reposting it manually, following a comment on https://stats.stackexchange.com/questions/256859/igraph-compute-metrics-for-each-node-and-its-network

Upvotes: 2

Views: 702

Answers (1)

paqmo

Reputation: 3729

The gapply function from the sna package gives a lot of flexibility to calculate various ego network statistics. It functions more or less like the apply family of functions, but specifically loops over network neighborhoods. The intergraph package makes it easy to convert between igraph and sna.

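library(sna)
# Convert the igraph object into a network object that sna can work with
net <- intergraph::asNetwork(g)
both <- c(1, 2)  # named `both` rather than `c` so that base::c is not shadowed
funcs <- c(sum, mean)
for (f in funcs) {
  # MARGIN: 1 = outgoing ties, 2 = incoming ties, c(1, 2) = both
  for (margin in list(1, 2, both)) {
    print(gapply(net, margin, net %v% "fraud", f))
  }
}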

gapply is not entirely straightforward to use. The second argument ("MARGIN") indicates whether neighborhoods are taken row-wise (outgoing ties), column-wise (incoming ties), or both (i.e., undirected). The third argument is a vector of vertex statistics to summarize, and the fourth argument is the function you want to apply to them. As you can see, there is a lot of flexibility in the third and fourth arguments.

> gapply(net,c(1,2),net %v% "fraud",sum)
[1] 0 1 0 1 1 0 0
> gapply(net,c(1),net %v% "fraud",sum)
  Alice     Bob Charlie   David  Esther   Fanny    Gaby 
      0       0       0       1       0       0       0 
> gapply(net,c(2),net %v% "fraud",sum)
  Alice     Bob Charlie   David  Esther   Fanny    Gaby 
      0       1       0       0       1       0       0 
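
The counts above can be turned into the shares the question asks for by averaging the fraud flag over each neighborhood instead of summing it. As an alternative that stays within igraph, here is a minimal sketch built on `ego()`. It is not the approach used in this answer, and `fraud_share()` and `g2` are hypothetical names introduced only for illustration. It rebuilds the question's graph without the display-name column so that vertices are labelled by `id`.

```r
library(igraph)

# Same toy data as in the question (display names omitted for brevity)
id    <- c("a", "b", "c", "d", "e", "f", "g")
fraud <- c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
src   <- c("a", "b", "c", "f", "e", "e", "d", "a")
dst   <- c("b", "c", "b", "c", "f", "d", "a", "e")
g2 <- graph_from_data_frame(data.frame(src, dst), directed = TRUE,
                            vertices = data.frame(id, fraud))

# fraud_share() is a hypothetical helper, not part of igraph.
# mindist = 1 excludes each vertex from its own neighborhood.
fraud_share <- function(graph, order = 1, mode = "all") {
  hoods <- ego(graph, order = order, nodes = V(graph),
               mode = mode, mindist = 1)
  # Mean of a logical vector = share of TRUE values among the neighbors
  share <- vapply(hoods,
                  function(vs) mean(V(graph)$fraud[as.integer(vs)]),
                  numeric(1))
  names(share) <- V(graph)$name
  share
}

fraud_share(g2)                # ties in either direction
fraud_share(g2, mode = "out")  # outgoing ties only
fraud_share(g2, order = 2)     # neighbors up to two steps away
```

Vertices without any neighbors (here `g`) come back as `NaN`, because the mean of an empty vector is undefined. Replacing those with 0 or `NA` is a modelling choice left to the caller.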

Upvotes: 4
