knapply

Reputation: 667

Clarification on igraph::count_multiple()

I expect igraph::count_multiple() to count edge multiplicity, which is what the documentation seems to describe. However, it doesn't always return whole numbers.

An example:

library(igraph)
library(dplyr)

data("USairports", package = "igraphdata")

Expectation: Counting edges, while grouping on relevant vertices...

(
manualish_count <- USairports %>% 
  igraph::as_data_frame() %>% 
  add_count(from, to)
) %>% 
  select(from, to, n)

#> # A tibble: 23,473 x 3
#>    from  to        n
#>    <chr> <chr> <int>
#>  1 BGR   JFK       2
#>  2 BGR   JFK       2
#>  3 BOS   EWR      10
#>  4 ANC   JFK       1
#>  5 JFK   ANC       1
#>  6 LAS   LAX      20
#>  7 MIA   JFK      10
#>  8 EWR   ANC       1
#>  9 BJC   MIA       1
#> 10 MIA   BJC       1
#> # ... with 23,463 more rows

... results in whole numbers; manualish_count$n contains <int>egers.

Using igraph, everything seems fine at first glance...

(ig_count <- count_multiple(USairports)) %>% head(10)
#>  [1]  2  2 10  1  1 20 10  1  1  1

... but there are actually fractions:

ig_count[ig_count %% 1 != 0]
#>  [1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1.5 1.5 1.5 1.5 0.5 1.5 1.5 0.5 0.5 0.5
#> [18] 0.5 1.5 1.5 1.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
#> [35] 0.5

Am I misunderstanding count_multiple()'s purpose or using it incorrectly?

igraph_version()
#> [1] "1.2.2"

Upvotes: 3

Views: 177

Answers (1)

Esther

Reputation: 1115

It's because of how count_multiple() handles loops (self-edges): every fractional value belongs to an edge that starts and ends at the same vertex.

ig_count <- count_multiple(USairports)
x <- which(ig_count %% 1 != 0)

E(USairports)[x]

#[1] HOM->HOM FYU->FYU OME->OME ANI->ANI KLL->KLL WFB->WFB RIC->RIC DEN->DEN
#[9] BLD->BLD BLD->BLD BLD->BLD DCA->DCA DEN->DEN DEN->DEN MCI->MCI SSB->SSB
#[17] MIA->MIA KEH->KEH LKE->LKE LKE->LKE LKE->LKE LPS->LPS VGT->VGT DET->DET
#[25] CID->CID CLE->CLE JFK->JFK LGA->LGA MKE->MKE ORD->ORD PHL->PHL GRR->GRR
#[33] MEM->MEM JNU->JNU MSP->MSP

The underlying C routine, igraph_count_multiple, explicitly divides the edge count by 2 for loops:

 /* for loop edges, divide the result by two */
    if (to == from) VECTOR(*res)[i] /= 2;
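
To check this on something smaller than USairports, here is a minimal sketch on a made-up toy graph (the edge list is hypothetical, not taken from the data above). It flags loops with which_loop() and tests whether every non-integer multiplicity sits on a loop. The values count_multiple() returns for loops may differ between igraph versions, so no expected output is shown for them.

```r
library(igraph)

# Toy directed graph (hypothetical data):
# 1->2 twice, one loop on vertex 2, two loops on vertex 3
g <- make_graph(c(1, 2,  1, 2,  2, 2,  3, 3,  3, 3), directed = TRUE)

is_loop <- which_loop(g)      # TRUE where an edge starts and ends at the same vertex
cm      <- count_multiple(g)  # multiplicity as reported by igraph

# Flag non-integer multiplicities and line them up with the loop indicator
fractional <- cm %% 1 != 0
print(data.frame(edge = as_ids(E(g)), is_loop, cm, fractional))

# TRUE if every fractional value (if there are any) belongs to a loop
all(is_loop[fractional])
```

If the final line returns TRUE while some entries are fractional, the fractions come from loop handling rather than from the ordinary multi-edges.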

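You can avoid this by counting multiples only over non-loop edges. Note that simplify() with remove.loops = TRUE drops the loop edges, so the result has fewer entries than the original graph has edges: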

ig_count2 <- count_multiple(simplify(USairports, remove.multiple = FALSE, remove.loops=TRUE)) 

ig_count2[1:10]
#[1]  2  2 10  1  1 20 10  1  1  1

which(ig_count2 %% 1 != 0)
#integer(0)
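
If the loops need to stay in the graph and the counts must stay aligned with the original edge order, one alternative is to count (from, to) pairs directly from the edge list, much like the dplyr version in the question. This is only a sketch on a hypothetical toy graph, and it assumes a directed graph; for an undirected one the two endpoints would have to be sorted within each row first.

```r
library(igraph)

# Toy directed graph (hypothetical data):
# 1->2, 1->2, 2->2 (loop), 3->3, 3->3 (two loops), 2->1
g <- make_graph(c(1, 2,  1, 2,  2, 2,  3, 3,  3, 3,  2, 1), directed = TRUE)

# One row per edge, in edge order; columns are the tail and head vertex ids
el <- as_edgelist(g, names = FALSE)

# Build one key per (from, to) pair and count how often each key occurs
key  <- paste(el[, 1], el[, 2])
mult <- as.integer(ave(seq_along(key), key, FUN = length))

mult
#> [1] 2 2 1 2 2 1
```

Here loops are counted like any other edge, so the result is always an integer vector with one entry per edge of g.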

Upvotes: 2
