ABuist
ABuist

Reputation: 31

Possibility of speeding up transitive closure list updater

I have created the following function:

update_list3 = function(identity_dict){
    suppressMessages(suppressWarnings({
        library(ggm)
        #adjacency matrix
        g = matrix(unlist(lapply(identity_dict, FUN = function(x){
            b = rep(0, length(identity_dict))
            b[x] = 1
            b})),
            nrow = length(identity_dict), byrow = T)
        #map transitive closure
        closure = transClos(g)
        #turn back to adjacency list
        out_list = lapply(as.list(data.frame(t(closure))), FUN = function(x){which(x == 1)})
        #tranclos removes self-connection, so bind self back to own identity
        out_list = mapply(c, as.list(c(1:length(out_list))), out_list, SIMPLIFY = F)
        lapply(out_list, FUN = sort)
    }))
}

To update a list, identity_dict, of the form:

1: {1,3,4}
2: {2,5}
3: {3,4}
4: {4}
5: {5}

to provide the transitive closure of the undirected graph, with the form:

1: {1,3,4}
2: {2,5}
3: {1,3,4}
4: {1,3,4}
5: {2,5}

This works for small problems, but larger problems begin to stall quite quickly. Would there be a possibility of performing the same calculation by different means - perhaps the matrix transformation is limiting due to size?

An ideal solution would not use iGraph, as we have had problems with this package returning errors for large problems.

Inputs work as intended, but speed rapidly decreases with large problems.

Does not scale to larger problems in good time.

Upvotes: 1

Views: 73

Answers (1)

ThomasIsCoding
ThomasIsCoding

Reputation: 102549

  • If you would like to avoid using igraph, you can use the following code
dict <- identity_dict
out <- list()
repeat {
    if (length(dict) == 0) {
        break
    }
    d <- as.character(dict[[1]])
    repeat {
        d2 <- dict[as.character(unique(unlist(d)))]
        v <- as.character(unique(unlist(d2)))
        if (length(setdiff(v, d)) == 0) {
            dict <- dict[!names(dict) %in% v]
            out[v] <- rep(list(as.integer(v)), length(v))
            break
        } else {
            d <- v
        }
    }
}
identity_dict <- out[names(identity_dict)]

which finally gives

> identity_dict
$`1`
[1] 1 3 4

$`2`
[1] 2 5

$`3`
[1] 1 3 4

$`4`
[1] 1 3 4

$`5`
[1] 2 5

  • If igraph works for you (I have no clue what kind of errors you have encountered with igraph), I think components and membership is the thing you may need, e.g.,
# group info
grp <- stack(identity_dict) %>%
    graph_from_data_frame() %>%
    components() %>%
    membership()

# update `identity_dict`
identity_dict[] <- ave(
    as.integer(names(grp)),
    grp,
    FUN = list
)[match(names(identity_dict), names(grp))]

such that

> identity_dict
$`1`
[1] 1 3 4

$`2`
[1] 2 5

$`3`
[1] 1 3 4

$`4`
[1] 1 3 4

$`5`
[1] 2 5

Data

identity_dict <- list(
    `1` = c(1, 3, 4),
    `2` = c(2, 5),
    `3` = c(3, 4),
    `4` = 4,
    `5` = 5
)

and

> identity_dict
$`1`
[1] 1 3 4

$`2`
[1] 2 5

$`3`
[1] 1 3 4

$`4`
[1] 1 3 4

$`5`
[1] 2 5

Upvotes: 1

Related Questions