Kris_Lon

Reputation: 11

AWS Neptune - Betweenness Centrality computation

I'm struggling to compute betweenness centrality on a graph in AWS Neptune using Gremlin. I am using the Gremlin recipe:

g.V().as("v").
           repeat(both().simplePath().as("v")).emit(). 
           filter(project("x","y","z").by(select(first, "v")).
                                       by(select(last, "v")).
                                       by(select(all, "v").count(local)).as("triple").
                  coalesce(select("x","y").as("a").
                             select("triples").unfold().as("t").
                             select("x","y").where(eq("a")).
                             select("t"),
                           store("triples")).
                  select("z").as("length").
                  select("triple").select("z").where(eq("length"))).
           select(all, "v").unfold(). 
           groupCount().next() 

This works fine on the following toy graph, also taken from the recipe:

g.addV().property(id,'A').as('a').
           addV().property(id,'B').as('b').
           addV().property(id,'C').as('c').
           addV().property(id,'D').as('d').
           addV().property(id,'E').as('e').
           addV().property(id,'F').as('f').
           addE('next').from('a').to('b').
           addE('next').from('b').to('c').
           addE('next').from('b').to('d').
           addE('next').from('c').to('e').
           addE('next').from('d').to('e').
           addE('next').from('e').to('f').iterate()

However, on the MovieLens 100k dataset, as pre-loaded in Neptune, it does not finish in a reasonable time. Presumably the graph is too large and betweenness centrality is too computationally expensive for this traversal. The graph is loaded correctly in Neptune, and standard traversals run without problems.

So my question is: can Gremlin be used for larger graphs, or do we need to rely on NetworkX and other packages, or is there a solution in SPARQL? Which is the most efficient method to compute centrality metrics in AWS Neptune? Thanks a lot!

Upvotes: 1

Views: 389

Answers (1)

Shay

Reputation: 13

Gremlin is probably not the best tool for this. You can try the networkx or igraph Python libraries for graph algorithms instead; Neptune has a Python integration for this kind of workflow:

https://aws.amazon.com/about-aws/whats-new/2022/06/amazon-neptune-simplifies-graph-analytics-machine-learning-workflows-python-integration/
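
To give a rough idea of what such a library does under the hood, here is a minimal sketch of Brandes' algorithm in plain Python, run on the toy graph from the question. It is only an illustration, not Neptune or Gremlin code: the function name `betweenness_centrality`, the `edges` list, and the `adj` dictionary are made up for this example, and it assumes the edges have already been exported from Neptune and are treated as undirected and unweighted. networkx's own `betweenness_centrality` is based on the same algorithm, so in practice you would call that rather than write your own. Note that the scores follow the standard definition (endpoints excluded), so they will not match the counts returned by the Gremlin recipe.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each vertex to an iterable of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


# Toy graph from the question, edges treated as undirected.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

scores = betweenness_centrality(adj)
for vertex in sorted(scores):
    print(vertex, scores[vertex])
```

The point of the sketch is the cost: one breadth-first search per vertex, so roughly O(V·E) for an unweighted graph, whereas the recipe in the question enumerates every simple path with `repeat(both().simplePath())`, which grows much faster on a dense graph.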

Upvotes: 1
