Reputation: 219
Google has described a novel framework for distributed processing of massive graphs.
http://portal.acm.org/citation.cfm?id=1582716.1582723
Are there any open-source implementations of this framework, in the way that Hadoop is an open-source implementation of MapReduce?
I am in the process of writing a pseudo-distributed one using Python and the multiprocessing module, and wanted to know whether anyone else has tried implementing it, since public information about this framework is extremely scarce (the link above and a blog post at Google Research).
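For reference, the vertex-centric superstep model can be sketched in a few lines of single-process Python. This is only a minimal illustration, not Google's implementation: the function name `pregel_max`, the dict-based graph layout, and the choice of "propagate the maximum value" as the example computation are all assumptions made for the sketch. A pseudo-distributed version would partition the vertices across worker processes and exchange the outboxes at each superstep barrier.

```python
from collections import defaultdict


def pregel_max(graph, values, max_supersteps=50):
    """Propagate the maximum value to every vertex, Pregel-style.

    graph:  dict mapping each vertex to a list of its out-neighbours
            (hypothetical layout; every vertex must appear as a key)
    values: dict mapping each vertex to its initial numeric value
    """
    values = dict(values)          # do not mutate the caller's dict
    active = set(graph)            # vertices that have not voted to halt
    inbox = defaultdict(list)      # messages delivered in this superstep

    for superstep in range(max_supersteps):
        # Stop when every vertex has halted and no messages are in flight.
        if not active and not inbox:
            break
        outbox = defaultdict(list)
        # A vertex computes if it is active or was woken up by a message.
        for v in set(active) | set(inbox):
            messages = inbox.get(v, [])
            new_value = max([values[v]] + messages)
            changed = new_value != values[v]
            values[v] = new_value
            # Send only in the first superstep or when the value changed.
            if superstep == 0 or changed:
                for target in graph[v]:
                    outbox[target].append(new_value)
            active.discard(v)      # vote to halt until a message arrives
        inbox = outbox             # barrier: deliver messages next superstep
    return values


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(pregel_max(graph, {"a": 3, "b": 6, "c": 2, "d": 1}))
```

Each vertex only reads messages sent in the previous superstep and only writes its own value, which is what makes the per-vertex work safe to run in parallel.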
Upvotes: 21
Views: 5075
Reputation: 1372
Upvotes: 15
Reputation: 4814
Yes. Golden Orb is a new open-source Pregel implementation written in Java that runs on both HBase and Cassandra.
It has been submitted to the Apache Incubator for approval, and Ravel, the company behind Golden Orb, said they are releasing it this month (http://www.raveldata.com/goldenorb/).
See http://www.quora.com/Graph-Databases/What-open-source-graph-databases-support-horizontal-scaling
UPDATE: GraphX is GraphLab2 on Spark implemented by Joey Gonzalez, the creator of GraphLab2.
Spark's unique primitives make GraphX-Pregel the fastest JVM-based Pregel implementation. Spark is written in Scala, but it also has Java and Python APIs.
See...
P.S. There is also Bagel, which was the first cut at Pregel on Spark. It works; however, GraphX will be the way forward.
Upvotes: 3
Reputation: 1
Stanford students have developed an open-source implementation of Pregel: http://infolab.stanford.edu/gps/
Upvotes: 0
Reputation: 497
There is also Signal/Collect, a framework written in Scala and now using Akka: http://code.google.com/p/signal-collect/
https://github.com/uzh/signal-collect
In Signal/Collect an algorithm is written from the perspective of vertices and edges. Once a graph has been specified the edges will signal and the vertices will collect. When an edge signals it computes a message based on the state of its source vertex. This message is then sent along the edge to the target vertex of the edge. When a vertex collects it uses the received messages to update its state. These operations happen in parallel all over the graph until all messages have been collected and all vertex states have converged.
Many algorithms have very simple and elegant implementations in Signal/Collect. You can find more information about the programming model and features in the project wiki. Please take the time to explore some of the example algorithms below.
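To make the signal and collect steps concrete, here is a small Python sketch of single-source shortest paths written in that style. It is not the Signal/Collect API (the real framework is Scala and runs these operations in parallel); the function name `signal_collect_sssp`, the edge-tuple layout, and the synchronous loop are assumptions made for illustration, and edge weights are assumed to be non-negative.

```python
import math


def signal_collect_sssp(edges, source):
    """Shortest path lengths from `source`, in a signal/collect style.

    edges: list of (source_vertex, target_vertex, weight) tuples
           (hypothetical layout; weights assumed non-negative)
    """
    vertices = {v for s, t, _ in edges for v in (s, t)}
    state = {v: math.inf for v in vertices}
    state[source] = 0.0

    while True:
        # Signal: every edge computes a message from the state of its
        # source vertex and sends it to its target vertex.
        inbox = {v: [] for v in vertices}
        for s, t, weight in edges:
            if state[s] != math.inf:
                inbox[t].append(state[s] + weight)

        # Collect: every vertex uses the received messages to update
        # its own state.
        converged = True
        for v, messages in inbox.items():
            new_state = min([state[v]] + messages)
            if new_state != state[v]:
                state[v] = new_state
                converged = False

        # Stop once no vertex state changed in a full round.
        if converged:
            return state


edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)]
print(signal_collect_sssp(edges, "a"))
```

The signal step lives on the edges and the collect step on the vertices, so a different algorithm only needs a different pair of functions for those two steps.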
Upvotes: 2
Reputation: 1120
Two projects from Carnegie Mellon University provide Pregel-style computation on graphs:
The programming model is not exactly the same as Pregel's: it is based not on messaging but on modifying the graph (edge, vertex) data directly. Still, it is easy to emulate Pregel in these frameworks.
Upvotes: 2
Reputation: 13927
The main Hadoop project for distributed graph processing is the Hama project. It's still in incubation, though.
The project has split its work into two areas: a matrix package and a graph package.
Update:
A better option would be the Apache Giraph project which is based on Google Pregel.
Upvotes: 4
Reputation: 2294
Apache Giraph is currently in Incubator and under very active development, with committers from LinkedIn, Twitter, Facebook and academia looking to bring it up to production scale very quickly. It is pretty directly modeled on Pregel and was originally developed at Yahoo! Research. We're looking for new contributors and have several introductory JIRA issues to help people get started with the project. We'd love to have you get involved.
Upvotes: 1
Reputation: 2925
I created a framework called Phoebus. It is an implementation of Pregel written in Erlang. Check out my blog entry on applying the Pregel model to path finding as well.
Upvotes: 1