Reputation: 569
I'm looking at Neo4j, and the 32 billion relationship limit has me worried (imagine 40 million users who each upload 500 photos, have 500 friends, make 500 comments, etc., and before you know it you are past 32 billion). So I have some concerns and have to make sure I'm making the best choice on which database to use.
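To make the arithmetic behind that worry concrete, here is a rough back-of-the-envelope sketch. It assumes every photo, friendship, and comment is stored as exactly one relationship per user, which is a simplification (a mutual friendship might be stored once rather than twice); the names are purely illustrative.

```python
# Back-of-the-envelope estimate of relationship count.
# Assumption: each photo, friend link, and comment costs one relationship.
USERS = 40_000_000
PER_USER = {"photos": 500, "friends": 500, "comments": 500}
LIMIT = 32_000_000_000  # the relationship limit mentioned in the question


def total_relationships(users, per_user):
    """Total relationships if every user hits every per-user count."""
    return users * sum(per_user.values())


total = total_relationships(USERS, PER_USER)
print(f"estimated relationships: {total:,}")
print(f"exceeds limit: {total > LIMIT}")
```

Under these assumptions any two of the three categories alone (40 billion) would already pass the limit.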
I'm not looking for subjective answers or debate here (i.e. which one is better, etc.). Rather, since I'm betting a startup's future on which graph database it uses, I need to know the risks the different databases present, such as Neo4j not supporting more than 32 billion relationships.
Now, several companies have called their graph databases the "leading graph database", but let's look past the hype. Which one has the most financial backing? Which database enjoys large community support? Which one has a solid company behind it for commercial support?
Which one is most likely to be mature enough that, if you wanted, you could easily create Facebook with minimal effort?
It's easy to choose a graph database on technical features or familiarity, but I'm looking for more than that. I want to make sure that a few years from now the company is still around. I want to make sure I'm not choosing to go with Neo4j based on hype and the momentum it currently (temporarily?) has...
And what other graph databases can contend with Neo4j to create a full-fledged social network similar to Facebook (again, not looking for better, just looking for a solid competitor)?
Please don't let this turn into a subjective Neo vs Dex debate. Just facts and solid answers, please.
Upvotes: 35
Views: 11104
Reputation: 35665
What you've asked for and what you should be focusing on are two different things.
Although the following does not answer your question, I hope it helps you and other developers consider what's really at play here:
Upvotes: 0
Reputation: 341
To add to the great responses, you also need to consider licensing. If your project is completely open source and fulfills the GPLv3 requirements, then something like Neo4j is a great way to go. However, if you are using it in a proprietary system, you will need to purchase a Neo4j enterprise license or use another database with fewer licensing restrictions (MIT or Apache 2 licenses), like Titan.
This is a great resource to review licenses: http://en.wikipedia.org/wiki/Graph_database
Upvotes: 1
Reputation: 187
My advice is to build your application on standard APIs like Blueprints. The main Blueprints page lists the various implementations available. This way, you won't be locked in and can pick the best implementation based on your needs (size, speed, price) and the state of the market at that point in time.
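As a rough illustration of the idea (not the Blueprints API itself, which is a Java library), the sketch below codes application logic against a small hypothetical `GraphStore` interface. Every name here is made up for the example; the in-memory backend just stands in for whichever real database you would plug in.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class GraphStore(ABC):
    """Hypothetical storage-agnostic interface; NOT the Blueprints API."""

    @abstractmethod
    def add_vertex(self, vertex_id): ...

    @abstractmethod
    def add_edge(self, src, label, dst): ...

    @abstractmethod
    def neighbors(self, vertex_id, label): ...


class InMemoryGraphStore(GraphStore):
    """Toy backend; a real one would wrap a graph database driver."""

    def __init__(self):
        self._vertices = set()
        self._out = defaultdict(list)  # (src, label) -> [dst, ...]

    def add_vertex(self, vertex_id):
        self._vertices.add(vertex_id)

    def add_edge(self, src, label, dst):
        if src not in self._vertices or dst not in self._vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        self._out[(src, label)].append(dst)

    def neighbors(self, vertex_id, label):
        return list(self._out.get((vertex_id, label), []))


def friends_of_friends(store, user):
    """Application logic that depends only on the GraphStore interface."""
    direct = set(store.neighbors(user, "FRIEND"))
    result = set()
    for friend in direct:
        for candidate in store.neighbors(friend, "FRIEND"):
            if candidate != user and candidate not in direct:
                result.add(candidate)
    return result


store = InMemoryGraphStore()
for name in ("alice", "bob", "carol"):
    store.add_vertex(name)
store.add_edge("alice", "FRIEND", "bob")
store.add_edge("bob", "FRIEND", "alice")
store.add_edge("bob", "FRIEND", "carol")
print(friends_of_friends(store, "alice"))
```

Because `friends_of_friends` only touches the interface, swapping the backend later would mean writing one new `GraphStore` subclass rather than rewriting the application.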
Upvotes: 10
Reputation: 1555
Michael beat me to the punch, but let me add, answering for Neo4j, and letting others respond about other technologies.
The link below includes a variety of facts about the state of the Neo4j community, product adoption, and the company behind the product:
http://blog.neo4j.org/2013/01/2012-year-in-review-happy-2013-it-looks.html
The link below speaks to this year's roadmap, which among other things will lift the current size limit. The limit is simply a space-performance optimization that was chosen back when price-performance ratios were a little different. We'll do the work this year to increase a few pointer sizes, and release a version with no practical upper limit in the next several months:
http://blog.neo4j.org/2013/01/2013-whats-coming-next-in-neo4j.html
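To show how a pointer size translates into a record limit, here is a small sketch. The 35-bit width is an assumption chosen because 2^35 is 32 x 2^30, which matches the "32 billion" figure discussed in this thread; the post itself does not state the exact widths, and the 40-bit value is only a hypothetical wider pointer.

```python
def max_records(id_bits):
    """Number of distinct records addressable with an id of this many bits."""
    return 2 ** id_bits


ASSUMED_CURRENT_BITS = 35  # assumption, not stated in the post
HYPOTHETICAL_WIDER_BITS = 40  # illustrative only

print(f"{ASSUMED_CURRENT_BITS} bits: {max_records(ASSUMED_CURRENT_BITS):,}")
print(f"{HYPOTHETICAL_WIDER_BITS} bits: {max_records(HYPOTHETICAL_WIDER_BITS):,}")
```

Each extra bit doubles the addressable range, so widening a pointer by even a few bits moves the ceiling a long way at the cost of a slightly larger record on disk.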
There are production installations with half the Facebook social graph in a Neo4j cluster, on the back of highly active web sites. The only cross-region Amazon database cluster that I'm aware of (for any database management system) is one that's running on Neo4j: 10 instances spread between the U.S., Asia, and Europe. One of the world's largest parcel delivery services does all of its package routing using Neo4j, routing 2000+ packages per second at peak. Decisions are made in real time literally as packages slide down a chute. They went live last Fall and Christmas was able to happen for tens of millions of people. Lots more. This is a sampling.
Welcome to the awesome world of graphs! Whatever solution you end up choosing, we're glad to have you as part of the graph database community.
Philip
Upvotes: 10
Reputation: 41706
Disclaimer: I work for/with Neo4j
Just talking about the maturity here (not technicalities) - Neo Technology as a company with more than 50 employees, $25M funding and a thriving user-base with half a million downloads, 30k new databases running each month and an active community won't go away. You can also check the SO questions to see the community activity.
We have a healthy set of customers in many domains from big ones like Adobe (runs creative cloud on Neo4j), Cisco (Org-Management, MDM), social networks like Viadeo and many Job search companies (GlassDoor, and others) to startups like fiftythree who published the popular "Paper" app on iOS.
Our community site neo4j.org should be a good place to get started. There you'll find introductory content as well as information on programming languages, drivers and deployments.
Emil, Ian and Jim wrote an introductory book about "graph databases" with O'Reilly which is currently available as a free ebook download.
So you see, we're not just taking care of our own product but also of the bigger graph ecosystem, with many conference talks, meetup groups (41 worldwide) and support of the open source ecosystem.
Hope that helps you decide.
P.S. Regarding your concerns: The size limits (which are artificial anyway) will be increased this year.
Upvotes: 17
Reputation: 2312
We've been working with Neo4j since 2010, and we are not only betting our company on it but have also invested quite a lot of time in an open source project (http://www.ohloh.net/p/structr). There's a blog post from Feb 2012 where you can read the details:
http://structr.org/blog/the-story-behind-structr
Admittedly, our company is quite small. But we've done, and are doing, about a dozen projects with Neo4j, and are really happy with the outcome.
The community behind Neo4j is vibrant, open, and always very supportive. You should go to one of the meetup events to get an idea. :-)
Like Richard said, the financial facts are beyond question. What I find most impressive is that the folks at Neo Technology, despite being a commercial company that has to generate revenue, are real enthusiasts who know and love what they do and are truly committed to the Open Source model.
So yes, I'm biased, but not without reason. :-)
Upvotes: 6
Reputation: 8202
So I've been testing and working with graph databases for the last year. I think only you know your data well enough to be able to make an educated guess as to whether you're going to have any nodes needing more than 32 billion relationships. I would argue there are not a lot of use cases right now where this is a limitation for most people. But that's not absolute.
Neo4j is a brilliant product. It is well documented, and folks like maxdemarzi write excellent blog posts (see http://maxdemarzi.com/) that will bring anyone up to speed on the power and sophistication of Neo4j pretty quickly. (Plus he's a nice guy who'll answer your questions if you have them.)
If scale is an issue, I'd also recommend you take a look at Titan - http://thinkaurelius.github.com/titan/. The guys behind this are brilliant and it's intended for massive scale. It's not as established in the market as Neo4j, but it has a lot of power and gives you some flexibility on priorities by letting you select between Cassandra, HBase and BerkeleyDB for underlying storage.
Neo4j is a well backed, well funded company with real revenues. It isn't going anywhere. Titan is smaller but I think is on a rapid upward curve.
The truth is, though, it's all a new space. You're not getting anything as established as Postgres or MySQL, or with the corporate strength of Oracle. Let's not kid ourselves.
However the graph database community is relatively small, friendly and helpful. It runs great events - I was at Neo4j's GraphCon event which was awesome, and I've been to some talks by the Titan guys which were great. Ultimately if you want to be Facebook though, whatever you start with you'll end up building your own infrastructure. There's scale and then there's you-need-to-own-datacenters-the-size-of-small-countries scale.
One final thought. The problem of 40 million users and the underlying infrastructure challenges is a problem for a well-established, well-funded company. You don't get to 40 million users without attracting the funding or generating the revenue necessary to finance building out your own infrastructure. You can plan now for when you have 40 million users, absolutely. Go for it. That's the fun of the early stages in a startup. But your bigger problem is getting to your first million, or even ten million. For that, use whichever of these databases gets you to market fastest with a solid product.
Upvotes: 15