Reputation: 2269
I was reading a book recommended on Neo4j site: http://neo4j.com/books/graph-databases/ about graph database performance and it said:
"In contrast to relational databases, where join-intensive query performance deteriorates as the dataset gets bigger, with a graph database performance tends to remain relatively constant, even as the dataset grows. This is because queries are localized to a portion of the graph. As a result, the execution time for each query is proportional only to the size of the part of the graph traversed to satisfy that query, rather than the size of the overall graph."
So e.g. I want to return only nodes with a label "Doctor, that's localized to a portion of a graph. But my question is how does the database itself know where those nodes are ? In other words, does it not need to traverse all nodes to find out whether or not they satisfy the query and make decision based on that ?
Upvotes: 1
Views: 330
Reputation: 41676
In general localized searches mean: you start from a smallish set of starting points which can be people, products, places, orders etc.
A portion of the graph that is annotated with a label, often doesn't fall into that category, i.e. all doctors are not a smallish set of starting points.
Your query would probably touch a large portion of the graph if you traverse out from all doctors to their neighborhoods.
A query like this would be a graph local one:
MATCH (:City {name:"SFO"})<-[:RESIDES_IN]-(d:Doctor)-[presc:PRESCRIBES]->(m:Medicine)
RETURN d.name, m.name, sum(presc.amount) as amount
Upvotes: 1
Reputation: 10856
Neo4j has a special indexing for node labels so that it can find all nodes for a label without searching all nodes. Beyond that you can:
Upvotes: 1