Reputation: 205
I understand that the Neo4j Community edition does not provide a dump. What is the best way to measure the size of the underlying Neo4j database? Will doing du provide a good estimate?
Thank you.
Upvotes: 5
Views: 8090
Reputation: 5844
Some ways to view the database size in terms of bytes:
du -hc $NEO4J_HOME/data/databases/graph.db/*store.db*
From the dashboard:
http://localhost:7474/webadmin/
And view the 'database disk usage' indicator.
'Server Info' from the dashboard:
http://localhost:7474/webadmin/#/info/org.neo4j/Store%20file%20sizes/
And view the 'TotalStoreSize' row.
Finally, open the database drawer, scroll all the way down, and view the 'Database' section.
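If you would rather get the same figure programmatically, the du command above can be approximated with a small Java sketch. It sums the sizes of the files matching the same *store.db* glob. The class name StoreSize, the method totalStoreBytes, and the default directory are made up for illustration, and the result is the logical file size, which can differ slightly from the block-based figure du reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StoreSize {
    // Sums the sizes (in bytes) of the regular files directly inside storeDir
    // whose names match *store.db*, the same glob used in the du command.
    static long totalStoreBytes(Path storeDir) {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(storeDir, "*store.db*")) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    total += Files.size(file);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed layout: pass the graph.db directory as the first argument,
        // otherwise fall back to a path relative to the working directory.
        Path storeDir = Paths.get(args.length > 0 ? args[0] : "data/databases/graph.db");
        System.out.println("Store size in bytes: " + totalStoreBytes(storeDir));
    }
}
```

Run it with the path to your graph.db directory as the only argument.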
Upvotes: 3
Reputation: 66999
You can use Cypher to get the number of nodes:
MATCH ()
RETURN COUNT(*) AS node_count;
and the number of relationships:
MATCH ()-->()
RETURN COUNT(*) AS rel_count;
Upvotes: 1
Reputation: 7501
What do you mean by "the Neo4j Community edition does not provide a dump"?
The Enterprise edition doesn't provide anything like this either. Are you looking for statistics on the size of the DB in terms of raw disk or a count of Nodes/Relationships?
If disk: use du -sh on Linux, or check the folder size on Windows.
If nodes/relationships: you'll have to write Java code to get the true count, as the count on the Web Console is not always accurate. You could also get a rough count by taking the size on disk of the node store file and dividing it by 9 (bytes per node record), and the size of the relationship store file divided by 33 (bytes per relationship record).
Java code would look like this:
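import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.tooling.GlobalGraphOperations;

// db is an open GraphDatabaseService; on Neo4j 2.x, run this inside a transaction.
long relationshipCounter = 0;
long nodeCounter = 0;
GlobalGraphOperations ggo = GlobalGraphOperations.at(db);
for (Node n : ggo.getAllNodes()) {
    nodeCounter++;
    try {
        // Count only outgoing relationships so each one is counted once, not once per end node.
        for (Relationship relationship : n.getRelationships(Direction.OUTGOING)) {
            relationshipCounter++;
        }
    } catch (Exception e) {
        logger.error("Error with node: {}", n, e);
    }
}
System.out.println("Number of Relationships: " + relationshipCounter);
System.out.println("Number of Nodes: " + nodeCounter);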
The Web Console count isn't always accurate because it reads the highest ID in use from a file, and Neo4j uses a delete marker for nodes, so a range of "deleted" nodes can inflate the reported total. Eventually Neo4j will compact and remove these nodes, but it doesn't do so in real time.
The file size can be misleading for the same reason. The only reliable way is to go through all nodes and relationships and check the "isActive" marker.
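For completeness, the rough estimate mentioned earlier (store file size divided by record size) could be sketched as below. The class and method names are hypothetical, the file sizes in main are made-up sample values, and the record sizes of 9 and 33 bytes are the ones quoted in this answer. Check them against your Neo4j version before relying on them. Because deleted records still occupy space, treat the result as an upper bound, not an exact count.

```java
class RecordEstimate {
    // Record sizes in bytes as quoted in the answer; assumed to be version-specific.
    static final long NODE_RECORD_BYTES = 9;
    static final long RELATIONSHIP_RECORD_BYTES = 33;

    // Upper bound on the number of records a store file of this size can hold.
    static long estimateRecords(long storeFileBytes, long recordBytes) {
        if (recordBytes <= 0) {
            throw new IllegalArgumentException("recordBytes must be positive");
        }
        return storeFileBytes / recordBytes;
    }

    public static void main(String[] args) {
        // Hypothetical store file sizes; substitute the sizes reported by du.
        long nodeStoreBytes = 9_000L;
        long relationshipStoreBytes = 66_000L;
        System.out.println("Nodes (at most): "
                + estimateRecords(nodeStoreBytes, NODE_RECORD_BYTES));
        System.out.println("Relationships (at most): "
                + estimateRecords(relationshipStoreBytes, RELATIONSHIP_RECORD_BYTES));
    }
}
```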
Upvotes: 3