net_j

Reputation: 205

How to measure neo4j database size for community edition?

I understand that the Neo4j Community edition does not provide a dump. What is the best way to measure the size of the underlying Neo4j database? Will running du on the database directory give a good estimate?

Thank you.

Upvotes: 5

Views: 8090

Answers (3)

Robert Brisita

Reputation: 5844

Some ways to view the database size in bytes:

du -hc $NEO4J_HOME/data/databases/graph.db/*store.db*

From the dashboard:

http://localhost:7474/webadmin/

And view the 'database disk usage' indicator.

'Server Info' from the dashboard:

http://localhost:7474/webadmin/#/info/org.neo4j/Store%20file%20sizes/

And view the 'TotalStoreSize' row.

Finally, open the database drawer, scroll all the way down, and view the 'Database' section.
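Where du is not available (for example on Windows), the same measurement can be approximated in Java. This is a minimal sketch, not part of the original answer: the class name StoreDirSize is hypothetical, and the default path simply mirrors the one in the du command above. It sums logical file sizes, so the result can differ slightly from du, which reports allocated disk blocks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class StoreDirSize {

    // Sums the sizes (in bytes) of all regular files under dir, recursively.
    static long directorySize(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths
                .filter(Files::isRegularFile)
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed location relative to NEO4J_HOME; pass the real store
        // directory as the first argument if yours differs.
        Path dir = Paths.get(args.length > 0 ? args[0] : "data/databases/graph.db");
        if (!Files.isDirectory(dir)) {
            System.out.println("Not a directory: " + dir);
            return;
        }
        System.out.println("Store size in bytes: " + directorySize(dir));
    }
}
```

Run it with the store directory as the only argument, for example: java StoreDirSize /path/to/graph.db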


Upvotes: 3

cybersam

Reputation: 66999

You can use Cypher to get the number of nodes:

MATCH ()
RETURN COUNT(*) AS node_count;

and the number of relationships:

MATCH ()-->()
RETURN COUNT(*) AS rel_count;

Upvotes: 1

Nicholas

Reputation: 7501

What do you mean by "the Neo4j Community edition does not provide a dump"?

The Enterprise edition doesn't provide anything like this either. Are you looking for statistics on the size of the DB in terms of raw disk or a count of Nodes/Relationships?

If Disk: Use du -sh on Linux, or check the folder size on Windows.

If Node/Relationship: You'll have to write Java code to get the true count, as the count shown in the Web Console is not always accurate. You could also get a rough count by taking the size of the store file on disk and dividing by the record size: 9 bytes for the node store and 33 bytes for the relationship store.

Java code would look like this:

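    // Needs org.neo4j.graphdb.Direction, Node and Relationship, plus
    // org.neo4j.tooling.GlobalGraphOperations; db is an open GraphDatabaseService.
    // On Neo4j 2.x, run this inside a transaction.
    long relationshipCounter = 0;
    long nodeCounter = 0;
    GlobalGraphOperations ggo = GlobalGraphOperations.at(db);
    for (Node n : ggo.getAllNodes()) {
        nodeCounter++;
        try {
            // Count only outgoing relationships so that each relationship is
            // counted once rather than once per endpoint.
            for (Relationship relationship : n.getRelationships(Direction.OUTGOING)) {
                relationshipCounter++;
            }
        } catch (Exception e) {
            logger.error("Error with node: {}", n, e);
        }
    }
    System.out.println("Number of Relationships: " + relationshipCounter);
    System.out.println("Number of Nodes: " + nodeCounter);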

The reason the Web Console isn't always accurate is that it checks a file for the highest value, and Neo4j uses a delete marker for nodes, so there could be a range of "deleted" nodes that inflate the total node count. Eventually Neo4j will compact the store and remove these nodes, but it doesn't do so in real time.

The reason why the file size may lie is the same as above. The only true way is to go through all nodes and relationships to check for the "isActive" marker.
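The size-based estimate mentioned earlier (store size divided by 9 for nodes and 33 for relationships) could be sketched as follows. The class name RecordCountEstimate is hypothetical, and the store file names neostore.nodestore.db and neostore.relationshipstore.db, like the record sizes themselves, are assumptions that depend on the Neo4j version. As explained above, deleted records still occupy space, so treat the result as an upper bound rather than a true count.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RecordCountEstimate {

    // Record sizes in bytes as quoted in this answer; they vary by Neo4j version.
    static final long NODE_RECORD_BYTES = 9;
    static final long REL_RECORD_BYTES = 33;

    // Rough number of records in a store file: file size divided by record size.
    static long estimateRecords(long storeFileBytes, long recordBytes) {
        if (recordBytes <= 0) {
            throw new IllegalArgumentException("recordBytes must be positive");
        }
        return storeFileBytes / recordBytes;
    }

    public static void main(String[] args) throws IOException {
        // Assumed store directory and file names; adjust to your installation.
        Path storeDir = Paths.get(args.length > 0 ? args[0] : "data/databases/graph.db");
        Path nodeStore = storeDir.resolve("neostore.nodestore.db");
        Path relStore = storeDir.resolve("neostore.relationshipstore.db");
        if (!Files.isRegularFile(nodeStore) || !Files.isRegularFile(relStore)) {
            System.out.println("Store files not found under " + storeDir);
            return;
        }
        System.out.println("Estimated nodes: "
            + estimateRecords(Files.size(nodeStore), NODE_RECORD_BYTES));
        System.out.println("Estimated relationships: "
            + estimateRecords(Files.size(relStore), REL_RECORD_BYTES));
    }
}
```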

Upvotes: 3
