Jay

Reputation: 359

Neo4j representation of graph - internals

I have a question regarding how a graph in Neo4j is loaded into memory from disk.

Reading the link here, I think I understand how the graph is represented on disk. When a new Neo4j database is created, physically separate files are created, mainly for the node, edge and property stores.

When you issue a query to Neo4j, does it:

1) Load the entire graph (nodes, edges, properties) into memory using a doubly linked list structure?

OR

2) Determine the nodes and edges required for the query and populate the list structure with random accesses to the relevant stores (nodes, edges) on disk? If so, how does Neo4j minimize the number of disk accesses?

Upvotes: 1

Views: 184

Answers (1)

Stefan Armbruster

Reputation: 39905

As frobberOfBits mentions, it's more like #2. Disk accesses are minimized by a two-layered cache architecture, which is best described in the reference manual. Even if your cache is smaller than the store files, a cache miss mostly comes down to a seek operation (records have a fixed length, so a record's position follows directly from its ID) followed by a read. These operations are typically fast, and faster still on appropriate hardware such as SSDs.
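To illustrate why a fixed record length keeps cache misses cheap, here is a rough Python sketch of the idea. It is not Neo4j's code or file format: the record layout (`RECORD`), the `FixedRecordStore` class, the file name and the single small LRU cache are all made up for the example, and the real system has two cache layers rather than one.

```python
import os
import struct
import tempfile
from collections import OrderedDict

# Hypothetical record layout (NOT Neo4j's real format):
# in-use flag, id of first relationship, id of first property.
RECORD = struct.Struct("<?qq")  # fixed size: 1 + 8 + 8 = 17 bytes


def write_store(path, records):
    """Write the records back to back, so record i starts at i * RECORD.size."""
    with open(path, "wb") as f:
        for record in records:
            f.write(RECORD.pack(*record))


class FixedRecordStore:
    """Reads records by id, with a small LRU cache in front of the file."""

    def __init__(self, path, cache_size=2):
        self._file = open(path, "rb")
        self._cache = OrderedDict()
        self._cache_size = cache_size
        self.disk_reads = 0  # counts cache misses that hit the file

    def read(self, record_id):
        if record_id in self._cache:
            # Cache hit: no file access at all.
            self._cache.move_to_end(record_id)
            return self._cache[record_id]
        # Cache miss: one seek to a computed offset, then one read.
        self._file.seek(record_id * RECORD.size)
        record = RECORD.unpack(self._file.read(RECORD.size))
        self.disk_reads += 1
        self._cache[record_id] = record
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict least recently used
        return record

    def close(self):
        self._file.close()


def demo():
    records = [(True, i * 10, i * 100) for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nodes.db")  # hypothetical file name
        write_store(path, records)
        store = FixedRecordStore(path)
        first = store.read(3)   # miss: seek + read
        second = store.read(3)  # hit: served from the cache
        reads = store.disk_reads
        store.close()
    return first, second, reads


if __name__ == "__main__":
    print(demo())
```

The point is that no index lookup or scan is needed to find a record: the offset is just the id multiplied by the record size, so a miss costs one seek and one short read.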

Upvotes: 2
