Reputation: 293
I'm using Neo4j to create a network database that consists of:
- a taxi company (1 node), which contains
- cities (1,000 nodes), each of which contains
- taxis (100 nodes per city, 100,000 in total).
- Each taxi has a 'fee' calculated twice a day, so that's 2 fee nodes per taxi per day. A taxi node is connected to each of its fee nodes by a relationship whose property is the date, because when I want to retrieve a collection of fees I'll match them by date.

Consequently, each city will have two 'total fees' per day (the total amount its taxis earned), so I can calculate the difference.
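A minimal Cypher sketch of that model (the labels, property names, and the `:PAID` and `:HAS_TAXI` relationship types here are my assumptions, not names from the question):

```cypher
// One city, one taxi, and the two daily fee nodes, linked by a
// relationship that carries the date so fees can be matched by date.
CREATE (c:City {name: 'Springfield'})
CREATE (t:Taxi {plate: 'TX-0001'})
CREATE (c)-[:HAS_TAXI]->(t)
CREATE (t)-[:PAID {date: '2015-06-01', period: 'AM'}]->(:Fee {amount: 42.50})
CREATE (t)-[:PAID {date: '2015-06-01', period: 'PM'}]->(:Fee {amount: 38.00});

// Total fees per city for a given date:
MATCH (c:City)-[:HAS_TAXI]->(:Taxi)-[p:PAID {date: '2015-06-01'}]->(f:Fee)
RETURN c.name, sum(f.amount) AS totalFees;
```

The second query is the "match by date" retrieval the question describes: filtering on the relationship's `date` property and aggregating the fee amounts per city.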
I need to do this for 6 months: 2 fee nodes per taxi per day × 100,000 taxis = 200,000 fee nodes per day, which over ~180 days is roughly 36,000,000 fee nodes. That's a LOT of nodes and a lot of disk space on an HDD, so my question is:
Is there a way to optimize the disk storage of such a large dataset, or a way to compress it?
Upvotes: 0
Views: 720
Reputation: 45043
You can use the Neo4j Hardware Sizing Calculator to estimate how much space you will need to store all that data.
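You can also do the back-of-envelope arithmetic yourself in Cypher. The record sizes below (node 15 B, relationship 34 B, property 41 B) are approximate figures for the Neo4j 2.x store format, so treat the result as an order-of-magnitude estimate only:

```cypher
// Rough store-size estimate for ~36 M fee nodes, each with one
// relationship (carrying a date property) and one fee amount property.
// Record sizes are approximate Neo4j 2.x store-format figures.
WITH 36000000 AS fees
RETURN
  fees * 15                               AS nodeStoreBytes,     // node records
  fees * 34                               AS relStoreBytes,      // relationship records
  fees * 2 * 41                           AS propertyStoreBytes, // date + amount properties
  fees * (15 + 34 + 2 * 41) / 1024 / 1024 AS approxTotalMiB;
```

That comes out to a few GB for the raw store files, before indexes, transaction logs, and filesystem overhead, which is modest for a modern disk.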
These days it rarely makes sense to optimize for space on disk, because disks are so cheap.
Neo4j already does some compression for you: http://neo4j.com/docs/stable/property-compression.html
You can also use filesystem compression, but it will have a significant impact on performance.
Upvotes: 2