Morinaga

Reputation: 35

Estimating the size of an AWS Neptune graph database

I am currently building a graph using AWS Neptune. Is there a way of determining or calculating the size of a filled database with AWS Neptune?

Upvotes: 1

Views: 1816

Answers (3)

The-Big-K

Reputation: 2820

There is already an answer in this post, but I am posting another with a bit more detail, as the previous answer does not mention whether the storage figure includes space used by replication, deleted data, etc.

As @Morinaga already pointed out, CloudWatch exposes the number of bytes used by actual data pages under AWS/Neptune -> By Cluster -> VolumeBytesUsed. This shows the exact storage that you get charged for. Internally, Neptune uses distributed storage for the data, which includes multiple copies, some additional storage for metadata, etc. None of that affects how you get billed, so it is not included in VolumeBytesUsed.

Neptune also supports copy-on-write, where you can create a cloned volume from another cluster. One thing to note with cloned volumes is that the new cluster only takes up space for pages that have diverged from the source. So when you plot the VolumeBytesUsed metric for a clone, you would see a much smaller number for the clone as long as the source cluster still exists. If you delete the source cluster, the space is then re-adjusted in the clones. Do make a note of this to avoid any possible confusion later on.

The last thing to note is that Neptune, as of Sept 2020, does not do volume shrinking. VolumeBytesUsed is pretty much a high watermark of how many data pages were used. Deleting a lot of data just clears the data in the data pages; it does not remove the pages from the volume. So if you create a cluster, add a bunch of data and then delete everything, your VolumeBytesUsed would still show the high watermark. When you insert new data, we would reuse the available data pages first, so you don't end up paying for new data pages.
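To make the high watermark behaviour concrete, here is a rough toy model. It is only an illustration of the accounting described above, not Neptune's actual storage engine; the class `ToyVolume` and its page counts are hypothetical.

```python
class ToyVolume:
    """Toy accounting model (hypothetical): allocated pages never shrink,
    and freed pages are reused before any new page is allocated."""

    def __init__(self):
        self.pages_in_use = 0    # pages that currently hold live data
        self.high_watermark = 0  # pages allocated so far (what the metric would report)

    def insert(self, pages):
        self.pages_in_use += pages
        # New pages are only allocated once the free ones are used up.
        self.high_watermark = max(self.high_watermark, self.pages_in_use)

    def delete(self, pages):
        # Deleting clears pages but does not give them back to the volume.
        self.pages_in_use = max(0, self.pages_in_use - pages)


volume = ToyVolume()
volume.insert(100)  # in use: 100, high watermark: 100
volume.delete(100)  # in use: 0,   high watermark: still 100
volume.insert(60)   # reuses freed pages; high watermark stays at 100
volume.insert(70)   # in use: 130, so the high watermark grows to 130
print(volume.pages_in_use, volume.high_watermark)
```

In this sketch the reported size only grows once the live data exceeds the previous peak, which is the behaviour to expect from VolumeBytesUsed after large deletes.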

Upvotes: 3

Morinaga

Reputation: 35

Amazon CloudWatch can be used to figure out the exact size of your filled database.

Under Metrics, select Neptune and search for MetricName='VolumeBytesUsed'. This shows the amount of storage, in bytes, used by the data in your database.
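The same metric can also be read programmatically. Below is a minimal sketch using boto3's CloudWatch client and its `get_metric_statistics` call. It assumes the metric is published in the `AWS/Neptune` namespace under the `DBClusterIdentifier` dimension, and that credentials and a region are already configured; the cluster name `my-neptune-cluster` and the helper names are placeholders of mine.

```python
from datetime import datetime, timedelta, timezone


def latest_volume_bytes_used(cloudwatch, cluster_id, lookback_hours=3):
    """Return the newest VolumeBytesUsed value in bytes, or None if there are no datapoints."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=lookback_hours)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Neptune",
        MetricName="VolumeBytesUsed",
        # Assumption: the metric is reported per cluster under this dimension.
        Dimensions=[{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        StartTime=start,
        EndTime=end,
        Period=300,  # seconds per datapoint
        Statistics=["Maximum"],
    )
    datapoints = response.get("Datapoints", [])
    if not datapoints:
        return None
    # Datapoints are not guaranteed to arrive sorted, so pick the newest explicitly.
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return newest["Maximum"]


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024 ** 3


def main(cluster_id="my-neptune-cluster"):  # placeholder cluster name
    import boto3  # third-party; imported here so the helpers above work without it

    size = latest_volume_bytes_used(boto3.client("cloudwatch"), cluster_id)
    print("no datapoints found" if size is None else f"{to_gib(size):.2f} GiB")
```

Calling `main()` with your own cluster identifier prints the most recent value. If nothing comes back, try widening `lookback_hours`, since the metric may not be reported every five minutes.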

Upvotes: 2

Alex P.

Reputation: 91

It really depends on how much data you store in vertex and edge properties. Taylor's answer here explains more; storage capacity is dynamically allocated in Amazon Neptune.

Upvotes: 0
