Ryan

Reputation: 1242

AWS Neptune Node counts timing out

We're running a large bulk load into AWS Neptune and can no longer query the graph for node counts without the query timing out. What options do we have for auditing the total counts in the graph?

The count query fails both from curl and from a SageMaker notebook.

Upvotes: 3

Views: 676

Answers (1)

Kelvin Lawrence

Reputation: 14371

There are a few things you could consider.

  1. The easiest is to just increase the timeout specified in the cluster and/or instance parameter group, so that the query can (hopefully) complete.
  2. If your Neptune engine version is 1.0.5.x then you can use the DFE engine to improve Gremlin count performance. You just need to enable the DFE engine using DFEQueryEngine=viaQueryHint in the cluster parameter group.
  3. If you get the status of the load, it will show a value for the number of records processed so far. In this context a record is not a row from a CSV or RDF file. Instead, it is the count of triples loaded in the RDF case, and the count of property values and labels in the property graph case. As a simple example, imagine a CSV file with 100 rows where each row has 6 columns. Not counting the ID column, each row holds a label and 4 properties, so the total number of records to load will be 100*5, i.e. 500. If you have sparse rows, the calculation will be approximate unless you add up every non-ID column that has an actual value.
  4. If you have the Neptune streams feature enabled, you can inspect the stream and find the last vertex or edge created. Note that enabling streams just for this purpose may not be ideal, as adding to the stream introduces some overhead and will slow down the load.
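
To make the record arithmetic in item 3 concrete, here is a minimal Python sketch that estimates the expected record count for a vertex CSV by counting every non-empty, non-ID cell. It assumes the ID column is named `~id` (as in the Neptune Gremlin CSV load format) and that the file fits in memory; the function names are illustrative, not part of any Neptune tooling. The second helper reads `totalRecords` from a loader status response, assuming the documented `payload.overallStatus` layout, so the two numbers can be compared.

```python
import csv
import io


def estimate_loader_records(csv_text, id_column="~id"):
    """Count non-empty values in every column except the ID column.

    This mirrors the rule of thumb from the answer: each label or
    property value counts as one record, the ID itself does not.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    total = 0
    for row in reader:
        for column, value in row.items():
            # Skip the ID column and any overflow fields (key None).
            if column is None or column == id_column:
                continue
            # Empty or missing cells (sparse rows) are not counted.
            if value and value.strip():
                total += 1
    return total


def loader_total_records(status_body):
    """Pull totalRecords out of a parsed loader status response.

    Assumes the layout payload -> overallStatus -> totalRecords;
    returns None if that path is absent.
    """
    overall = status_body.get("payload", {}).get("overallStatus", {})
    return overall.get("totalRecords")
```

With a dense file of 100 rows and 6 columns (ID, label, 4 properties) the estimate comes to 500, matching the example above. Sparse rows lower the estimate by one for each empty cell.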

Updated 2023-03-20

As of engine release 1.2.1.0, Amazon Neptune provides a Summary API that can be used to query various graph metadata, including node and edge counts. The Graph-Notebook project also offers a %summary line magic that can be used to invoke the API.
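
As a rough sketch of calling the Summary API outside a notebook, the Python below builds the property graph summary URL and extracts the node and edge counts from the response. The path `/propertygraph/statistics/summary` and the `numNodes` / `numEdges` fields under `payload.graphSummary` are taken from my reading of the Neptune documentation and should be verified against your engine version. The sketch also assumes IAM authentication is disabled (otherwise the request needs SigV4 signing) and that the code runs from inside the cluster's VPC. The function names are hypothetical.

```python
import json
import urllib.request


def summary_url(endpoint, port=8182):
    """Build the property graph summary URL for a cluster endpoint."""
    return f"https://{endpoint}:{port}/propertygraph/statistics/summary?mode=basic"


def extract_counts(response_body):
    """Return (node_count, edge_count) from a parsed summary response.

    Assumes the counts live at payload -> graphSummary; missing
    fields come back as None rather than raising.
    """
    summary = response_body.get("payload", {}).get("graphSummary", {})
    return summary.get("numNodes"), summary.get("numEdges")


def fetch_counts(endpoint, port=8182, timeout=30):
    """Call the Summary API and return (node_count, edge_count)."""
    with urllib.request.urlopen(summary_url(endpoint, port), timeout=timeout) as resp:
        return extract_counts(json.load(resp))
```

Calling `fetch_counts("your-cluster-endpoint")` should then return the two counts. Since the summary is derived from computed statistics, the numbers may lag behind a bulk load that is still in progress.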

Upvotes: 2
