Reputation: 1414
I am looking for the shortest and longest paths to connect two labels. I am experimenting with the following Gremlin Python code.
from gremlin_python.driver import client
from gremlin_python.process.traversal import __, T, P, Vertex
# Initialize the Gremlin client
client = client.Client('ws://localhost:8182/gremlin', 'g')
try:
# Query to find paths from "City" to "Person"
paths = client.submitAsync(
g.V()
.hasLabel("City")
.repeat(__.bothE().otherV().simplePath())
.until(__.hasLabel("Person"))
.limit(100)
.path()
.toList()
).result()
for path in paths:
relation_types = [v.label for v in path.objects if isinstance(v, Vertex)]
print(relation_types)
data.append(len(relation_types))
if data:
print(f"Shortest Path length = {min(data) - 1}")
print(f"Longest Path length = {max(data) - 1}")
else:
print(f"Shortest Path length = 0")
print(f"Longest Path length = 0")
except Exception as e:
print(f"An error occurred: {e}")
finally:
client.close()
We are currently observing that with a limit of 100, the shortest path length is 2 and the longest path length is 3. When we increase the limit to 1000, the longest path length rises to 5. However, if we remove the limit entirely, it causes the connection to crash and results in an "evaluation timeout" error.
Is there a specific function in TinkerPop that I may be overlooking for finding the shortest and longest paths? Alternatively, is there an algorithm that I could use to achieve this?
Upvotes: 0
Views: 26
Reputation: 1566
Finding shortest and longest paths requires a full search of the graph. Depending on the size of your graph this can take a lot of time, see here for a scalability example of the related problem of connected components.
If your Tinkerpop server supports it, you can also try the OLAP shortest_path
step, which scales slightly better than the OLTP traversal from your code.
Upvotes: 0