Reputation: 1
I imported my data with joern, but when I try to produce PDG diagrams, the queries get slower and slower as my for loop runs. The code I use for the query is below. I profiled it with cProfile and found that this function is the bottleneck.
def getUSENodesVar(db, func_id):
    query = "g.v(%s).out('USE').code" % func_id
    ret = db.runGremlinQuery(query)
    if ret == []:
        return False
    else:
        return ret
How can I speed up these queries?
Upvotes: 0
Views: 49
Reputation: 67019
It is inefficient to make a separate Gremlin query per func_id for multiple func_ids.
Instead, your function should take a list of func_ids and make a single Gremlin query that returns a collection of distinct code values. For example:
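def getUSENodesVar(db, func_ids):
    # Turn [1, 2, 3] into "1, 2, 3" so it can be spliced into the query text.
    func_ids_str = str(func_ids).replace('[', '').replace(']', '')
    # Interpolate the id string into the query; a single traversal covers all ids.
    query = "g.V().hasId(within(%s)).out('USE').values('code').dedup()" % func_ids_str
    return db.runGremlinQuery(query)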
Since joern's runGremlinQuery only takes a query argument (and does not also take a parameters argument), this function converts the input list (func_ids) into a string (func_ids_str) that is spliced into the query text, so that the Gremlin API sees a list of ids when there is more than one.
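To make the string conversion concrete, here is a minimal sketch that only builds the query text and never touches the database. The helper name build_use_query is hypothetical and not part of joern; it uses join rather than the replace calls, which gives the same result for a plain list of integer ids.

```python
def build_use_query(func_ids):
    """Build one Gremlin query string covering every id in func_ids.

    build_use_query is a hypothetical helper for illustration, not a joern API.
    """
    # Join the ids with commas so within() receives them as separate arguments.
    ids_str = ", ".join(str(i) for i in func_ids)
    return "g.V().hasId(within(%s)).out('USE').values('code').dedup()" % ids_str


if __name__ == "__main__":
    # Print the query text so it can be inspected before sending it to the server.
    print(build_use_query([12, 34, 56]))
```

Printing the query before running it is a cheap way to check that the ids really were interpolated into the text.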
Of course, the client of getUSENodesVar will also have to be changed to pass it a list of ids, and to handle the returned list of codes.
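The change on the calling side might look roughly like the sketch below. FakeDB is a stand-in that exists only so the example runs without a joern server, and collect_use_codes is a hypothetical name for the client code; the one thing assumed about the real connection object is that it exposes runGremlinQuery(query), as in the question.

```python
class FakeDB:
    """Stand-in for the joern connection; it records queries instead of running them."""

    def __init__(self):
        self.queries = []

    def runGremlinQuery(self, query):
        self.queries.append(query)
        return ["x", "y"]  # canned result, purely illustrative


def getUSENodesVar(db, func_ids):
    # One query for all ids, following the batched approach described above.
    ids_str = ", ".join(str(i) for i in func_ids)
    query = "g.V().hasId(within(%s)).out('USE').values('code').dedup()" % ids_str
    return db.runGremlinQuery(query)


def collect_use_codes(db, func_ids):
    """Hypothetical client: replaces a per-id loop with a single batched call."""
    if not func_ids:
        # Skip the round trip entirely when there is nothing to look up.
        return []
    return getUSENodesVar(db, list(func_ids))


if __name__ == "__main__":
    db = FakeDB()
    codes = collect_use_codes(db, [1, 2, 3])
    print(len(db.queries), codes)
```

Note that the batched function returns a list (possibly empty) rather than False, so any caller that previously tested for False needs to test for an empty list instead.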
Upvotes: 0