Kevin Cohen
Kevin Cohen

Reputation: 1351

Handle Neo4j Injection for dynamic labels and relationships

I have been usingWithParams with neo4j client for c# but withParams don't work for label or relationship types for example.

The alternative I thought of the moment was to concatenate my string that I want to form based on some label as parameter and then construct the cypher query. That is:

string optionalMatchString = $"p =(n1)-[{relationshipsString}]-(n2)";
graphClient.Cypher.Match("(n1)")
           .Where((Node n1) => n1.Identifier == identifier)
           .OptionalMatch(optionalMatchString)

As you can guess, relationshipsString is a parameter passed to me. If I use WithParams the query will not substitute the parameters so for now I concatenate the string but this is vulnearble to attacks ... (yes?)

I learned about APOC and saw this issue

This is an example I saw this:

CALL db.labels() yield label
call apoc.cypher.run("MATCH (n:`"+label+"`) RETURN keys(n) as keys LIMIT 1",{}) yield value as row
RETURN label, row.keys as keys

Apparently, there is an APOC procedure called cypher.run in which I can put my labels (or relationships for that matter) as variables (from the parameters) but from what I see, they are just concatenating a string... so is that the same as what I have been doing? or does APOC somehow perform other stuff on top of the query? Would that APOC procedure be "safe" against injections?

Upvotes: 0

Views: 1069

Answers (2)

Charlotte Skardon
Charlotte Skardon

Reputation: 6280

EDIT

There is the risk of the db being wiped (see the comments below from @Gabor) but I would still go down the route of an Enum to solve this. However, Gabor's answer is better from that pov.


APOC is doing the same thing you are doing, so you won't get any benefits from using it. I think the key thing here is to look at the level of protection you need.

Taking your query, (I've made a guess at the RETURN bit) which boils down to:

MATCH (n) WHERE n.Id = {p0} OPTIONAL MATCH (n)-[:{injectionPoint}]->(n1) RETURN n, n1

Let's look at what we could get from this, assuming the attacker knows the following information:

  • The format you are returning (i.e. the info you get from n and n1)
  • How your query is structured, i.e. know that you are returning a node called n1

They could try adding another query, i.e. injecting something like:

"ORIGINAL]->(n1) RETURN n1; MATCH (n) RETURN labels(n);"

But that won't work for two reasons, one the Cypher endpoint you're calling will only accept one query at a time, and two, the response from MATCH (n) RETURN labels(n) won't match the response you're currently getting and so will error.

The biggest risk you have is with someone being able work out your model by passing in different relationship types until they get responses - this would take ages though, so probably not such an issue.

The attacker does have to know the info above to get this out though, without that, they'll just be stabbing in the dark. And if they do have that info - then they probably already know what they could get from your DB.

If you took what you're doing and put it into a SQL context, you'd be asking for something from any table, with an ID, and then by providing a table name, looking for anything that connects to. Would you do that in SQL? Also - with no labels, you are scanning the entire database for this info, that's going to be super slow.

I think you'd do well to add labels, at least to one of the nodes, if not both, and use something like an Enum to represent your allowed relationships - I appreciate that's impractical for large numbers.

Upvotes: 1

Gabor Szarnyas
Gabor Szarnyas

Reputation: 5057

If you have a predefined set of relationships in your database, you can just whitelist the relationship type, i.e. check if it is a member of the possible relationship type.

If that does not work you for some reason, you can still pass the label as a parameter and use the labels method in Cypher. However, this will be detrimental to the performance, as the database will query all nodes and check their types with a filtering operation.

I know this is a C# question, but let me share some Java code to reproduce this:

GraphDatabaseService gds = new TestGraphDatabaseFactory()
    .newImpermanentDatabaseBuilder().newGraphDatabase();
ApocHelper.registerProcedure(gds, Cypher.class);
gds.execute("CREATE (:MyLabel {attr1: 'hello'})");
Result result = gds.execute(
    "CALL apoc.cypher.run(" +
        "'MATCH (n) WHERE $label1 IN labels(n) RETURN keys(n) as keys LIMIT 1', " +
        "{label1: $label2}" +
    ") " +
    "YIELD value AS row " +
    "RETURN row",
    ImmutableMap.of("label2", "MyLabel")
);
System.out.println(result.next());

This returns:

{row={keys=[attr1]}}

But again, this is an academic solution and I cannot really think of a use case where this would be the best.

(In parallel to this question, there is a discussion in an APOC issue as well.)

Upvotes: 1

Related Questions