Reputation: 355
I've been playing with Neo4j for a genealogy site and it's worked great!
I've run into a snag where finding the starting node isn't as easy. Looking through the docs and posts online, I haven't seen anything that hints at this, so maybe it isn't possible.
What I would like to do is pass in a list of genders and from that list follow a specific path through the nodes to get a single node.
In the context of a family:
I want to get my mother's father's mother's mother. I have my own ID, so I would start there and traverse four nodes from mine.
So the pseudo-query would be:
select person (follow childof relationship)
where starting node is me
where firstNode.gender == female
AND secondNode.gender == male
AND thirdNode.gender == female
AND fourthNode.gender == female
Upvotes: 1
Views: 870
Reputation: 11735
Focusing on the general solution:
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
Here's how it works:
nodes(p) returns all the nodes in the path, including the starting node
tail(nodes(p)) skips the first element of the list, i.e. the starting node, so now we only have the ancestors
extract() extracts the genders of all the ancestor nodes, i.e. it transforms the list of ancestor nodes into their genders
However, I don't think it will be faster than the explicit solution, though the performance could remain comparable. On my small test data (just 5 nodes), the general solution does 26 DB accesses whereas the specific solution only does 22, as reported by PROFILE. Further profiling would be needed on a larger database to compare the performances:
PROFILE MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
The general solution has the advantage of being a single query which won't need to be parsed again by the Cypher engine, whereas each generated query will need to be parsed.
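To make the list comparison concrete, here is a minimal Python model of what the WHERE clause checks for a single candidate path. It is only an illustration of the logic, not how Neo4j evaluates the query; the node dictionaries, the gender strings, and the function name path_matches are made up for the example.

```python
def path_matches(path_nodes, genders):
    """Mimic the WHERE clause for a single candidate path.

    path_nodes: list of node dicts, starting node first (like nodes(p)).
    genders: expected genders, nearest ancestor first (like {genders}).
    """
    # length(p) counts relationships, i.e. one less than the number of nodes
    if len(path_nodes) - 1 != len(genders):
        return False
    ancestors = path_nodes[1:]  # tail(nodes(p)): drop the starting node
    # extract(x in ... | x.gender) = {genders}: element-wise list equality
    return [node["gender"] for node in ancestors] == genders


# Hypothetical path: me -> my mother -> her father
path = [
    {"name": "me", "gender": "male"},
    {"name": "mother", "gender": "female"},
    {"name": "grandfather", "gender": "male"},
]
```

Calling path_matches(path, ["female", "male"]) accepts this path, while a list of a different length or with a different order rejects it.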
Upvotes: 2
Reputation: 11735
I'm not sure if you want a generic query which can work whatever the collection of genders you pass, or a specific solution.
Here's the specific solution: you match the path with the wanted length, and match each gender, as you've already noted in your own answer.
MATCH (me:Person)-[:IS_CHILD_OF]->(p1:Person)
-[:IS_CHILD_OF]->(p2:Person)
-[:IS_CHILD_OF]->(p3:Person)
-[:IS_CHILD_OF]->(p4:Person)
WHERE me.uuid = {uuid}
AND p1.gender = {genders}[0]
AND p2.gender = {genders}[1]
AND p3.gender = {genders}[2]
AND p4.gender = {genders}[3]
RETURN p4
Now, if you want to pass in a list of genders of an arbitrary length, it's actually possible. You match a variable-length path, make sure it has the right length (matching the number of genders), then match each gender in sequence.
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND all(i IN range(0, size({genders}) - 1)
WHERE {genders}[i] = extract(x in tail(nodes(p)) | x.gender)[i])
RETURN ancestor
Building on @InverseFalcon's answer, you can actually compare collections, which simplifies the query:
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
Upvotes: 1
Reputation: 30407
The equivalent query would look something like this:
MATCH (me:Person)
WHERE me.ID = ?
WITH me
MATCH (me)-[r:childof*4]->(ancestor:Person)
WITH ancestor, EXTRACT(rel IN r | endNode(rel).gender) AS genders
WHERE genders = ?
RETURN ancestor
Disclaimer: I haven't double-checked the syntax.
In Neo4j you typically find your start node first, usually by an ID of some sort (modify as required to match on a unique property). We then traverse a number of relationships to an ancestor, extract the gender property of all end nodes in the traversed relationships, and compare the genders to the expected list of genders (you'll need to make sure the argument is a bracketed list in the desired order).
Note that this approach first finds every possible result at that degree of childof relationship and then filters them down, as opposed to pruning while walking your graph, so the higher the degree of ancestry you're querying, the slower the call will get.
I'm also unsure if you can parameterize the degree of the variable relationship, so that might prevent this from being a generalized solution for any degree of ancestry.
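The difference between filtering complete paths and pruning while walking can be sketched outside the database. The following Python is a toy in-memory model, not a description of Neo4j's execution strategy; the PARENTS and GENDER dictionaries and the function find_ancestors are invented for the illustration.

```python
# Toy data (hypothetical): child id -> list of parent ids, and id -> gender
PARENTS = {
    "me": ["mum", "dad"],
    "mum": ["mum_father", "mum_mother"],
    "dad": ["dad_father", "dad_mother"],
}
GENDER = {
    "me": "male", "mum": "female", "dad": "male",
    "mum_father": "male", "mum_mother": "female",
    "dad_father": "male", "dad_mother": "female",
}


def find_ancestors(start, genders):
    """Follow child-of links one generation per entry in `genders`,
    discarding a branch as soon as a gender does not match."""
    frontier = [start]
    for wanted in genders:
        frontier = [
            parent
            for person in frontier
            for parent in PARENTS.get(person, [])
            if GENDER.get(parent) == wanted
        ]
    return frontier
```

With this data, find_ancestors("me", ["female", "male"]) keeps only the mother after the first step, so the father's side is never expanded.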
Upvotes: 1
Reputation: 355
It was simpler than I thought. Maybe there is still a better way, so I'll leave this open for a bit.
The query would be:
MATCH (n1:Person { Id: 'f59c40de-506d-4829-a765-7a3ae94af8d1' })
<-[:CHILDOF]-(n2 { Gender:'0'})
<-[:CHILDOF]-(n3 { Gender:'1'})
<-[:CHILDOF]-(n4 { Gender:'1'})
RETURN n4
For each generation further back, I would add a new row to the pattern.
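Adding one row per generation can be done in client code. Below is a rough Python sketch that builds a query of the same shape from a list of gender codes. The function name build_ancestor_query is hypothetical, the relationship direction and property names are copied from the query above, and the start ID is left as a parameter written {id} to match the style used elsewhere in this thread (newer Neo4j versions write parameters as $id).

```python
def build_ancestor_query(genders):
    """Build a fixed-length Cypher pattern with one row per generation.

    genders: gender codes as strings, in the order the rows should appear.
    The start node ID is left as a query parameter named `id` (an assumption).
    """
    if not genders:
        raise ValueError("need at least one gender")
    lines = ["MATCH (n1:Person { Id: {id} })"]
    for i, gender in enumerate(genders, start=2):
        # Codes are inlined as literals, so refuse anything that could break quoting
        if "'" in gender or "\\" in gender:
            raise ValueError("unexpected character in gender code")
        lines.append(f"<-[:CHILDOF]-(n{i} {{ Gender:'{gender}'}})")
    # Return the last node in the chain
    lines.append(f"RETURN n{len(genders) + 1}")
    return "\n".join(lines)


query = build_ancestor_query(["0", "1", "1"])
```

Each distinct list of genders produces a different query string, so the database has to parse each one separately.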
Upvotes: 1