Reputation: 77
Very similar to the question posted here
I have the following nodes: Article and Word. Each word is connected to an article by a MENTIONED relationship.
I need to query all articles that have a given set of words in common, where the list of words is dynamic. From the client's perspective, I pass in a list of words and expect back the articles that mention all of those words.
The following query does the job
WITH ["orange", "apple"] as words
MATCH (w:Word)<-[:MENTIONED]-(a:Article)-[:MENTIONED]->(w2:Word)
WHERE w.name IN words AND w2.name IN words
RETURN a, w, w2
but it does not work when the word list contains only one word. How can I make it handle any number of words? Is there a better way to do this?
Upvotes: 2
Views: 1852
Reputation: 30417
Yes. There are two approaches I can think of:
1. Find all articles that mention any of those words, then keep only the articles where the number of matched words equals the number of words in your word list.
2. Get the :Word nodes for the given list of words, then find the articles that mention all of them.
Here's an example graph to test this on:
// Run once on an empty database. CREATE is used because MERGE accepts only a single pattern, not a comma-separated list.
CREATE (a1:Article {name:'a1'}),
(a2:Article {name:'a2'}),
(a3:Article {name:'a3'})
CREATE (w1:Word {name:'orange'}),
(w2:Word {name:'apple'}),
(w3:Word {name:'pineapple'}),
(w4:Word {name:'banana'})
CREATE (a1)-[:MENTIONED]->(w1),
(a1)-[:MENTIONED]->(w2),
(a1)-[:MENTIONED]->(w3),
(a1)-[:MENTIONED]->(w4),
(a2)-[:MENTIONED]->(w1),
(a2)-[:MENTIONED]->(w4),
(a3)-[:MENTIONED]->(w1),
(a3)-[:MENTIONED]->(w2),
(a3)-[:MENTIONED]->(w3)
Approach 1, comparing the wordlist size to the number of words mentioned in the article, looks like this:
WITH ["orange", "apple"] as words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
WITH words, article, COUNT(word) as wordCount
WHERE wordCount = SIZE(words)
RETURN article
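This works only if there is at most one :MENTIONED relationship between an article and a given word, no matter how many times that word appears in the article. With duplicate relationships, COUNT(word) would count the same word more than once and the comparison with SIZE(words) would fail.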
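If an article can have several :MENTIONED relationships to the same word, one possible adjustment is to count distinct word nodes instead. This is only a sketch; it assumes that each word name belongs to exactly one :Word node and that the input list contains no duplicate entries:

```cypher
WITH ["orange", "apple"] AS words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
// DISTINCT collapses repeated relationships to the same :Word node
WITH words, article, COUNT(DISTINCT word) AS wordCount
// Assumes no duplicate names in the input list
WHERE wordCount = SIZE(words)
RETURN article
```

On the example graph above, this should return a1 and a3, since a2 mentions orange but not apple.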
Approach 2 uses ALL() on the collection of :Word nodes to ensure that we match only articles in which every word is mentioned:
WITH ["orange", "apple"] AS words
MATCH (word:Word)
WHERE word.name IN words
// Collect the matched :Word nodes into a list
WITH COLLECT(word) AS wordNodes
MATCH (article:Article)
// Keep only articles that have a :MENTIONED relationship to every collected node
WHERE ALL (w IN wordNodes WHERE (w)<-[:MENTIONED]-(article))
RETURN article
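Since the word list comes from the client, it can be passed as a query parameter rather than a literal. The sketch below assumes a parameter named $words (a name chosen here for illustration), the $param syntax of recent Neo4j versions, unique word names, and no duplicates in the list. It also adds a guard, because COLLECT() returns a shorter (or empty) list when some requested word has no :Word node, and ALL() over an empty list is true for every article:

```cypher
MATCH (word:Word)
WHERE word.name IN $words
WITH COLLECT(word) AS wordNodes
// Guard: return nothing if any requested word has no :Word node
WHERE SIZE(wordNodes) = SIZE($words)
MATCH (article:Article)
WHERE ALL (w IN wordNodes WHERE (w)<-[:MENTIONED]-(article))
RETURN article
```

In Neo4j Browser, the parameter can be set beforehand with :param words => ["orange", "apple"].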
You can try using PROFILE with each of these to figure out which works best with your data set.
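For example, prefixing a query with PROFILE runs it and shows the execution plan along with the rows and db hits per operator. A sketch using Approach 1:

```cypher
PROFILE
WITH ["orange", "apple"] AS words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
WITH words, article, COUNT(word) AS wordCount
WHERE wordCount = SIZE(words)
RETURN article
```

Profiling Approach 2 the same way and comparing the total db hits gives a rough comparison. The numbers depend on the size of your data and on whether the name property of :Word is indexed.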
Upvotes: 4