Reputation: 2929
With a data model that looks similar to this:
(u:User)-[:POSTED]->(p:Post {created_at: 123)-[:ABOUT]->(t:Topic {name: "Blue")
What is the best way to find distinct count of users who posted/created a post with {created_at: 123} AND also don't have a post with {created_at: 124} about topic "Blue".
Closest I can get is to collect ids and then exclude them but that doesn't scale when you have a lot of nodes (millions).
[EDITED]
I also need the created_at
times to be specifiable as ranges.
Upvotes: 0
Views: 151
Reputation: 67009
This query allows you to specify created_at
ranges. In this example, the desirable range is [123..130], and the undesirable "Blue" range is [131..140]. In your actual query, the range endpoints should be specified by parameters.
MATCH (user:User)-[:POSTED]->(p1:Post)
WHERE 123 <= p1.created_at <= 130
WITH user
OPTIONAL MATCH (user)-[:POSTED]->(p2:Post)-[:ABOUT]->(:Topic{name:"Blue"})
WHERE 131 <= p2.created_at <= 140
WITH user, p2
WHERE p2 IS NULL
RETURN COUNT(DISTINCT user) AS userCount;
The OPTIONAL MATCH
clause is there to match the undesirable "Blue" paths, and the WHERE p2 IS NULL
clause will filter out user
nodes that have any such paths.
Upvotes: 1
Reputation: 30397
Assuming you have an index on Post.created_at
and Topic.name
(for speed), this should work:
MATCH (user:User)-[:POSTED|CREATED]->(:Post{created_at:123})
WHERE NOT EXISTS ( (user)-[:POSTED|CREATED]->(:Post{created_at:124})-[:ABOUT]->(:Topic{name:"Blue"}) )
RETURN count(DISTINCT user) as userCount
It's worth profiling that query, and if it isn't using the created_at and name indexes, supply that to the query with USING INDEX
after your MATCH
.
Upvotes: 0