Reputation: 17676
I have the following graph in Neo4j:
CREATE (Alice:Person {id:'a', fraud:1})
CREATE (Bob:Person {id:'b', fraud:0})
CREATE (Charlie:Person {id:'c', fraud:0})
CREATE (David:Person {id:'d', fraud:0})
CREATE (Esther:Person {id:'e', fraud:0})
CREATE (Fanny:Person {id:'f', fraud:0})
CREATE (Gabby:Person {id:'g', fraud:0})
CREATE (Fraudster:Person {id:'h', fraud:1})
CREATE
(Alice)-[:CALL]->(Bob),
(Bob)-[:SMS]->(Charlie),
(Charlie)-[:SMS]->(Bob),
(Fanny)-[:SMS]->(Charlie),
(Esther)-[:SMS]->(Fanny),
(Esther)-[:CALL]->(David),
(David)-[:CALL]->(Alice),
(David)-[:SMS]->(Esther),
(Alice)-[:CALL]->(Esther),
(Alice)-[:CALL]->(Fanny),
(Fanny)-[:CALL]->(Fraudster)
The question "neo4j percentage of attribute for social network" shows how to easily calculate the fraud percentage of a social network:
MATCH (:Person)-[:CALL|:SMS]->(f:Person)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs
UNWIND fs AS f
WITH divisor, f
WHERE f.fraud = 1
RETURN f, COUNT(*)/divisor AS percentage
How can I modify this to handle the different relationship types separately while still requiring only a single pass over the graph? That is, I want something more efficient than simply running the following 3 statements:
MATCH (:Person)-[:CALL]->(f:Person)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs
UNWIND fs AS f
WITH divisor, f
WHERE f.fraud = 1
RETURN f, COUNT(*)/divisor AS percentage
MATCH (:Person)-[:SMS]->(f:Person)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs
UNWIND fs AS f
WITH divisor, f
WHERE f.fraud = 1
RETURN f, COUNT(*)/divisor AS percentage
MATCH (:Person)-[:CALL|:SMS]->(f:Person)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs
UNWIND fs AS f
WITH divisor, f
WHERE f.fraud = 1
RETURN f, COUNT(*)/divisor AS percentage
Instead, I would like a single query that returns percentage_total, percentage_sms, and percentage_phone.
Upvotes: 0
Views: 61
Reputation: 5047
If you would like to keep the results together, you need to chain the queries using WITH and pass along the f variable for the person. Unfortunately, you also have to keep passing all percentage_* variables in all WITH clauses, so it gets quite difficult to maintain:
MATCH (f:Person)
OPTIONAL MATCH (:Person)-[:CALL|:SMS]->(f)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs
UNWIND fs AS f
WITH divisor, f
WHERE f.fraud = 1
WITH f, COUNT(*)/divisor AS percentage_all
OPTIONAL MATCH (:Person)-[:CALL]->(f)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs, percentage_all
UNWIND fs AS f
WITH divisor, f, percentage_all
WHERE f.fraud = 1
WITH f, percentage_all, COUNT(*)/divisor AS percentage_phone
OPTIONAL MATCH (:Person)-[:SMS]->(f)
WITH TOFLOAT(COUNT(*))/100 AS divisor, COLLECT(f) AS fs, percentage_all, percentage_phone
UNWIND fs AS f
WITH divisor, f, percentage_all, percentage_phone
WHERE f.fraud = 1
RETURN f, percentage_all, percentage_phone, COUNT(*)/divisor AS percentage_sms
The openCypher project has proposed nested subqueries, but it will take some time for them to make it into Neo4j.
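A possible single-pass alternative is to bind the relationship to a variable and split the counts by type(r) with conditional aggregation. The following is only a sketch and has not been tested against a particular Neo4j version. The names r, rows, kind, div_* and n_* are introduced here for illustration, and percentage_phone is assumed to mean the CALL relationships:

```cypher
// Match every CALL or SMS relationship once and remember its type.
// (The CALL|SMS form is used here; older versions also accept CALL|:SMS.)
MATCH (:Person)-[r:CALL|SMS]->(f:Person)
WITH collect({f: f, kind: type(r)}) AS rows
// Derive the three divisors from the collected rows, without a second MATCH.
WITH rows,
     toFloat(size(rows)) / 100 AS div_total,
     toFloat(size([x IN rows WHERE x.kind = 'CALL'])) / 100 AS div_phone,
     toFloat(size([x IN rows WHERE x.kind = 'SMS'])) / 100 AS div_sms
UNWIND rows AS row
WITH row, div_total, div_phone, div_sms
WHERE row.f.fraud = 1
// Aggregate per fraudulent person; the divisors are explicit grouping keys.
WITH row.f AS f, div_total, div_phone, div_sms,
     count(*) AS n_total,
     sum(CASE WHEN row.kind = 'CALL' THEN 1 ELSE 0 END) AS n_phone,
     sum(CASE WHEN row.kind = 'SMS' THEN 1 ELSE 0 END) AS n_sms
RETURN f,
       n_total / div_total AS percentage_total,
       n_phone / div_phone AS percentage_phone,
       n_sms / div_sms AS percentage_sms
```

On the sample graph above there are 11 relationships (6 CALL, 5 SMS), so this should report roughly 9.09 / 16.67 / 0 for both Alice and the node with id 'h'. If one of the relationship types does not occur at all, its divisor is 0.0 and the corresponding column is not a meaningful percentage, so that case would need separate handling.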
Upvotes: 2