Reputation: 17676
I have a graph in Neo4j with the following vertices:
person:ID,name,value:int,:LABEL
1,Alice,1,Person
2,Bob,0,Person
3,Charlie,0,Person
4,David,0,Person
5,Esther,0,Person
6,Fanny,0,Person
7,Gabby,0,Person
8,XXXX,1,Person
and edges:
:START_ID,:END_ID,:TYPE
1,2,call
2,3,text
3,2,text
6,3,text
5,6,text
5,4,call
4,1,call
4,5,text
1,5,call
1,8,call
6,8,call
6,8,text
8,6,text
7,1,text
which I imported into Neo4j like this:
DATA_DIR_SAMPLE=/data_network/
$NEO4J_HOME/bin/neo4j-admin import --mode=csv \
--database=graph.db \
--nodes:Person ${DATA_DIR_SAMPLE}/vertices.csv \
--relationships ${DATA_DIR_SAMPLE}/edges.csv
When I query the graph like this:
MATCH (source:Person)-[*1]-(destination:Person)
RETURN source.name, source.value, avg(destination.value), 'undir_1_any' as type
UNION ALL
MATCH (source:Person)-[*2]-(destination:Person)
RETURN source.name, source.value, avg(destination.value), 'undir_2_any' as type
the graph is traversed multiple times. In addition, since I want to obtain a table like:
Vertex | value | type_undir_1_any | type_undir_2_any
Alice | 1 | 0.2 | 0
an additional aggregation step (pivot/reshape) would be required.
In the future, I would like to add the following patterns
Is there a better way to combine the queries?
Upvotes: 1
Views: 97
Reputation: 29172
You need to aggregate by path length, computing the average for each length with a hand-written expression:
MATCH p = (source:Person)-[*1..2]-(destination:Person)
WITH
  length(p) as L, source, destination
RETURN
  source.name as Vertex,
  source.value as value,
  1.0 *
    sum(CASE WHEN L = 1 THEN destination.value ELSE 0 END) /
    sum(CASE WHEN L = 1 THEN 1 ELSE 0 END) as type_undir_1_any,
  1.0 *
    sum(CASE WHEN L = 2 THEN destination.value ELSE 0 END) /
    sum(CASE WHEN L = 2 THEN 1 ELSE 0 END) as type_undir_2_any
Or a more elegant version that uses a function from the APOC library to compute the average over a collection:
MATCH p = (source:Person)-[*1..2]-(destination:Person)
RETURN
  source.name as Vertex,
  source.value as value,
  apoc.coll.avg(COLLECT(
    CASE WHEN length(p) = 1 THEN destination.value ELSE NULL END
  )) as type_undir_1_any,
  apoc.coll.avg(COLLECT(
    CASE WHEN length(p) = 2 THEN destination.value ELSE NULL END
  )) as type_undir_2_any
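A further sketch, not run against this dataset: Cypher's built-in avg() skips null values, so the same pivot can probably be written without APOC and without the manual sum/count division. A CASE expression with no ELSE branch yields null for rows of the other path length, and avg() then ignores those rows. The column names are the ones from the question.

```cypher
// Single traversal over paths of length 1 and 2 (same pattern as above).
MATCH p = (source:Person)-[*1..2]-(destination:Person)
WITH source, destination, length(p) AS L
RETURN
  source.name AS Vertex,
  source.value AS value,
  // CASE without ELSE returns null when L <> 1; avg() ignores nulls.
  avg(CASE WHEN L = 1 THEN destination.value END) AS type_undir_1_any,
  // Same idea for paths of length 2.
  avg(CASE WHEN L = 2 THEN destination.value END) AS type_undir_2_any
```

If a person has no paths of a given length, avg() returns null for that column rather than a number. Adding another pattern later should only need one more avg(CASE ...) column, plus a wider range in the MATCH if the pattern is longer.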
Upvotes: 1