Gary Frewin

Reputation: 565

Cypher virtual nodes to aggregate many to many relationships

I have 8000 Author nodes and 2000 book nodes (labeled Paper in the queries below). Each author has a country property, and each book can be co-authored by many authors.

I am trying to show which countries have collaborated on books by grouping all authors from a country into a single virtual node labeled Country. I then need a virtual relationship between each pair of countries, carrying the number of times an author from country A has published something with an author from country B.

I have a few queries that get partway there, but none produces quite what I want. Here is what I have tried and the outcome:

MATCH (a1:Author)-->(p:Paper)<--(a2:Author)
WHERE a1.country <> a2.country
WITH apoc.create.vNode(['Country'], {name: a1.country}) as Country1,  apoc.create.vNode(['Country'], {name: a2.country}) as Country2, count(p) as numCollabs
RETURN DISTINCT Country1, Country2, apoc.create.vRelationship(Country1, 'COLLABORATED_WITH', {numCollabs: numCollabs}, Country2) LIMIT 25

This shows many duplicate countries (I assume one per author rather than one per country), and numCollabs on each relationship is always 1.

I also tried this one:

MATCH (a: Author)
WITH a.country as country, count(*) as count
RETURN apoc.create.vNode(['Country'], {name: country, authors: count}) as countries

This gives me the correct number of countries and shows the number of authors in each country. However, I cannot figure out how to then create the vRelationship that shows how many times each country has worked with each other country.

Upvotes: 0

Views: 390

Answers (1)

Nathan Smith

Reputation: 881

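Try this. I adapted the example here to your data. It creates one virtual node per distinct country first, indexes those nodes by name, and then looks them up when building the relationships, so each country appears only once. Comparing with < instead of <> in the WHERE clause returns each pair of countries once rather than once in each direction.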

MATCH (a:Author)
WITH collect(DISTINCT a.country) AS countries
WITH [cName IN countries | apoc.create.vNode(['Country'], {name: cName})] AS countryNodes
WITH apoc.map.groupBy(countryNodes, 'name') AS countries
MATCH (a1:Author)-->(p:Paper)<--(a2:Author)
WHERE a1.country < a2.country
WITH a1.country AS countryName1,
     a2.country AS countryName2,
     count(DISTINCT p) AS numCollabs,
     countries
RETURN countries[countryName1] AS country1,
       countries[countryName2] AS country2,
       apoc.create.vRelationship(countries[countryName1], 'COLLABORATED_WITH', {numCollabs: numCollabs}, countries[countryName2]) AS rel
LIMIT 25
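
If you also want the author count from your second query on each country node, the same pattern should extend to it. The following is only a sketch under a few assumptions: every Author has a non-null country property, the authors property name is carried over from your query, and the aliases name1, name2 and rel are just illustrative names. I have not run it against your data.

```cypher
// Count authors per country and build one virtual node per country.
MATCH (a:Author)
WITH a.country AS country, count(*) AS authors
WITH collect(apoc.create.vNode(['Country'], {name: country, authors: authors})) AS countryNodes
// Index the virtual nodes by their name property so they can be looked up later.
WITH apoc.map.groupBy(countryNodes, 'name') AS countries
// Aggregate on the plain country strings, not on the virtual nodes.
MATCH (a1:Author)-->(p:Paper)<--(a2:Author)
WHERE a1.country < a2.country
WITH a1.country AS name1, a2.country AS name2, count(DISTINCT p) AS numCollabs, countries
RETURN countries[name1] AS country1,
       countries[name2] AS country2,
       apoc.create.vRelationship(countries[name1], 'COLLABORATED_WITH', {numCollabs: numCollabs}, countries[name2]) AS rel
```

The reason for aggregating on the country names is that each call to apoc.create.vNode returns a new virtual node. If the virtual nodes themselves are used as grouping keys, as in your first query, every row forms its own group, which would explain both the duplicate countries and numCollabs always being 1.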

Upvotes: 1
