Morten V. Gade
Morten V. Gade

Reputation: 89

Neo4j: Duplicate in match

I have the Cypher query:

match(p:Product)--(stock:Stock) 
where p.StyleNumber = "Z94882A" and p.Color = "Black" and stock.Retailer = "11" 
with stock as stock, p as p  
optional match(p)-->(s:Sale) 
where s.Date = 20170801 and s.Retailer = "11"  
return p,stock, s

In the match I get duplicate stock results back:

Duplicate in Size XL

The result contains two stock entries for Size:XL (see picture), since it has two sales. But in the first query match(p:Product)--(stock:Stock) only returns one product node with just one stock node attached :-/. See this graph for illustration:

Graph for Size XL

If i sum the sales and stock, I still get duplicates in the result:

match(p:Product)--(stock:Stock) 
where p.StyleNumber = "Z94882A" and p.Color = "Black" and stock.Retailer = "11" 
with stock as stock, p as p  
optional match(p)-->(s:Sale) 
where s.Date = 20170801 and s.Retailer = "11"  
return p, sum(stock.Stockcount) as onstock, sum(s.Quantity) as sold

Which gives for XL:

{"Size":"XL", "Color":"Black", "Supplier":"1", "Id":"6322", "StyleNumber":"Z94882A"}
| Stock =10 (should have been 5)       │ Sold =2 (is correct) 

The question is how to avoid duplicates in this scenario?

Upvotes: 2

Views: 813

Answers (1)

InverseFalcon
InverseFalcon

Reputation: 30397

You'll need to aggregate in stages, since the :Stock nodes are independent of the :Sale nodes. Try aggregating early, so you avoid creating cartesian products between them.

match(p:Product)--(stock:Stock) 
where p.StyleNumber = "Z94882A" and p.Color = "Black" and stock.Retailer = "11" 
with sum(stock.Stockcount) as onstock, p
optional match(p)-->(s:Sale) 
where s.Date = 20170801 and s.Retailer = "11"  
return p, onstock, sum(s.Quantity) as sold

Upvotes: 1

Related Questions