Programmer
Programmer

Reputation: 1294

Extending layers of nodes in neo4j graph

I have two node types "Person" and "Company". There is also a relationship called "Shareholder", which can be from Person to Company or Company to Company (as companies can also hold shares in other companies). The Shareholder relationship has three properties "shareholding_start_date", "shareholding_end_date", "number_of_shares".

I have a query which takes in Company name (call it 'ABC'), as well as provide a date (e.g. '2021-07-23'), and the query returns all Person/Company nodes that are shareholders of c at the date '2021-07-23' (i.e. shareholding_start_date < 2021-07-23 < shareholding_end_date). It also calculates the percentage based on 'number_of_shares'.

The working query is below:

MATCH (c:Company {name:”ABC”})<-[r:Shareholder]-(s)
WHERE R.shareholding_start_date <= Date(“2021-07-23”) 
   AND r.shareholding_end_date >= DATE(“2021-07-23”)
WITH c, 
collect(s) as shareholders, 
collect(r) as rels, 
sum(r.number_of_shares) AS totalShares
UNWIND range(0, size(rels) - 1) AS I
RETURN shareholders[i].name AS shareholder, 
100.0 * rels[i].number_of_shares/totalShares as percentTotalShares

I am looking to extend this so that it also shows shareholders of shareholders, up to a certain layer limit which can be defined in the query (let's say 3, for example). So in the above example, the query will return all shareholders of company 'ABC', which may be "Person" or "Company" node types. In the case where the shareholders of 'ABC' are "Company" nodes, I basically want to repeat the same code on them and list all their Shareholders, along with the shareholding % in that company. This should continue until we reach the limit (i.e. 3).

Upvotes: 0

Views: 81

Answers (1)

cybersam
cybersam

Reputation: 67009

Assumptions

  1. The date constraint is applied to all relationships in every path to the specified company ("ABC").
  2. The end date property does not exist if shares are still currently held.
  3. If an entity sells only a portion of its shares from a given purchase, the end date property is added, the number of shares is reduced to the number sold, and the remaining shares are used to create a new relationship with the same start date. (This is not implemented in my answer, since the use case does not sell any shares).
  4. The date constraint is also enforced when calculating totalShares for a company.
  5. A company can hold some of its own shares.
  6. You pass the target date and company name in the parameters $targetDate and $company.

Data model

I simplified the names in the data model for easier reading:

(:Person|Company {name})-[:HOLDS {start, end, shares}]->(:Company {name})

Cypher query

Here is one way to implement your use case:

WITH DATE($targetDate) AS targetDate

// Find all paths up to length 3 involving the specified company, on the target date.
MATCH ()-[rs:HOLDS*..3]->(:Company {name: $company})
WHERE ALL(r IN rs WHERE r.start <= targetDate AND (r.end IS NULL OR targetDate <= r.end))

// Collect the distinct HOLD relationships in the paths for each company.
WITH targetDate, ENDNODE(rs[0]) AS c, COLLECT(DISTINCT rs[0]) AS rels

// Calculate total number of shares owned by each company in the paths, on the target date
UNWIND rels AS rel
WITH c, rels, SUM(rel.shares) AS totalShares

// Return a company, a shareholder, and the percentage of that company owned by the shareholder on the target date.
UNWIND rels AS rel
WITH totalShares, c.name AS company, STARTNODE(rel).name AS holder, SUM(rel.shares) AS shares
RETURN company, holder, 100.0*shares/totalShares AS pct

Test data

CREATE (p1:Person {name: 'John'}),
       (p2:Person {name: 'Dave'}),
       (p3:Person {name: 'Alice'})

CREATE (c1:Company {name: 'Comp1'}),
       (c2:Company {name: 'Comp2'}),
       (c3:Company {name: 'Comp3'}),
       (c4:Company {name: 'ABC'})

CREATE (p1)-[:HOLDS {start: date('2018-01-01'), end: date('2022-12-31'), shares: 1000}]->(c1)
CREATE (c2)-[:HOLDS {start: date('2020-01-01'), end: date('2025-12-31'), shares: 750}]->(c1)
CREATE (p3)-[:HOLDS {start: date('2019-01-01'), end: date('2022-12-31'), shares: 800}]->(c3)
CREATE (c1)-[:HOLDS {start: date('2016-07-01'), end: date('2023-12-31'), shares: 800}]->(c4)

// Dave bought 500 shares of Comp1, and later bought 250 more.
// Then he sold 125 shares from the second batch on '2023-06-30' and still has the remaining 125 shares.
CREATE (p2)-[:HOLDS {start: date('2020-01-01'), end: date('2023-12-31'), shares: 500}]->(c1)
CREATE (p2)-[:HOLDS {start: date('2023-01-01'), end: date('2023-06-30'), shares: 125}]->(c1)
CREATE (p2)-[:HOLDS {start: date('2023-01-01'), shares: 125}]->(c1)

// ABC and Comp2 own shares of each other
CREATE (c2)-[:HOLDS {start: date('2017-01-01'), end: date('2023-12-31'), shares: 700}]->(c4)
CREATE (c4)-[:HOLDS {start: date('2016-07-01'), shares: 500}]->(c2)

// Comp1 holds (and continues to hold) some of its own shares
CREATE (c1)-[:HOLDS {start: date('2021-01-01'), shares: 500}]->(c1)

Results

Using $targetDate = "2023-07-21", and $company = "ABC":

╒═══════╤═══════╤══════════════════╕
│company│holder │pct               │
╞═══════╪═══════╪══════════════════╡
│"ABC"  │"Comp2"│46.666666666666664│
├───────┼───────┼──────────────────┤
│"ABC"  │"Comp1"│53.333333333333336│
├───────┼───────┼──────────────────┤
│"Comp2"│"ABC"  │100.0             │
├───────┼───────┼──────────────────┤
│"Comp1"│"Comp2"│40.0              │
├───────┼───────┼──────────────────┤
│"Comp1"│"Dave" │33.333333333333336│
├───────┼───────┼──────────────────┤
│"Comp1"│"Comp1"│26.666666666666668│
└───────┴───────┴──────────────────┘

Matching graph

Here is a graph of the paths found by the above query's first MATCH/WHERE pair.

enter image description here

Here is the visualization query (a minor variant of the first 3 clauses of above query):

WITH DATE($targetDate) AS targetDate
MATCH (h)-[rs:HOLDS*..3]->(c:Company {name: $company})
WHERE ALL(r IN rs WHERE r.start <= targetDate AND (r.end IS NULL OR targetDate <= r.end))
RETURN *

Upvotes: 1

Related Questions