sharjeel

Reputation: 6005

Neo4j graph database design for combining data from different sources

I am trying to analyze data from two different running systems by modeling them as a graph. Many users exist in both systems but may have different properties: e.g. some users have omitted their middle names or prefixes in one system but not in the other.

For my analysis, I'd like to consolidate users across the systems so that each person is logically one user when I look at their relationships. I'm not sure how I should store them. Should I store two different nodes and logically group them together during my queries? Or should I store them as one node in the first place? How would I store varying properties, like slight variations in the name of the same person?

Upvotes: 1

Views: 117

Answers (1)

sbhatt

Reputation: 485

I think it would be better to create only one node for each user, since they are not different users, and this way you can take advantage of the property graph model.

You can create one property per source system for each attribute that varies. For the name, for example, each user node would have two properties:
name-system1 and name-system2.
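As a rough sketch of how that could look in practice, the Python below merges records from the two systems into one property map per user and pairs it with a Cypher statement that upserts one node per user. It assumes the two systems share a reliable identifier, here a hypothetical `email` field; the question does not say how users are matched, so treat that as a placeholder. The function name, the `User` label, and the sample data are also made up for illustration. The properties are written as `name_system1` and `name_system2` because a hyphenated property key would have to be quoted with backticks in Cypher.

```python
def consolidate(system1_users, system2_users):
    """Merge per-system records into one property map per user.

    Assumption: every record has an 'email' that identifies the same
    person in both systems, and a 'name' that may differ between them.
    """
    merged = {}
    sources = (("system1", system1_users), ("system2", system2_users))
    for source, users in sources:
        for user in users:
            # Normalize the matching key so trivial differences do not split a user.
            key = user["email"].strip().lower()
            node = merged.setdefault(key, {"email": key})
            # Keep each system's spelling of the name in its own property.
            node[f"name_{source}"] = user["name"]
    return list(merged.values())


# Cypher that creates or updates one :User node per consolidated record.
UPSERT_USERS = """
UNWIND $users AS user
MERGE (u:User {email: user.email})
SET u += user
"""

# Made-up sample data for illustration only.
users = consolidate(
    [{"email": "Jane@example.org", "name": "Dr. Jane Q. Doe"}],
    [{"email": "jane@example.org", "name": "Jane Doe"},
     {"email": "bob@example.org", "name": "Bob Roe"}],
)
```

With the official Neo4j Python driver, the statement could then be run as something like `session.run(UPSERT_USERS, users=users)`. A user who appears in only one system simply ends up with one of the two name properties.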

Creating two different nodes for the same user and connecting them with a relationship means every query that treats them as one person has to traverse that extra relationship, which can hurt query performance. So it is better to go with one node per user.

I hope it helps!

Upvotes: 1
