Create unique nodes and make labels with multiple values

Question

I'm new to Neo4j, so my question may look very simple. I have a CSV file with the following structure (columns id and Fam):

Id is a Persons ID, and Fam is a project ID, where this person took part. I want to connect persons if they worked on the same project.

What is the best data model in this case? The first thing that came to mind is to make each id a node and use Fam as a label, but I don't know how to load multiple labels onto one node. The second option is to make both id and Fam nodes, and then write a query to show related employees.

For the second case, the code would look like this:

LOAD CSV WITH HEADERS FROM 'file:///PNG20161202.csv' AS line
MERGE (n:id {Person_id: toInt(line.id)})
WITH line, n
MERGE (m:Fam {Fam_id: toInt(line.Fam)})
WITH m,n
MERGE (n)-[:WORK_IN]->(m);

But I don't know how to display only the related ids. (I need to export this network and visualize it in Gephi, with id nodes only.)

For the first case, I know how to create relationships between ids, but I don't know how to write a LOAD CSV query that creates id nodes with multiple labels.

Any suggestions are much appreciated.

InverseFalcon · Accepted Answer

I believe you're thinking of this too much from the perspective of tables and your current data, so you're missing the bigger picture of what you want to model. With graph databases, it's easier to think in terms of entities (important "things" that you are modeling) and the relationships between them.

This, I think, was the most important part of your description:

"Id is a Persons ID, and Fam is a project ID, where this person took part. I want to connect persons if they worked on the same project."

The important "things" you mention are Persons and Projects. So it seems to me that these are the labels you should be working with, :Person and :Project. IDs tend to be unique, so they should likely be properties on :Person and :Project nodes, with unique constraints for the labels and ID properties.

You can set up your unique constraints like so:

CREATE CONSTRAINT ON (p:Person)
ASSERT p.ID IS UNIQUE;

CREATE CONSTRAINT ON (pr:Project)
ASSERT pr.ID IS UNIQUE;
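
If you are on a newer Neo4j release, note that the constraint syntax above was later replaced. A sketch of the equivalent, assuming Neo4j 5.x (the constraint names person_id_unique and project_id_unique are made up for illustration):

```cypher
// Assumes Neo4j 5.x; the constraint names are illustrative, not from the original answer.
CREATE CONSTRAINT person_id_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.ID IS UNIQUE;

CREATE CONSTRAINT project_id_unique IF NOT EXISTS
FOR (pr:Project) REQUIRE pr.ID IS UNIQUE;
```

Either way, a unique constraint also creates an index on the property, which keeps the MERGE lookups during the import fast.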

Your import then only needs to connect :Person nodes to the :Project nodes they worked on.

LOAD CSV WITH HEADERS FROM 'file:///PNG20161202.csv' AS line
MERGE (n:Person {ID: toInt(line.id)})
MERGE (m:Project {ID: toInt(line.Fam)})
MERGE (n)-[:WORKED_ON]->(m);
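
In recent Neo4j versions toInt() has been replaced by toInteger(). A sketch of the same import under that assumption, using the file name and column names from the question:

```cypher
// Same import as above, assuming a Neo4j version where toInt() is no longer available.
LOAD CSV WITH HEADERS FROM 'file:///PNG20161202.csv' AS line
// One :Person per distinct id, one :Project per distinct Fam.
MERGE (n:Person {ID: toInteger(line.id)})
MERGE (m:Project {ID: toInteger(line.Fam)})
// MERGE on the relationship avoids duplicates if a row repeats.
MERGE (n)-[:WORKED_ON]->(m);
```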

Once you have this, it should be easy to query for :Persons who worked on the same :Project, and you don't need a LOAD CSV for that.
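
For example, a query along these lines should list the co-workers of one person. This is only a sketch: the ID value 1 is a placeholder, not a value from your file.

```cypher
// Co-workers of the person with ID 1 (placeholder value),
// found by walking through the shared :Project nodes.
MATCH (p:Person {ID: 1})-[:WORKED_ON]->(pr:Project)<-[:WORKED_ON]-(coworker:Person)
RETURN coworker.ID AS coworker, collect(DISTINCT pr.ID) AS shared_projects
ORDER BY coworker;
```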

EDIT

For creating :KNOWS relationships between :Persons who worked on the same :Projects, you can use this query:

MATCH (p1:Person)-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(p2:Person)
WITH DISTINCT p1, p2
MERGE (p1)-[:KNOWS]-(p2)
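
Since the goal is a person-only network for Gephi, a query like the one below could produce the edge list once the :KNOWS relationships exist. It is a sketch: the Source and Target column names follow the convention Gephi uses for edge tables, and how you save the result as CSV (for instance from the Neo4j Browser) is up to you.

```cypher
// One row per pair of connected persons.
// p1.ID < p2.ID keeps each undirected pair once instead of twice.
MATCH (p1:Person)-[:KNOWS]-(p2:Person)
WHERE p1.ID < p2.ID
RETURN p1.ID AS Source, p2.ID AS Target;
```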
