Ken Papagno

Reputation: 361

How do I refactor data from two neo4j nodes into a relationship?

I'm experimenting with a graph database (neo4j). I have two CSVs that I imported into a neo4j datastore. I'm a little shaky on the neo terminology, so forgive me. Let's say I have:

Customer (AccountNumber, CustomerName) and CustomerGroups (AccountNumber, GroupName).

  1. I would like to create a new Node called groups which is comprised of the distinct GroupName from CustomerGroups. I'll call it Group.

  2. I then want to create relationships "HAS_GROUP" from Customer to Group using the common AccountNumber from CustomerGroups.

  3. Once the above is completed, I could delete CustomerGroups as it's no longer needed.

I'm just stuck at the syntax. I can get the distinct groups from CustomerGroups with:

MATCH (n:CustomerGroups) RETURN DISTINCT n.GROUP_NAME

and I get back about 50 distinct groups, but I can't figure out how to apply a CREATE statement to the results, something like CREATE (g:Group {GroupName: n.GROUP_NAME})

I know my follow-up question will be how to MATCH customers to the new groups using the old table's common account numbers.

FYI: I've indexed AccountNumber on both labels. Both Customer and CustomerGroups have over 5 million nodes. Not bad for a laptop (2 min to import using neo4j-import). I was impressed!

Thanks for any help you can give!

Upvotes: 1

Views: 70

Answers (1)

Brian Underwood

Reputation: 10856

Instead of importing CustomerGroups as nodes under their own label, you should be able to define the relationships you want directly in your neo4j-import run. It would certainly be a lot faster too. See:

http://neo4j.com/docs/stable/import-tool-header-format.html

To your question, you could probably do something like:

MATCH (cg:CustomerGroups)
MATCH (customer:Customer {AccountNumber: cg.AccountNumber}), (group:Group {GroupName: cg.GroupName})
CREATE (customer)-[:IN_GROUP]->(group)
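The query above assumes the Group nodes already exist. A minimal sketch for creating them first, assuming the label is CustomerGroups as in the question and the property is stored as GroupName (the question's own query uses GROUP_NAME, so adjust to whatever your import actually produced):

```cypher
// Collect each distinct group name once, then create one Group node per name.
// MERGE keeps this safe to re-run: an existing Group is matched, not duplicated.
MATCH (cg:CustomerGroups)
WITH DISTINCT cg.GroupName AS name
MERGE (:Group {GroupName: name})
```

With roughly 50 distinct names this only creates about 50 nodes, but it still scans every CustomerGroups node to find them.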

You'd definitely want to make sure you have indexes on :Customer(AccountNumber) and :Group(GroupName) first. But even then it would still be much slower than doing it as part of your initial import.
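For reference, those indexes could be created roughly like this. This is the older index syntax that matches the docs linked above; newer Neo4j releases use the form CREATE INDEX FOR (n:Label) ON (n.property) instead, so check which one your version accepts. Run each statement separately:

```cypher
// Index used to look up each Customer by the account number on the CustomerGroups row.
CREATE INDEX ON :Customer(AccountNumber);

// Index used to look up each Group by name.
CREATE INDEX ON :Group(GroupName);
```

Indexes populate in the background, so it is worth waiting until they are online before running the relationship query.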

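Also, you may or may not want MERGE instead of CREATE. MERGE only creates the relationship if it doesn't already exist, so re-running the query won't produce duplicates.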
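A sketch of the MERGE variant followed by the cleanup step from the question, assuming the CustomerGroups label and the GroupName property (both taken from the question's description, not verified against your data). The batch size of 100000 is an arbitrary illustrative value. Run each statement separately:

```cypher
// Link each customer to its group; MERGE skips pairs that are already linked.
MATCH (cg:CustomerGroups)
MATCH (customer:Customer {AccountNumber: cg.AccountNumber})
MATCH (group:Group {GroupName: cg.GroupName})
MERGE (customer)-[:IN_GROUP]->(group);

// Remove the now-redundant nodes in batches; repeat until nothing is deleted.
// Plain DELETE is enough here only if these nodes have no relationships.
MATCH (cg:CustomerGroups)
WITH cg LIMIT 100000
DELETE cg;
```

With over 5 million rows, doing the MERGE in a single transaction may need a lot of memory, so batching that step as well could be necessary.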

Upvotes: 0
