Ger

Reputation: 754

Having trouble with indexing in Neo4j

I have a dataset with the following details:

I've batch imported the data into the database on my machine (64 bit, 16 core, 16 GB RAM) using https://github.com/jexp/batch-import/tree/20

I'm trying to index these nodes on SU_SUBSCRIBER_ID, but I'm not really sure what I'm doing.

I ran

start n = node(*) set n:Subscribers

My understanding is that this adds the Subscribers label to every node. (Is this correct?)

Next I ran

create index on :Subscribers(SU_SUBSCRIBER_ID)

I think this should create an index on the property 'SU_SUBSCRIBER_ID' for all nodes with the 'Subscribers' label. (Correct?)

Now when I go to neo4j-sh and run schema, I get:

neo4j-sh (?)$ schema
==> Indexes
==>   ON :Subscribers(SU_SUBSCRIBER_ID) ONLINE  
==> 
==> No constraints

But when I run the following, it says there are no indexes set for the nodes:

neo4j-sh (?)$ index --indexes
==> Node indexes:
==> 
==> Relationship indexes: 

I have a few questions

  1. Do I have to tell it to index the existing data? If so, how do I do that?
  2. How can I then use the index? I've read through the documentation but I had a bit of trouble following it.
  3. It looks like I can have the indexes set up when I run the batch import script, but I can't really understand how... could someone explain please?

Here's an example of my data:

Nodes.txt

id  SU_SUBSCRIBER_ID  CU_FIRST_NAME  gender   SU_AGE
0   123456            Ann            F        56
1   832746            ?              UNKNOWN  -1
2   546765            Tom            UNKNOWN  -1
3   768345            Anges          F        72
4   267854            Aoibhlinn      F        38

rels.csv

start  end  rel        counter
0      3    CONTACTED  2
1      2    CONTACTED  1
1      4    CONTACTED  1
3      2    CONTACTED  2
4      1    CONTACTED  1

Upvotes: 0

Views: 56

Answers (1)

Michael Hunger

Reputation: 41676

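schema is the right command to look at. The index command in the shell only lists legacy (manual) indexes, which is why it shows nothing for your label index.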

Cypher uses the label indexes automatically for MERGE and MATCH.
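
As a rough sketch for the data in the question: since schema reports the index as ONLINE, the existing nodes have already been indexed, and any MATCH that names the label and tests the indexed property for equality can be answered from it. The literal is quoted on the assumption that the batch importer stored SU_SUBSCRIBER_ID as a string (the header gives no type); drop the quotes if it was imported as a number. The USING INDEX hint in the second query is optional.

```cypher
// Index lookup: label plus equality on the indexed property.
// Assumes SU_SUBSCRIBER_ID was imported as a string.
MATCH (n:Subscribers)
WHERE n.SU_SUBSCRIBER_ID = '123456'
RETURN n;

// Same lookup with an explicit index hint (optional).
MATCH (n:Subscribers)
USING INDEX n:Subscribers(SU_SUBSCRIBER_ID)
WHERE n.SU_SUBSCRIBER_ID = '123456'
RETURN n;
```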

With the Java Core API you'd use db.findNodesByLabelAndProperty(label, property, value).

You did the right thing, except for one detail: you could have created the labels on the nodes during the batch import.

Just add an l:label column to your CSV file containing a comma-separated list of labels per node, as shown in the readme on that branch.
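
For the nodes file in the question, that might look roughly like the following. This is a sketch rather than a tested import: the columns are tab-separated, the position of the l:label column is an arbitrary choice, and the label name Subscribers is taken from the question.

```text
id	l:label	SU_SUBSCRIBER_ID	CU_FIRST_NAME	gender	SU_AGE
0	Subscribers	123456	Ann	F	56
1	Subscribers	832746	?	UNKNOWN	-1
2	Subscribers	546765	Tom	UNKNOWN	-1
```

With the labels set at import time, the separate set n:Subscribers pass is no longer needed; the create index statement still has to be run once.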

Upvotes: 1
