Having trouble with indexing in Neo4j

Question

I have a dataset with the following details:

1.4 million nodes
2.9 million relationships
15 million properties (including gender, name, subscriber_id etc)
1 relationship type (CONTACTED)

I've batch imported the data into the database on my machine (64 bit, 16 core, 16 GB RAM) using https://github.com/jexp/batch-import/tree/20

I'm trying to index these nodes on the subscriber ID property (SU_SUBSCRIBER_ID), but I'm not really sure what I'm doing.

I ran

start n = node(*) set n:Subscribers

My understanding is that this adds the label 'Subscribers' to every node. Is this correct?
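
One way to check this might be to count the nodes that carry the label. This is only a minimal sketch, assuming the label name 'Subscribers' from the statement above. If the label was applied to every node, the count should match the total node count (about 1.4 million here).

```cypher
// Count the nodes that carry the Subscribers label.
MATCH (n:Subscribers)
RETURN count(n);
```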

Next I ran

create index on :Subscribers(SU_SUBSCRIBER_ID)

I think this should create an index on the property 'SU_SUBSCRIBER_ID' for all nodes with the 'Subscribers' label. Is that correct?
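
A lookup that such an index is meant to serve could look like the sketch below. It assumes the property is named SU_SUBSCRIBER_ID, as in the schema output and sample data in this question, and uses the sample value 123456. Whether the value has to be written as a string or as a number depends on how the property was typed during import, which is an assumption here.

```cypher
// Look up one subscriber by ID via the label and the indexed property.
// Assumption: the ID was imported as a string; drop the quotes if it is numeric.
MATCH (n:Subscribers)
WHERE n.SU_SUBSCRIBER_ID = '123456'
RETURN n;
```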

Now when I go to neo4j-sh and run schema, I get:

neo4j-sh (?)$ schema
==> Indexes
==>   ON :Subscribers(SU_SUBSCRIBER_ID) ONLINE  
==> 
==> No constraints

But when I run the following, it says there are no indexes set for the nodes:

neo4j-sh (?)$ index --indexes
==> Node indexes:
==> 
==> Relationship indexes:

I have a few questions:

Do I have to tell it to index the existing data? If so how do I do that?
How can I then use the index? I've read through the documentation but I had a bit of trouble following it.
It looks like I can have the indexes set up when I run the batch import script, but I can't really understand how... could someone explain please?

Here's an example of my data:

Nodes.txt

id  SU_SUBSCRIBER_ID  CU_FIRST_NAME  gender   SU_AGE
0   123456            Ann            F        56
1   832746            ?              UNKNOWN  -1
2   546765            Tom            UNKNOWN  -1
3   768345            Anges          F        72
4   267854            Aoibhlinn      F        38

rels.csv

start  end  rel        counter
0      3    CONTACTED  2
1      2    CONTACTED  1
1      4    CONTACTED  1
3      2    CONTACTED  2
4      1    CONTACTED  1
