Reputation: 1
I am new to using Neo4j and have setup a test graph db in neo4j for organizing some click stream data with a very small subset of what we actually use on a day to day basis. This graph has about 23 million nodes and 34 million relationships. The queries seem to be taking forever to run i.e. I haven't seen the response come back even after waiting for more than 30 mins.
The data is organized as Year->Month->Day->Session{1..n}->Event{1..n}
I am running the db on a Windows 7 machine with 1.5 gb of heap allocated to Neo4j server
These are the configurations in the neo4j-wrapper.conf
wrapper.java.additional.1=-Dorg.neo4j.server.properties=conf/neo4j-server.properties
wrapper.java.additional.2=-Djava.util.logging.config.file=conf/logging.properties
wrapper.java.additional.3=-Dlog4j.configuration=file:conf/log4j.properties
wrapper.java.additional.6=-XX:+UseParNewGC
wrapper.java.additional.7=-XX:+UseConcMarkSweepGC
wrapper.java.additional.8=-Xloggc:data/log/neo4j-gc.log
wrapper.java.initmemory=1500
wrapper.java.maxmemory=1500
This is what my query looks like
START n=node(3)
MATCH (n)-[:HAS]->(s)
WITH distinct s
MATCH (s)-[:HAS]->(e) WHERE e.page_name = 'Login'
WITH s.session_id as session, e
MATCH (e)-[:FOLLOWEDBY*0..1]->(e1)
WITH count(session) as session_cnt, e.page_name as startPage, e1.page_name as nextPage
RETURN startPage, nextPage, session_cnt
Also i have these properties set
node_auto_indexing=true
node_keys_indexable=name,page_name,geo_country
relationship_auto_indexing=true
Can anyone help me to figure out what might be wrong.
Even when I run portions of the query it takes 10-15 minutes before I can see a response.
Note: I have no other applications running on the Windows Machine
Upvotes: 0
Views: 263
Reputation: 192
Why would you want to return all the nodes in the first place?
If you really want to do that, use the transactional http endpoint and curl to stream the response:
I tested it with a database of 100k nodes. It takes 0.9 seconds to transfer them (1.5MB) over the wire. If you transfer all their properties by using "return n", it takes 1.4 seconds and results in 4.1MB transferred.
If you just want to know how many nodes are in your db. use something like this instead:
match (n) return count(*);
Upvotes: 0