Reputation: 57
I finally conquer the phase of importing nodes. Now I am trying to import relationships. There might be 1B relationships.
#!/bin/bash
cd /home/luning/neo4j-enterprise-2.2.0-RC01-unix/neo4j-enterprise-2.2.0-RC01/bin
users="/data/weibo/user-header.csv"
for i in /data/weibo/users/*
do
users=$users,$i
done
edges=/data/weibo/edge-header.csv,/data/weibo/ego/000000_0
./neo4j-import --stacktrace --into ../data/weibo_bak.db --nodes:User $users --relationships:Follow $edges --delimiter TAB --quote \' --bad-tolerance 50000 --id-type STRING
But there always says node missing. Unintelligibly, with importing same file for two trials, it gave me different missing node. 1. First Time
source: /data/weibo/ego/000000_0:1807199
startNode: 1587438071
endNode: 2414878813
type: Follow
refering to missing node 1587438071
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
source: /data/weibo/ego/000000_0:1807199
startNode: 1587438071
endNode: 2414878813
type: Follow
refering to missing node 1587438071
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
source: /data/weibo/ego/000000_0:1807199
startNode: 1587438071
endNode: 2414878813
type: Follow
refering to missing node 1587438071
at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:56)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)
2. Second Time
source: /data/weibo/ego/000000_0:1844245
startNode: 3492922617
endNode: 1589699375
type: Follow
refering to missing node 1589699375
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
source: /data/weibo/ego/000000_0:1844245
startNode: 3492922617
endNode: 1589699375
type: Follow
refering to missing node 1589699375
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
source: /data/weibo/ego/000000_0:1844245
startNode: 3492922617
endNode: 1589699375
type: Follow
refering to missing node 1589699375
at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:59)
at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)
But for these two nodes 1587438071 and 2765561213, I can make sure their are in my files. Because i can find them.
[luning@pinnacle data]$ grep 1587438071 /data/weibo/users/*
/data/weibo/users/000024_0:1587438071 琬童沛胜 浙江 杭州 http://tp4.sinaimg.cn/1587438071/50/40024579617/0 f 147 60 272 false LV2 31 一举成名| 正常 80 2014-02-17 04:17:38
[luning@pinnacle data]$ grep 1589699375 /data/weibo/users/*
/data/weibo/users/000010_0:1589699375 在行动Isabella 吉林 http://tp4.sinaimg.cn/1589699375/50/5633181098/0 女 297 438 4729 1981-01-17 false LV7 2014-08-13 21:43:34 2014-01-28 10:18:52
So, anybody who can figure it out how it would happen?
Upvotes: 0
Views: 503
Reputation: 3054
Could be that your node input file(s) contains fields that doesn't close their quotes properly, which would have some lines "eaten" by other lines, effectively not importing those nodes (if the alignment of the fields would happen to end up like that, otherwise throw exception). Or it could be something wrong with the parser in the face of these chinese characters.
Any chance you could share you input data with me (the main author of parser and the import tool) for investigation?
Upvotes: 1