Reputation: 376
I am using the opinions.csv data from the FossilWorks database for Synapsids. This is public data. The opinions are links between different taxa and the published works where the opinion is stated. The opinions express relationships like 'It is the scientific opinion of F. Ameghino, as published in "Première contribution à la connaissance de la faune mammalogique des couches à Pyrotherium" from 1889, that Astrapotherium ephebicum is a species of the genus Parastrapotherium.' In this example, Parastrapotherium is the 'parent taxon' and Astrapotherium ephebicum is the 'child taxon'. Other forms of opinion do not have a 'parent taxon', so this value is optional. The file uses the iso-8859-1 character set, so it was first converted using:
iconv -f iso-8859-1 -t utf-8 opinions.csv > opinions.txt
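As a quick sanity check of that conversion step, here is a minimal sketch run on a tiny sample rather than the real file. The file names sample_latin1.csv and sample_utf8.txt and the sample row are made up for illustration. It relies on iconv exiting with a non-zero status when its input is not valid in the declared source encoding.

```shell
# Write a two-line Latin-1 sample; \350 is the octal byte for "è" in ISO-8859-1.
# (Hypothetical data, not taken from the FossilWorks file.)
printf 'opinion_no|comments\n1|Premi\350re contribution\n' > sample_latin1.csv

# Same kind of conversion as above, applied to the sample.
iconv -f iso-8859-1 -t utf-8 sample_latin1.csv > sample_utf8.txt

# Re-decoding the result as UTF-8 fails if any byte sequence is invalid.
if iconv -f utf-8 -t utf-8 sample_utf8.txt > /dev/null; then
  echo "sample_utf8.txt is valid UTF-8"
fi
```

Running the same validation against the unconverted sample should fail, which is a cheap way to confirm which file is actually being handed to LOAD CSV.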
Using Neo4j Desktop (Version: 1.1.5), I can load the opinions and link them to their references and child taxa with:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///opinions.txt" AS row
FIELDTERMINATOR "|"
WITH row WHERE row.status = "belongs to"
MATCH (ref:Reference {reference_no: toInteger(row.reference_no)})
MATCH (child:Taxon {taxon_no: toInteger(row.child_no)})
MERGE (st:Subtaxon {opinion_no: toInteger(row.opinion_no)})
// FOREACH over a zero- or one-element list acts as a conditional:
// the property is only set when the field is not blank.
FOREACH (ignoreMe IN CASE WHEN trim(row.pages) = "" THEN [] ELSE [1] END | SET st.pages = row.pages)
FOREACH (ignoreMe IN CASE WHEN trim(row.comments) = "" THEN [] ELSE [1] END | SET st.comments = row.comments)
// Link the opinion to its reference and to the child taxon.
MERGE (ref)-[:DESCRIBES]->(st)
MERGE (st)-[:CHILD]->(child);
However, I was unable to add the parent nodes without getting the rather opaque "Neo.DatabaseError.General.UnknownError". This is the simplest query that will produce the error:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///opinions.txt" AS row
FIELDTERMINATOR "|"
WITH row WHERE row.status = "belongs to" AND row.parent_no IS NOT NULL
MATCH (parent:Taxon {taxon_no: toInteger(row.parent_no)})
MATCH (st:Subtaxon {opinion_no: toInteger(row.opinion_no)})
MERGE (st)-[:PARENT]->(parent);
I used csvstat to verify that parent_no is an integer. I'm not sure how to efficiently debug this error. Any suggestions would be greatly appreciated.
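For anyone wanting to repeat the parent_no check without csvkit, it can be sketched roughly as follows. The file sample_opinions.txt and its rows are hypothetical, and only the column names come from the queries above. The sketch splits naively on "|", so it assumes no field contains a quoted pipe character.

```shell
# Build a tiny pipe-delimited sample (hypothetical rows, same column names as the queries).
printf 'opinion_no|child_no|parent_no|status\n' > sample_opinions.txt
printf '1|10|20|belongs to\n2|11||belongs to\n3|12|2x|belongs to\n' >> sample_opinions.txt

# Find the parent_no column from the header row, then report every value
# that is neither empty nor a plain non-negative integer.
awk -F'|' '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == "parent_no") col = i; next }
  $col != "" && $col !~ /^[0-9]+$/ { printf "line %d: %s\n", NR, $col }
' sample_opinions.txt
```

On this sample only the row with "2x" is reported. Empty values are skipped on purpose, since parent_no is optional.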
Upvotes: 0
Views: 666
Reputation: 376
Logisima has the right clue. I updated Neo4j from 3.4.0 to 3.4.1 and was then able to load the data, so this looks like a bug that was resolved in 3.4.1.
Upvotes: 1