Reputation: 499
I have a JSON dataset of papers and the venues where they were published, like so:
{"id": "1", "title": "Paper1", "venue": {"raw": "Journal of Cell Biology"}}
{"id": "2", "title": "Paper2", "venue": {"raw": "Nature"}}
{"id": "3", "title": "Paper3", "venue": {"raw": "Journal of Histochemistry and Cytochemistry"}}
I want to create nodes for only the papers published in a certain journal, say Nature, and add a relationship between the paper node and an existing journal node. That is, I want to create nodes for only the lines of data with a certain value for the venue.raw key.
The code I am working with is below. I think I need to add some logic to the apoc.load.json part so that it only matches data where $.venue.raw == 'Nature':
CALL apoc.load.json('file:/example.txt', '$.venue.raw') YIELD value AS q
CREATE (p:Quanta {id:q.id, title:q.title})
WITH q, p
UNWIND q.venue as venue
MATCH (v:Venue {name: venue.raw})
CREATE (p)-[:PUBLISHED_IN_VENUE]->(v)
Is there a way to alter this so that it imports only the relevant data?
Any help would be greatly appreciated!
Upvotes: 1
Reputation: 4052
I assume the venues are already present in your database. If they are not, load them first or change the query to CREATE/MERGE them.
This query filters on the value provided. You can later change it to accept the venue as a parameter.
CALL apoc.load.json('file:/example.txt') YIELD value AS q
WHERE q.venue.raw="Nature"
CREATE (p:Quanta {id:q.id, title:q.title})
WITH p,q
MATCH (v:Venue {name: q.venue.raw})
CREATE (p)-[:PUBLISHED_IN_VENUE]->(v)
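As a rough sketch of the parameterized version mentioned above: the parameter name venueName is my own choice, not something from the question, and I am assuming it is set beforehand (for example with :param in Neo4j Browser). This variant also matches the venue before creating the paper, so no Quanta node is created when the Venue node is missing. That is a small change in behavior from the query above, so drop it if you don't want it.

```cypher
// Assumption: $venueName has been set first, e.g. in Neo4j Browser:
//   :param venueName => 'Nature'
CALL apoc.load.json('file:/example.txt') YIELD value AS q
// Keep only the rows whose venue.raw equals the supplied parameter
WITH q WHERE q.venue.raw = $venueName
// Look up the existing venue node before creating anything
MATCH (v:Venue {name: q.venue.raw})
CREATE (p:Quanta {id: q.id, title: q.title})
CREATE (p)-[:PUBLISHED_IN_VENUE]->(v)
```

If you run this from a driver instead, pass venueName in the query's parameter map rather than using :param.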
Upvotes: 1