Reputation: 57
I've got a couple hundred JSONs in a structure like the following example:
{
"JsonExport": [
{
"entities": [
{
"identity": "ENTITY_001",
"surname": "SMIT",
"entityLocationRelation": [
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_001"
},
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_002"
}
],
"entityEntityRelation": [
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "FRIENDS_WITH",
"childIdentification": "ENTITY_002"
}
]
},
{
"identity": "ENTITY_002",
"surname": "JACKSON",
"entityLocationRelation": [
{
"parentIdentification": "PARENT_ENTITY_002",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_001"
}
]
},
{
"identity": "ENTITY_003",
"surname": "JOHNSON"
}
],
"identification": "REGISTRATION_001",
"locations": [
{
"city": "LONDON",
"identity": "LOCATION_001"
},
{
"city": "PARIS",
"identity": "LOCATION_002"
}
]
}
]
}
With these JSON's, I want to make a graph consisting of the following nodes: Registration, Entity and Location. This part I've figured out and made the following:
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE(r:Registration {id:data.identification})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..locations.*" ) YIELD value AS locations
MERGE(l:Locations{identity:locations.identity, name:locations.city})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..entities.*" ) YIELD value AS entities
MERGE(e:Entities {name:entities.surname, identity:entities.identity})
All the entities and locations should have a relation with the registration. I thought I could do this by using the following code:
MERGE (e)-[:REGISTERED_ON]->(r)
MERGE (l)-[:REGISTERED_ON]->(r)
However this code doesn’t give the desired output. It creates extra "empty" nodes and doesn't connect to the registration node. So the first question is: How do I connect the location and entities nodes to the registration node. And in light of the other JSON's, the entities and locations should only be linked to the specific registration.
Furthermore, I would like to make the entity -> location relation and the entity - entity relation and use the given type of relation (SEEN_AT or FRIENDS_WITH) as label for the given relation. How can this be done? I'm kind of lost at this point and don’t see how to solve this. If someone could guide me into the right direction I would be much obliged.
Upvotes: 0
Views: 566
Reputation: 57
So playing with neo4j/cyper I've learned some new stuff and came to another solution for the problem. Based on the given example data, the following can create the nodes and edges dynamically.
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
CALL apoc.merge.node(['Registration'], {id:data.identification}, {},{}) YIELD node AS vReg
UNWIND data.entities AS ent
CALL apoc.merge.node(['Person'], {id:ent.identity}, {}, {id:ent.identity, surname:ent.surname}) YIELD node AS vPer1
UNWIND ent.entityEntityRelation AS entRel
CALL apoc.merge.node(['Person'],{id:entRel.childIdentification},{id:entRel.childIdentification},{}) YIELD node AS vPer2
CALL apoc.merge.relationship(vPer1, entRel.typeRelation, {},{},vPer2) YIELD rel AS ePer
UNWIND data.locations AS loc
CALL apoc.merge.node(['Location'], {id:loc.identity}, {name:loc.city}) YIELD node AS vLoc
UNWIND ent.entityLocationRelation AS locRel
CALL apoc.merge.relationship(vPer1, locRel.typeRelation, {},{},vLoc) YIELD rel AS eLoc
CALL apoc.merge.relationship(vLoc, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg1
CALL apoc.merge.relationship(vPer1, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg2
CALL apoc.merge.relationship(vPer2, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg3
RETURN vPer1,vPer2, vReg, vLoc, eLoc, eReg1, eReg2, eReg3
Upvotes: 0
Reputation: 67019
Variable names (like e
and r
) are not stored in the DB, and are bound to values only within individual queries. MERGE
on a pattern with an unbound variable will just create the entire pattern (including creating an empty node for unbound node variables).
When you MERGE a node, you should only specify the unique identifying property for that node, to avoid duplicates. Any other properties you want to set at the time of creation should be set using ON CREATE SET
.
It is inefficient to parse through the JSON data 3 times to get different areas of the data. And it is especially inefficient the way your query was doing it, since each subsequent CALL/MERGE
group of clauses would be done multiple times (since every previous CALL
produces multiple rows, and the number of rows increases multiplicative). You can use aggregation to get around that, but it is unnecessary in your case, since you can just do the entire query in a single pass through the JSON data.
This may work for you:
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE(r:Registration {id:data.identification})
FOREACH(ent IN data.entities |
MERGE (e:Entities {identity: ent.identity})
ON CREATE SET e.name = ent.surname
MERGE (e)-[:REGISTERED_ON]->(r)
FOREACH(loc1 IN ent.entityLocationRelation |
MERGE (l1:Locations {identity: loc1.locationIdentity})
MERGE (e)-[:SEEN_AT]->(l1))
FOREACH(ent2 IN ent.entityEntityRelation |
MERGE (e2:Entities {identity: ent2.childIdentification})
MERGE (e)-[:FRIENDS_WITH]->(e2))
)
FOREACH(loc IN data.locations |
MERGE (l:Locations{identity:loc.identity})
ON CREATE SET l.name = loc.city
MERGE (l)-[:REGISTERED_ON]->(r)
)
For simplicity, it hard-codes the FRIENDS_WITH
and REGISTERED_ON
relationship types, as MERGE
only supports hard-coded relationship types.
Upvotes: 2