Faaiz

Reputation: 685

Loading JSON into Neo4j using "apoc.load.json": incomplete data retrieval

I am trying to retrieve JSON data from a MongoDB collection and create nodes for it under a label in Neo4j. I am using the MongoDB REST API to retrieve the JSON. I followed the article "Polyglot Persistence with MongoDB and Neo4j" by William Lyon.

The problem is that I cannot load all of the data. The JSON documents have the following format:

> db.cves.findOne()
{
        "_id" : ObjectId("5a37226550eb46004dea39b0"),
        "vulnerable_configuration_cpe_2_2" : [
                "cpe:/o:bsdi:bsd_os:3.1",
                "cpe:/o:freebsd:freebsd:1.0",
                "cpe:/o:freebsd:freebsd:1.1",
                "cpe:/o:freebsd:freebsd:1.1.5.1",
                "cpe:/o:freebsd:freebsd:1.2",
                "cpe:/o:freebsd:freebsd:2.0",
                "cpe:/o:freebsd:freebsd:2.0.1",
                "cpe:/o:freebsd:freebsd:2.0.5",
                "cpe:/o:freebsd:freebsd:2.1.5",
                "cpe:/o:freebsd:freebsd:2.1.6",
                "cpe:/o:freebsd:freebsd:2.1.6.1",
                "cpe:/o:freebsd:freebsd:2.1.7",
                "cpe:/o:freebsd:freebsd:2.1.7.1",
                "cpe:/o:freebsd:freebsd:2.2",
                "cpe:/o:freebsd:freebsd:2.2.2",
                "cpe:/o:freebsd:freebsd:2.2.3",
                "cpe:/o:freebsd:freebsd:2.2.4",
                "cpe:/o:freebsd:freebsd:2.2.5",
                "cpe:/o:freebsd:freebsd:2.2.6",
                "cpe:/o:freebsd:freebsd:2.2.8",
                "cpe:/o:freebsd:freebsd:3.0",
                "cpe:/o:openbsd:openbsd:2.3",
                "cpe:/o:openbsd:openbsd:2.4"
        ],
        "impact" : {
                "integrity" : "NONE",
                "availability" : "PARTIAL",
                "confidentiality" : "NONE"
        },
        "vulnerable_configuration" : [
                "cpe:2.3:o:bsdi:bsd_os:3.1",
                "cpe:2.3:o:freebsd:freebsd:1.0",
                "cpe:2.3:o:freebsd:freebsd:1.1",
                "cpe:2.3:o:freebsd:freebsd:1.1.5.1",
                "cpe:2.3:o:freebsd:freebsd:1.2",
                "cpe:2.3:o:freebsd:freebsd:2.0",
                "cpe:2.3:o:freebsd:freebsd:2.0.1",
                "cpe:2.3:o:freebsd:freebsd:2.0.5",
                "cpe:2.3:o:freebsd:freebsd:2.1.5",
                "cpe:2.3:o:freebsd:freebsd:2.1.6",
                "cpe:2.3:o:freebsd:freebsd:2.1.6.1",
                "cpe:2.3:o:freebsd:freebsd:2.1.7",
                "cpe:2.3:o:freebsd:freebsd:2.1.7.1",
                "cpe:2.3:o:freebsd:freebsd:2.2",
                "cpe:2.3:o:freebsd:freebsd:2.2.2",
                "cpe:2.3:o:freebsd:freebsd:2.2.3",
                "cpe:2.3:o:freebsd:freebsd:2.2.4",
                "cpe:2.3:o:freebsd:freebsd:2.2.5",
                "cpe:2.3:o:freebsd:freebsd:2.2.6",
                "cpe:2.3:o:freebsd:freebsd:2.2.8",
                "cpe:2.3:o:freebsd:freebsd:3.0",
                "cpe:2.3:o:openbsd:openbsd:2.3",
                "cpe:2.3:o:openbsd:openbsd:2.4"
        ],
        "cvss" : 5,
        "references" : [
                "http://www.openbsd.org/errata23.html#tcpfix"
        ],
        "Modified" : ISODate("2010-12-16T00:00:00Z"),
        "summary" : "ip_input.c in BSD-derived TCP/IP implementations allows remote attackers to cause a denial of service (crash or hang) via crafted packets.",
        "cwe" : "CWE-20",
        "Published" : ISODate("1999-12-30T00:00:00Z"),
        "cvss-time" : ISODate("2004-01-01T00:00:00Z"),
        "access" : {
                "vector" : "NETWORK",
                "authentication" : "NONE",
                "complexity" : "LOW"
        },
        "id" : "CVE-1999-0001"
}
>

The following query runs without errors:

// Load the cves collection from the cvedb database in MongoDB as nodes labeled CVE
CALL apoc.load.json('http://127.0.0.1:28017/cvedb/cves/') YIELD value
UNWIND value.rows AS cveData
MERGE (c:CVE {_id: cveData._id['$oid']})
ON CREATE SET c.id = cveData.id, c.cvss = cveData.cvss

OUTPUT:

Added 1000 labels, created 1000 nodes, set 2970 properties, completed after 462 ms.

Question:

> db.cves.count();
99022

There are 99022 records in the collection. Why does my Cypher query create only 1000 nodes instead of 99022?

Thanks

Upvotes: 0

Views: 477

Answers (2)

Michael Hunger

Reputation: 41706

Can you provide the results of the following queries?

CALL apoc.load.json('http://127.0.0.1:28017/cvedb/cves/') YIELD value
RETURN count(*);

CALL apoc.load.json('http://127.0.0.1:28017/cvedb/cves/') YIELD value
UNWIND value.rows as cveData
RETURN count(*);


CALL apoc.load.json('http://127.0.0.1:28017/cvedb/cves/') YIELD value
UNWIND value.rows as cveData
RETURN count(distinct cveData._id['$oid']);

Also:

CALL apoc.load.json('http://127.0.0.1:28017/cvedb/cves/') YIELD value
UNWIND value.rows as cveData
RETURN cveData._id LIMIT 1;

Upvotes: 0

cybersam

Reputation: 67019

Some possible reasons why only 1000 nodes were created:

  1. MERGE would not create a new node if a matching one already existed. So maybe you already had some matching nodes.
  2. If multiple CVEs have the same cveData._id['$oid'] value, then at most one of those CVEs would create a new node.
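Both reasons can be illustrated with a small Python sketch that imitates how MERGE behaves on a single key. This is not Neo4j code: `merge_by_key` is a hypothetical helper, and the rows and `$oid` values are made up for illustration.

```python
def merge_by_key(existing_keys, rows, key):
    """Imitate MERGE on one key: create a node only when the key is not present yet.

    Returns the number of nodes that would be created.
    """
    nodes = set(existing_keys)
    created = 0
    for row in rows:
        k = key(row)
        if k not in nodes:
            nodes.add(k)
            created += 1
    return created


# Hypothetical rows shaped like the entries of value.rows in the question.
rows = [
    {"_id": {"$oid": "a1"}, "id": "CVE-0001"},
    {"_id": {"$oid": "a1"}, "id": "CVE-0002"},  # same $oid as the first row
    {"_id": {"$oid": "b2"}, "id": "CVE-0003"},
]


def get_oid(row):
    return row["_id"]["$oid"]


# Reason 2: duplicate keys in the input produce at most one node per key.
created_fresh = merge_by_key([], rows, get_oid)

# Reason 1: a node that already exists for a key is matched, not created again.
created_with_existing = merge_by_key(["b2"], rows, get_oid)

print(created_fresh, created_with_existing)
```

In this toy run, three input rows yield two created nodes on an empty graph and one created node when a node for "b2" already exists.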

However, I do not know why only 2970 properties were set if 1000 nodes were created. I would have expected 3000 properties to be set, given your Cypher code.

Upvotes: 1
