Reputation: 235
I am creating a huge graph database with over 1.4 million nodes and 160 million relationships. My code looks as follows:
from py2neo import neo4j

# first we create all the nodes
batch = neo4j.WriteBatch(graph_db)
nodedata = []
for index, i in enumerate(words):  # words is predefined
    batch.create({"term": i})
    if index % 5000 == 0:  # so as not to exceed the batch restrictions
        results = batch.submit()
        for x in results:
            nodedata.append(x)
        batch = neo4j.WriteBatch(graph_db)
results = batch.submit()
for x in results:
    nodedata.append(x)
# nodedata contains all the node instances now

# time to create relationships
batch = neo4j.WriteBatch(graph_db)
for iindex, i in enumerate(weightdata):  # weightdata is predefined
    # the real code picks the nodedata indexes differently; iindex and -iindex are just an example
    batch.create((nodedata[iindex], "rel", nodedata[-iindex], {"weight": i}))
    if iindex % 5000 == 0:  # again batch constraints
        batch.submit()  # this is the line that shows the error
        batch = neo4j.WriteBatch(graph_db)
batch.submit()
I am getting the following error:
Traceback (most recent call last):
  File "test.py", line 53, in <module>
    batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 351, in send
    rs = self._send_request(request.method, request.uri, request.body, request.$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 326, in _send_re$
    data = json.dumps(data, separators=(",", ":"))
  File "/usr/lib64/python2.6/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
    for chunk in self._iterencode_list(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 317, in _iterencode
    for chunk in self._iterencode_default(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 323, in _iterencode_default
    newobj = self.default(o)
  File "/usr/lib64/python2.6/json/encoder.py", line 344, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: 3448 is not JSON serializable
Could anybody please explain what exactly is happening here and how I can overcome it? Any kind of help would be appreciated. Thanks in advance! :)
Upvotes: 0
Views: 606
Reputation: 4495
It's hard to tell without being able to run your code with the same data set, but this is likely to be caused by the type of the items in weightdata.
Step through your code or print the data type as you go to determine what the type of i is within the {"weight": i} portion of the relationship descriptor. You may find that this is not an int, which would be required for JSON number serialisation. If this theory is correct, you will need to find a way to cast or otherwise convert that property value into an int before using it in a property set.
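As a rough sketch of that check and conversion: the snippet below first reports which value types json.dumps rejects, then converts the values to built-in ints. Since the real contents of weightdata are unknown, fractions.Fraction is used here purely as a standard-library stand-in for whatever non-built-in numeric type the data actually holds (a NumPy integer would be one guess, but that is an assumption). The helper names unserializable_types and as_plain_int are hypothetical and not part of py2neo.

```python
import json
from fractions import Fraction


def unserializable_types(values):
    """Return the names of the value types that json.dumps rejects."""
    bad = set()
    for value in values:
        try:
            json.dumps(value)
        except TypeError:
            bad.add(type(value).__name__)
    return bad


def as_plain_int(value):
    """Convert a whole-valued number to a built-in int, refusing lossy casts."""
    plain = int(value)
    if plain != value:
        raise ValueError("%r is not a whole number" % (value,))
    return plain


# Stand-in for weightdata: Fraction plays the role of the unknown numeric type.
weightdata = [Fraction(3448), Fraction(12)]

# Step 1: find out which types are causing the TypeError.
print(unserializable_types(weightdata))

# Step 2: convert before building the {"weight": ...} property dict.
clean = [as_plain_int(w) for w in weightdata]
print(json.dumps({"weight": clean[0]}, separators=(",", ":")))
```

In the original loop this would amount to passing {"weight": as_plain_int(i)} (or simply int(i), if a silent truncation is acceptable) instead of {"weight": i}.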
Upvotes: 1
Reputation: 547
I've never used py2neo, but judging from the documentation, this:
batch.create((nodedata[iindex], "rel", nodedata[-iindex], {"weight": i}))
is missing the rel() part:
batch.create(rel(nodedata[iindex], "rel", nodedata[-iindex], {"weight": i}))
Upvotes: 1