Jed Christiansen

Reputation: 669

py2neo not enforcing uniqueness constraints in Neo4j database

I have a Neo4j database with nodes labelled "Program" and "Session". I've enforced uniqueness constraints on the "name" and "href" properties for both labels. This is the output of :schema:

Constraints
ON (program:Program) ASSERT program.href IS UNIQUE
ON (program:Program) ASSERT program.name IS UNIQUE
ON (session:Session) ASSERT session.name IS UNIQUE
ON (session:Session) ASSERT session.href IS UNIQUE

I want to periodically query another API (thus storing the name and API endpoint href as properties), and only add new nodes when they're not already in the database.

This is how I'm creating the nodes:

newprogram, = graph_db.create(node(name = programname, href = programhref))
newprogram.add_labels('Program')

newsession, = graph_db.create(node(name = sessionname, href = sessionhref))
newsession.add_labels('Session')

I'm running into the following error:

Traceback (most recent call last):
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/Users/jedc/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/Users/jedc/appfolder/applicationapis.py", line 42, in post
    newprogram.add_labels('Program')
  File "/Users/jedc/appfolder/py2neo/util.py", line 99, in f_
    return f(*args, **kwargs)
  File "/Users/jedc/appfolder/py2neo/core.py", line 1638, in add_labels
    if err.response.status_code == BAD_REQUEST and err.cause.exception == 'ConstraintViolationException':
AttributeError: 'ConstraintViolationException' object has no attribute 'exception'

My thought was that if I try to add the nodes and they're already in the database they just won't be added.

I tried wrapping the creation/add_labels lines in a try/except AttributeError block, but that ended up duplicating everything that was already in the database, even with the constraints shown above. How can py2neo manage to violate those constraints?

I'm really confused, and would appreciate any help in figuring out how to add nodes only when they don't already exist.

Upvotes: 1

Views: 503

Answers (2)

Nigel Small

Reputation: 4495

Firstly, the stack trace that you've shown highlights a bug that should be fixed in the latest version of py2neo (1.6.4 at the time of writing). The error detail dropped an expected "exception" key, which caused the AttributeError you see. This has now been fixed, so upgrading should give you a better error message.

However, this only addresses the error reporting bug. In terms of the constraint question itself, it is correct that the node creation and application of labels are necessarily carried out in two steps. This is due to a limitation in the REST API that does not allow a direct method for creating a node with label detail.

The next version of py2neo will make this easier/possible in a single step via batching. But for now, you probably want to look at a Cypher statement to carry out the creation and labelling as mentioned in the other answer here.

Upvotes: 1

stephenmuss

Reputation: 2445

The problem seems to be that you are creating the nodes without a label and only adding the label afterwards.

That is

graph_db.create(node(name = programname, href = programhref))

and

graph_db.create(node(name = sessionname, href = sessionhref))

These calls create nodes without any labels. Such nodes satisfy the constraints, because the constraints only apply to nodes that carry the Program or Session label.

Once you call newprogram.add_labels('Program') and newsession.add_labels('Session'), Neo4j attempts to add the labels to the existing nodes and raises an exception, since the constraint assertions cannot be met.

So py2neo may well be creating duplicate nodes, but if you inspect them I expect you'll find that one set of nodes has the labels and the other set does not.
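To make that sequence concrete, here is a toy in-memory model of a label-scoped uniqueness check. It is only a sketch of the behaviour described above, not Neo4j or py2neo code; the ToyGraph class, the ConstraintViolation exception and the sample values are all made up for illustration.

```python
class ConstraintViolation(Exception):
    """Raised when a label-scoped uniqueness constraint would be broken."""


class ToyGraph:
    """Hypothetical stand-in for a graph with label-scoped unique constraints."""

    def __init__(self, unique):
        # unique: set of (label, property) pairs, e.g. {("Program", "href")}
        self.unique = set(unique)
        self.nodes = []

    def create(self, **props):
        # The new node has no labels yet, so no constraint applies to it
        # and creation always succeeds.
        node = {"labels": set(), "props": dict(props)}
        self.nodes.append(node)
        return node

    def add_label(self, node, label):
        # The constraint is only checked here, after the node already exists.
        for other in self.nodes:
            if other is node or label not in other["labels"]:
                continue
            for constrained_label, prop in self.unique:
                if constrained_label != label or prop not in node["props"]:
                    continue
                if other["props"].get(prop) == node["props"][prop]:
                    raise ConstraintViolation("%s.%s is not unique" % (label, prop))
        node["labels"].add(label)


graph = ToyGraph({("Program", "name"), ("Program", "href")})

first = graph.create(name="Intro", href="/programs/1")
graph.add_label(first, "Program")

# Second run with the same data: the create step succeeds ...
second = graph.create(name="Intro", href="/programs/1")
try:
    # ... and only the labelling step fails.
    graph.add_label(second, "Program")
except ConstraintViolation:
    pass  # swallowing the error leaves the unlabeled duplicate behind

unlabeled = [n for n in graph.nodes if not n["labels"]]
```

After the second run the toy graph holds two nodes with identical properties, one labelled Program and one with no labels at all. That matches the duplication you saw once the exception was caught.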

Can you use py2neo in a way that it adds the label at the same time as creation?

Otherwise you could use a Cypher query

CREATE (program:Program{name: {programname}, href: {programhref}})
CREATE (session:Session{name: {sessionname}, href: {sessionhref}})

Using py2neo you should be able to run this as suggested in the docs:

from py2neo import neo4j

graph_db = neo4j.GraphDatabaseService()
# Create each node together with its label so the constraints are checked at creation time.
qs = '''CREATE (program:Program{name: {programname}, href: {programhref}})
        CREATE (session:Session{name: {sessionname}, href: {sessionhref}})'''
query = neo4j.CypherQuery(graph_db, qs)
query.execute(programname=programname, programhref=programhref,
              sessionname=sessionname, sessionhref=sessionhref)
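Note that CREATE will still raise an error when the node already exists. Since the goal is to add nodes only when they are missing, a MERGE statement may be a better fit. The following is a rough sketch, assuming Neo4j 2.0 or later (where MERGE and ON CREATE SET are available) and assuming href is the property that identifies a node. The helper names merge_statement and ensure_node, the fake_run stand-in and the sample values are hypothetical. The py2neo calls in the comment are the same ones used above.

```python
def merge_statement(label):
    """Build a Cypher MERGE keyed on href for the given label."""
    # Labels cannot be passed as query parameters, so the label is
    # interpolated; only call this with trusted, hard-coded label names.
    return ("MERGE (n:%s {href: {href}}) "
            "ON CREATE SET n.name = {name} "
            "RETURN n" % label)


def ensure_node(run, label, name, href):
    """Create the node only if no node with this label and href exists.

    `run` is any callable taking (statement, params).
    """
    return run(merge_statement(label), {"name": name, "href": href})


# With py2neo 1.6 the runner could look like this (not executed here):
#   from py2neo import neo4j
#   graph_db = neo4j.GraphDatabaseService()
#   def run(statement, params):
#       return neo4j.CypherQuery(graph_db, statement).execute(**params)

# Stand-in runner that only records calls, so the sketch runs without a server.
calls = []


def fake_run(statement, params):
    calls.append((statement, params))


ensure_node(fake_run, "Program", "Intro", "/programs/1")
ensure_node(fake_run, "Session", "Day 1", "/sessions/1")
```

MERGE matches on the label and href together, so an existing node is reused rather than duplicated. One caveat: because name is also constrained to be unique, ON CREATE SET can still fail if a different href arrives with a name that is already taken.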

Upvotes: 3
