Reputation: 3662
The Cypher query language was made popular by Neo4j and is being standardized as openCypher. The openCypher page mentions SQL/GQL, but the GQL page was last updated in 2019. Meanwhile, the SQL:2023 standard includes a section on property graph queries called SQL/PGQ. Unfortunately SQL:2023 is not public, so I can only guess about the details. Then there is PGQL, driven by Oracle. PGQL seems to be the same as SQL/PGQ, and its syntax looks like a subset of Cypher, but I would not be surprised to find expressions in each of these languages that cannot be used in the others.
I suppose very simple query expressions such as MATCH (a:b {c:42}) can safely be used across all of these languages, but what about more complex properties, quoted strings, list values, return types, limits, etc.? Is there a safe and formally defined subset of Cypher queries with the same semantics across these query languages?
Upvotes: 1
Views: 792
Reputation: 46
I am a developer at Oracle and have been leading the PGQL effort and am lately focusing on the SQL/PGQ implementation in Oracle Database.
The example you gave (MATCH (a:b {c:42})) is valid Cypher but is not valid SQL or PGQL.
SQL has standardized this as MATCH (a IS b WHERE a.c = 42).
Here, a is a vertex variable declaration, b is a label expression introduced by the keyword IS, and a.c = 42 is a search condition that references property c of vertex a.
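To make the syntactic difference concrete, here is a minimal Python sketch that renders the same single-vertex pattern in both notations. The NodePattern class and the two helper functions are hypothetical names invented for this illustration; they are not part of any library, and the sketch only handles one integer property so that quoting rules do not come into play.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePattern:
    """Hypothetical holder for one vertex pattern with one property test."""
    var: str
    label: str
    prop: str
    value: int  # integers only, to avoid dialect-specific string quoting


def to_cypher(p: NodePattern) -> str:
    # Cypher: label after a colon, property test as an inline map.
    return f"MATCH ({p.var}:{p.label} {{{p.prop}:{p.value}}})"


def to_sql_pgq(p: NodePattern) -> str:
    # SQL/PGQ: label introduced by IS, property test as an inline WHERE.
    return f"MATCH ({p.var} IS {p.label} WHERE {p.var}.{p.prop} = {p.value})"


pattern = NodePattern(var="a", label="b", prop="c", value=42)
print(to_cypher(pattern))
print(to_sql_pgq(pattern))
```

The two printed strings should correspond to the Cypher and SQL forms quoted above; anything beyond this one-property case (strings, lists, multiple labels) would need dialect-specific handling that this sketch does not attempt.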
Is there a safe and formally defined subset of Cypher queries with same semantics across these query languages?
There is not a single (full) query that is valid in all these languages, if only for the fact that SQL queries use SELECT (or COLUMNS) where Cypher uses RETURN and WITH.
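As a rough illustration of why no full query carries over, the following Python sketch wraps a pattern and a projection list in each language's outer query shape. The function names are hypothetical, the graph name my_graph is a placeholder, and the SQL shape follows the GRAPH_TABLE ... MATCH ... COLUMNS form; treat it as a sketch under those assumptions rather than a complete translator.

```python
from typing import List


def cypher_query(pattern: str, projections: List[str]) -> str:
    # Cypher projects with RETURN directly after the pattern.
    return f"MATCH {pattern} RETURN {', '.join(projections)}"


def sql_pgq_query(graph: str, pattern: str, projections: List[str]) -> str:
    # SQL/PGQ embeds the pattern in GRAPH_TABLE; COLUMNS projects values
    # out of the match, and the outer SELECT reads the resulting table.
    cols = ", ".join(projections)
    return (
        f"SELECT * FROM GRAPH_TABLE ({graph} "
        f"MATCH {pattern} COLUMNS ({cols}))"
    )


print(cypher_query("(a:b {c:42})", ["a.c AS c"]))
print(sql_pgq_query("my_graph", "(a IS b WHERE a.c = 42)", ["a.c AS c"]))
```

Only the text after MATCH resembles anything shared; the projection keywords and the enclosing query differ, which is the point made above.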
But when focusing only on graph pattern matching, there is significant overlap between all these languages, and the differences are mostly minor and syntactic, so migrating between them is simple. BTW, the SQL/PGQ specification is technically public, but accessing it does require a fee. The same will be the case for GQL once it is published here.
But different vendors will publish the documentation of their own implementations. Here are the best references for Oracle Database:
Outside of graph pattern matching there is not a lot of overlap between the languages.
In SQL, since graphs are view-like objects on top of tables, users perform INSERT/UPDATE/DELETE operations against the underlying tables of the graph. For example, you can query a graph and insert into its underlying tables like this: INSERT INTO ... FROM GRAPH_TABLE ( my_graph MATCH ... )
In SQL, properties are statically typed, but there is JSON (and XML) to handle semi-structured data, and it can easily be combined with property graphs. For example, JSON dot-notation access inside a property graph query looks like this: MATCH (v IS person) WHERE v.address.street_name = 'Monroe Avenue'.
SQL includes many important database functionalities that are not specific to graphs but are nevertheless useful, if not essential, to users of graphs: privileges, constraints, triggers, views, a rich set of data types with accompanying expressions and predicates, etc. From our project experience, much of the data preparation required to create a good graph model is very efficiently done in SQL. A fair amount of this functionality is lacking from languages that haven't been around as long as SQL.
For PGQL, many of the language constructs from SQL/PGQ have already been added to Oracle’s implementation, while the original specification still works to ensure backwards compatibility. The plan is to allow seamless migration between the two platforms.
Upvotes: 3
Reputation: 86
This is a very timely question.
I am the product manager for Cypher @ Neo4j. In my answer below, I tried to be impartial, but I declare my affiliation so that readers can make up their minds.
GQL is supposed to address the problem of having a common query language for property graphs. The work on the GQL standard is very close to an end, and the standard will be announced by ISO very soon (as far as I know, it should be out by the end of May 2024 at the latest). Having a standard published and having it widely supported are not the same thing. I expect it will take some time before several complete implementations are available. Meanwhile, I believe that the most common language many implementations support is openCypher. openCypher is also quite close to the upcoming GQL, so in my opinion it is your best bet.
GQL is a native property graph query language. SQL/PGQ, on the other hand, extends SQL to support some graph queries. It has several limitations. Most notably, it only supports read queries: you write tables and read graphs. I assume it will be of interest mostly to non-native graph databases, specifically solutions based on relational databases. Even when/if it is supported by a large number of (relational?) databases, it includes a quite reduced subset of graph query language features.
The good news is that ISO developed both SQL/PGQ and GQL in parallel. By design, the two standards share the parts that a native graph query language and SQL can share; in particular, the path pattern language (the part of the query that goes after the MATCH keyword) is the same. But expect GQL to be a much richer graph query language. This paper talks in more detail about the overlap between the two standards.
I am not an expert on PGQL, so I leave that part of the question to others.
Upvotes: 3
Reputation: 2759
You have the basic gist of it, yes. The idea with SQL/PGQ is to bring the Cypher MATCH graph "motifs" into the SQL query language. As of today, there's only one relational database engine that supports this in a GA release and that is Oracle (23.1). They don't yet support the entire SQL:2023 specification for SQL/PGQ, but you can get an idea of how these types of queries can be expressed via their docs: https://docs.oracle.com/en/database/oracle/property-graph/23.1/spgdg/sql-graph_table-queries.html
SQL/PGQ and PGQL are not the same. PGQL predates SQL/PGQ. Oracle actually separates these in their docs and refers to SQL/PGQ as the syntax for "SQL Property Graphs". There's a separate section in the docs link above that mentions transitioning from PGQL to SQL/PGQ.
Upvotes: 1