How to structure graph database to represent a relational database?

Question

I understand that there is no point in trying to engineer a relational database into a graph database. They are both inherently different and thus their data should be organized differently.

However, I'm building a database documentation app using Neo4j. In this app, a user would be able to upload tables from a SQL database. One feature is that a user would be able to preview the first 10 rows of any table they choose.

My problem is: I don't know how to structure a relational database in my Neo4j database, such that I could write Cypher query to print out a table in relational format.

So far I tried:

(Database)-[HAS_TABLE]->(Ex_Table)
                                 `–[HAS_FIELD]->(id) ––––––[HAS_CELL]->(1)
                                 `–[HAS_FIELD]->(name)–––––[HAS_CELL]->(bob)
                                 `–[HAS_FIELD]->(password)–[HAS_CELL]->(pass)

and

(Database)-[HAS_TABLE]->(Ex_Table)
                                 `–[HAS_FIELD]->(id) –––––––\
                                 `–[HAS_FIELD]->(name)–––(HAS_ROW)->({id:1,name:bob,password: pass})
                                 `–[HAS_FIELD]->(password)––‘

Question: Are these correct? If so, how could I query Ex_Table using Cypher to print the first 10 rows? If not, how should I structure and query to get that result?

Note: I have to use a graph database.

Marj · Accepted Answer

I think you need more meta data round your graph to really make this work. I would adopt this approach to storing your data:

create pth= (d:Database {name: 'MyDB'})-[:HAS_TABLE]->(t:Table {name: 'Users'}),
(t)-[:HAS_COLUMN {order: 1}]->(c1:Column {name: 'Id'}),
(c1)-[:HAS_VALUE {row: 1}]->(v1:Value {value: '1'}),
(c1)-[:HAS_VALUE {row: 2}]->(v2:Value {value: '2'}),
(t)-[:HAS_COLUMN {order: 2}]->(c2:Column {name: 'Name'}),
(c2)-[:HAS_VALUE {row: 1}]->(v3:Value {value: 'Bob'}),
(c2)-[:HAS_VALUE {row: 2}]->(v4:Value {value: 'Jim'}),
(t)-[:HAS_COLUMN {order: 3}]->(c3:Column {name: 'Password'}),
(c3)-[:HAS_VALUE {row: 1}]->(v5:Value {value: 'password1'}),
(c3)-[:HAS_VALUE {row: 2}]->(v6:Value {value: 'SafePassword'})

This will give you a graph that looks like this:

Now you can retrive your table structure with a query like this:

match (t:Table {name: 'Users'})-[r:HAS_COLUMN]->(c:Column)
return t.name, c.name
order by r.order

To retrieve the sample rows for a table use a query like:

// Get the column names as a list
match (t:Table {name: 'Users'})-[r1:HAS_COLUMN]->(c:Column)
with 0 as rownum, r1.order as colnum, c.name as val order by colnum
with rownum, collect(val) as vals
return rownum, vals
union // will append the results of the 2nd match to the first
// Get the sample data values
match (t:Table {name: 'Users'})-[r1:HAS_COLUMN]->(c:Column)-[r2:HAS_VALUE]->(v:Value)
with r1.order as colnum, r2.row as rownum, v.value as val order by colnum
with rownum, collect(val) as vals order by rownum
return rownum, vals

which will collect the value nodes into the correct order by row.

How to structure graph database to represent a relational database?

Answers (1)

Related Questions