Reputation: 3989
I wasn't sure if this question was more appropriate here or at dba.stackexchange.com, so let me know and I can move it if needed.
I'm starting a brand new project from scratch and am researching the appropriateness of a graph database (in this case likely neo4j). However, the system is anticipated to have data appropriate for a graph database (like an HR hierarchy), as well as data that would be better in an RDBMS (like orders, invoices, products, etc).
My question is concerning the best way to join across these 2 databases. Here's a simplified example where we have an HR hierarchy of a sales company in the graph database, and we have products and orders in an RDBMS:
Hierarchy (graph database)
Orders/Products (rdbms)
Let's say we wanted to find all orders commissionable to a sales employee at some level UNDER "state1".
You could use 2 queries -- 1 to pull all sales employees from the graph, and a 2nd to pull all orders for those employees from the RDBMS:
(pseudo-code)
MATCH (closers:employee)-[member_of]->(:group)<-[parent_of]-(parent:group)
WHERE parent.name = 'state 1'
RETURN closers;
then pipe those results into another query:
(pseudo-code)
SELECT * FROM orders WHERE salesEmployeeId IN ( <resultId1>, <resultId2>, ... <resultIdN> );
This works, but it strikes me as particularly inefficient at scale, and somewhat inelegant. What I would LIKE to do is be able to join DIRECTLY to the rdbms. Is there a way to achieve something like this?
(pseudo code)
MATCH (closers:employee)-[member_of]->(:group)<-[parent_of]-(parent:group)
WHERE parent.name = 'state 1'
JOIN rdbms.orders ON orders.salesEmployeeId = closers.id
RETURN orders;
EDIT: it's been pointed out that the orders section of this question could easily be implemented in a graph database so the RDBMS isn't needed at all. That's definitely an option I've considered, but at this point, I'm still trying to figure out how well a graph database fits on the whole. I'm more interested in the larger, more general question of, "If things are in both systems, then is there a good/fast/elegant way to do this with a single query?"
Upvotes: 4
Views: 288
Reputation: 67044
Based on the info in your question, it seems to me that your order data is perfectly suitable for representation in a Graph DB.
Suppose you extended your Graph DB model by adding the following (relationships types are left out for clarity):
(e:Employee {id: 123, name: "Foo"})-->(o:Order {id: 234, frozenPrice: 10.99})
(o)-->(c:Customer {id: 345})
(o)-->(i:Item {desc:"Bar", currentPrice:12.99})
Then, you can perform your desired query very simply like this:
MATCH (o:Order)<--(closers:Employee)-[member_of]->(:Group)<-[parent_of]-(parent:Group)
WHERE parent.name = 'state 1'
RETURN orders;
This should be faster, and is certainly much less complex (and bug-prone) than attempting to do a single query using 2 different DBs.
Upvotes: 3