amnesia
amnesia

Reputation: 1988

Advice on schema design - MongoDB or potentially Neo4J

I am building a simple accounting program for personal use (MEAN stack). I reached for Mongo because I'm familiar with it, but as I'm fleshing out the schemas I'm having doubts. I want to present my thoughts and ask for suggestions on possibly another way of doing it.

Since the same transaction has to show on the ledger of an arbitrary number of accounts, my current thought was to not store the transactions as subdocuments of the accounts, but as their own collection (to avoid duplicating them). Leaving out the unimportant bits, it would look something like this:

var accountSchema = new mongoose.Schema({
    name: String,
    // etc
});

var entrySchema = new mongoose.Schema({
    amount: Number,
    account: {type: mongoose.Schema.Types.ObjectId, ref: 'Account'}
});

var transactionSchema = new mongoose.Schema({
    date: Date,
    debits: [entrySchema],
    credits: [entrySchema]
});

While this seems to me a logical way to store the data, there's some obvious query concerns. When I want to view the ledger for an account, for example, I would have to iterate all transactions for the period, and for each one iterate both credit and debit collections to check if that account was involved.

I don't have any experience with graph dbs, but I was thinking that something like Neo4J might be better suited to query this type of data. My question is, do you agree or is Mongo still a good choice but I'm thinking about the schemas wrong?

Upvotes: 1

Views: 197

Answers (1)

Bruno Peres
Bruno Peres

Reputation: 16355

Yes, you are right. Neo4j is a good choice when dealing with connected data.

Your current strategy is basically embedding references to identifiers from other schemas (a kind of foreign key). At query time it will require full scans an join-like operations, becoming very expensive as the size of your database increases.

Also, you will need care about updates and deletions of these referenced data. Otherwise, you will have inconsistent data.

With a graph database like Neo4j your data model will be a graph, that is more "natural" and intuitive for your scenario, I think. I'm not completely understood your requirements and scenario but I believe that the graph data model can be something like:

Sample graph model

This way you can use the power of Cypher Language to query your graph. For example, to get all credits from a given account since "2017-10-10" you can do:

MATCH(a:Account)<-[:ENTRY_TO]-(e:Entry)<-[:CREDIT]-(t:Transaction)
WHERE a.id = 10 AND t.date > "2017-10-10"
RETURN e

You can take advantage of the index-free adjacency to transverse and query the graph with a cheaper cost because no join operations are needed.

Since you are not familiar with Neo4j and graph databases, I suggest you take a look in the free online training getting started with Neo4j. Also, you can download the e-books: Graph Databases (By Ian Robinson, Jim Webber, and Emil Eifrém) and Learning Neo4j (By Rik Van Bruggen) for free.

Upvotes: 2

Related Questions