Reputation: 492
I'm busy creating a simple DBTester program with a Data Access Layer that can Test and Compare multiple (kind of) Databases. Currently I have implemented the Add(Insert) for SQL Server and MarkLogic NoSQL.
To my supprise inserting/adding 1 M Person Entities takes much more time using MarkLogic XCC/.Net XQuery than using SQL Server 2008 R2. SQL Server takes a few minutes of which 11 seconds in 11654 ms in Data Access Layer. MarkLogic 8 is still busy at 15621 entities in 15+ minutes!
I'm new to NoSQL MarkLogic and XCC/XQuery and maybe doing something wrong. My test code for MarkLogic can be found at GitHub: https://github.com/driekus77/DBTester/blob/master/DBTester/DataAccessLayer/Repository/MarkLogic/PersonRepository.cs#L48
The corresponding SQLServer Add code can be found at: https://github.com/driekus77/DBTester/blob/master/DBTester/DataAccessLayer/Repository/SQLServer/PersonRepository.cs#L64
So what am I doing wrong? Should I use MarkLogic RestAPI direct? Should I use JSON in stead off XML? Are there ways to speed up my XQuery Add call?
Thanks for any help!
Upvotes: 0
Views: 130
Reputation: 3732
Relational and Document and "NoSQL" databases as so fundamentally different that comparisons of this sort are misleading at best. They are different because they focus on different problems and use cases and optimize for those in different ways. A 'classic' example is comparing GC based, vs Reference counting vs explicit memory management languages. E.g. a GC based application even using less efficient algorithms can outperform a lower level manual memory managed language -- simply because the GC can be deferred past the interesting part of the application -- sometimes forever (the app exists before it needs to GC). One could debate both sides of that depending on whats important to you.
I suggest a more useful performance comparison is total application responsiveness, or throughput, or some measure of the 'whole picture' that's important to you and after you optimize the application for the particular use cases and technology. As noted, ML does significantly more work 'up front' then Relational or 'traditional' NoSQL database. If your app is a "WOM" (Write Only Memory) use case then writing to /dev/null will be even faster. When it comes time to do complex queries, document creation, large datasets etc that 'up front' is already done and neither your code nor the server has to work as hard. Similarly with data modeling -- if you start with a data model optimized for a RDBMS, it may not be ideal for a non-RDBMS engine - and visa versa.
I suggest starting with a smaller set of data at first and work through a POC of a common use case for the app as a whole. The data model is fundamental to success (or failure) with any database and application. From your application model perspective, what does the 'business objects' 'look like' ? For a NoSQL type DB try to model that as directly as possible. That will lead you in the right direction for performance as well as development/coding. At that point performance measurements and optimization strategies are much more useful and comparable.
Upvotes: 3
Reputation: 7770
See my original comment. IN addition, I also note that you are inserting items into a single 'persons' CML document. That is not what MarkLogic prefers. Each person is meant to be a separate record. Otherwise, - because it is a transactional database, each of your insert-child calls is blocking as it is the same document.
Upvotes: 3