Reputation: 101
My project is log analysis: developing a new log-analysis tool (existing tools include Apache log viewers) in C++ with Qt. There are different log files available, and each file has a different format. My project first extracts the different fields from a log file and then analyses them. I am choosing the NoSQL database MongoDB for this application, but I am not sure whether it is suitable. I have no experience with MongoDB. Are there any problems using MongoDB from Qt?
Upvotes: 1
Views: 1592
Reputation: 4203
I guess it depends on how you are going to use these logs, and how many logs you are going to store. You mentioned two purposes for your application: storing the extracted log records, and analysing them.
For the first one (storage), it's totally OK. Compared to a traditional RDBMS, MongoDB's schema-free documents are a good fit for log records whose fields differ from file to file, and they scale out horizontally when the volume grows.
Analysis, however, is not MongoDB's strong point. Read the discussion here. MongoDB can distribute data across a group of servers and analyse it there, which makes it possible to process amounts of data that a single RDBMS cannot handle. But that doesn't mean it will be faster: MongoDB's Map/Reduce currently has its own limitations. You can of course add more shard servers to speed things up, though that can be costly.
Another problem is that MongoDB currently doesn't support full-text search (it's a new feature in the upcoming version 2.6, but not available now). So if you want to search by keywords, it's going to be slow.
The two issues above assume that you are going to use the distribution features provided by MongoDB. If you are not, you can use C++ to enumerate the log and analyse the records one by one. In that case, MongoDB provides two very useful features, capped collections and TTL indexes, which may save you some work removing expired data. Read the documentation for more information.
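A capped collection and a TTL index each take one line to set up from the mongo shell. A sketch (the collection names, size, and expiry time below are made-up placeholders; note that a TTL index cannot be used on a capped collection, since capped collections do not allow deletes):

```javascript
// Capped collection: fixed size on disk; once the 100 MB limit is
// reached, the oldest documents are overwritten automatically.
db.createCollection("access_log", { capped: true, size: 100 * 1024 * 1024 })

// TTL index on a normal collection: a background task removes
// documents whose "createdAt" date is older than 7 days.
db.log_archive.ensureIndex(
    { createdAt: 1 },
    { expireAfterSeconds: 60 * 60 * 24 * 7 }
)
```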
In conclusion: well, there's actually no conclusion. What you choose depends on what you are going to do and how you are going to do it. Would you mind providing more information so that we can go further?
Upvotes: 1
Reputation: 5118
In your case, one advantage of MongoDB and other document stores (over, say, a simple key-value store) is that they allow you to have structured data in each log document, providing a soft schema of sorts, i.e. a schema that you can efficiently extend once you already have data in your store, should new fields be introduced by new input log formats. Document stores also allow you to efficiently query data based on individual fields, as you would with an RDBMS.
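To make the soft-schema point concrete, here is an illustrative mongo-shell sketch (the collection and field names are invented for the example): documents with different shapes coexist in one collection, and you can still query and index on individual fields.

```javascript
// Records from two different log formats in the same collection;
// the second one carries an extra "agent" field.
db.logs.insert({ ip: "10.0.0.1", status: 404, path: "/missing" })
db.logs.insert({ ip: "10.0.0.2", status: 200, path: "/", agent: "curl/7.30" })

// Query on an individual field, much like a WHERE clause:
db.logs.find({ status: 404 })

// A secondary index keeps such per-field queries efficient:
db.logs.ensureIndex({ status: 1 })
```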
However, your data is append-only (since it's log data, newer data does not invalidate old data), which has performance implications: in theory, writing new data should not block reading existing data. MongoDB's concurrency mechanisms do not support this behavior, since locking is done per database: http://docs.mongodb.org/manual/faq/concurrency/ Thus, in theory, another DB system with more granular locking may handle simultaneous reads and writes more efficiently.
The complete performance picture depends on a lot more, including your data set and queries, so this may be irrelevant in practice. Basically, you would need to test with your own data and workload.
A question (unfortunately unanswered) approaching this topic can be found here: Which NoSQL database best for append only audit logging use case?
Upvotes: 1