Sonia

Reputation: 1054

NoSQL DB and Reporting

I am in the architecture stage of an academic project involving billions of records. The project should be very lightweight in terms of computing power and highly scalable. The information structure is very simple: I need to store a list of items, each with different features. The features are integers, decimals, dates, strings, etc. When the data is imported, the type of each feature is known. Features can also be used to reference other items.
I need to be able to get and sort a list of items by their features (more than one), possibly using operators such as >, <, =, regexes, and string functions (length, left, right, mid), comparing feature values with each other and against arbitrary user input.

Reporting in the sense of sums, averages, and grouping is also necessary, but the demands for that are more relaxed: there is no need for full cube capabilities, but more is better.

I am very new to the whole NoSQL world. What would you recommend?

Upvotes: 3

Views: 4518

Answers (2)

Ravindra

Reputation: 353

If you are going to use aggregates, then you could use map reduce to populate aggregate tables and then serve the reports from those tables.
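As a rough illustration of the idea (not tied to any particular NoSQL product), here is a minimal in-memory sketch in plain Python. The item records, the `category` and `price` feature names, and the helper functions `map_item`, `reduce_values`, and `build_aggregate_table` are all hypothetical; a real system would run the map and reduce steps in a distributed framework and write the result to a stored table.

```python
from collections import defaultdict

# Hypothetical item records; "category" and "price" are made-up feature names.
items = [
    {"id": 1, "category": "book", "price": 12.0},
    {"id": 2, "category": "book", "price": 8.0},
    {"id": 3, "category": "toy", "price": 5.0},
]

def map_item(item):
    """Map step: emit (key, (partial_sum, partial_count)) for one item."""
    yield item["category"], (item["price"], 1)

def reduce_values(values):
    """Reduce step: combine the partial (sum, count) pairs for one key."""
    total = 0.0
    count = 0
    for partial_sum, partial_count in values:
        total += partial_sum
        count += partial_count
    return total, count

def build_aggregate_table(records):
    """Group mapped values by key, reduce each group, and return the table."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_item(record):
            grouped[key].append(value)
    table = {}
    for key, values in grouped.items():
        total, count = reduce_values(values)
        table[key] = {"sum": total, "count": count, "avg": total / count}
    return table

# The precomputed table that reports would be served from.
aggregate_table = build_aggregate_table(items)
```

Reports then read `aggregate_table` directly instead of scanning the raw items, which is the point of precomputing the aggregates.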

Writing a map reduce job for every query may be cumbersome, so you can also have a look at Apache Pig and Hive. These are especially helpful for the kind of ad hoc queries you are talking about.

Upvotes: 0

Ken Downs

Reputation: 4827

If you check out the tutorials for MongoDB, they have, in my opinion, the best introduction to the Map/Reduce system that is used to query and aggregate data.

I do wonder, though, why you have concluded in advance that NoSQL is the route to go. Although different items may have different schemas, is there a fixed number of entities and attributes? And why have you (if you have) ruled out SQL, which, after all, has decades of accumulated features for storing and querying data?
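To make that concrete, here is a small sketch of how the requirements in the question map onto plain SQL, using Python's built-in `sqlite3` module with an in-memory database. The `items` table, its columns, and the sample rows are assumptions made up for illustration, and SQLite is used only because it needs no setup; this says nothing about how any particular engine would behave at billions of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: typed features plus a reference to another item.
conn.execute(
    """
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        name      TEXT,
        price     REAL,
        added     TEXT,
        parent_id INTEGER REFERENCES items(id)
    )
    """
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, "alpha", 12.0, "2010-01-05", None),
        (2, "alphabet", 8.0, "2010-02-10", 1),
        (3, "beta", 5.0, "2010-03-15", 1),
    ],
)

# Filter on several features against user input (a comparison plus a
# "left"-style string match via substr), then sort by more than one feature.
user_prefix = "alp"
filtered = conn.execute(
    """
    SELECT id, name FROM items
    WHERE price > ? AND substr(name, 1, ?) = ?
    ORDER BY price DESC, name
    """,
    (6.0, len(user_prefix), user_prefix),
).fetchall()

# Reporting: count, sum, and average grouped by the referenced item.
report = conn.execute(
    """
    SELECT parent_id, COUNT(*), SUM(price), AVG(price)
    FROM items
    WHERE parent_id IS NOT NULL
    GROUP BY parent_id
    """
).fetchall()

conn.close()
```

Both the multi-feature filtering and the grouping come from the query language itself, without writing a separate job per report.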

Upvotes: 4
