Tina

Reputation: 312

Analytics implementation in Hadoop

Currently, we have MySQL-based analytics in place. We read our logs every 15 minutes, process them, and add the results to a MySQL database.

As our data is growing (in one case, 9 million rows so far, with 0.5 million rows added each month), we are planning to move our analytics to a NoSQL database.

Based on my research, Hadoop seems to be a better fit, since we need to process the logs and it can handle very large data sets.

However, it would be great if I could get some suggestions from experts.

Upvotes: 0

Views: 70

Answers (2)

Hussnain

Reputation: 186

I agree with the other answers and comments. But if you want to evaluate the Hadoop option, one possible solution is the following.

  • Apache Flume with Avro for log collection and aggregation. Flume can ingest data into the Hadoop Distributed File System (HDFS).
  • Then you can have HBase as a distributed, scalable data store.
  • With Cloudera Impala on top of HBase, you can have a near-real-time (streaming) query engine. Impala uses SQL as its query language, so it will be beneficial for you.

This is just one option. There are multiple alternatives, e.g. Flume + HDFS + Hive.
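To make the ingestion step concrete, here is a minimal sketch of what a Flume agent configuration for this kind of pipeline could look like. The agent name, bind address, port, and HDFS path are illustrative assumptions, not values from the question:

    # Hypothetical agent "agent1": Avro source -> memory channel -> HDFS sink
    agent1.sources = avroSrc
    agent1.channels = memCh
    agent1.sinks = hdfsSink

    # Avro source: log shippers on the application hosts send events here
    agent1.sources.avroSrc.type = avro
    agent1.sources.avroSrc.bind = 0.0.0.0
    agent1.sources.avroSrc.port = 41414
    agent1.sources.avroSrc.channels = memCh

    # In-memory channel buffering events between source and sink
    agent1.channels.memCh.type = memory
    agent1.channels.memCh.capacity = 10000

    # HDFS sink: writes events into date-partitioned directories
    agent1.sinks.hdfsSink.type = hdfs
    agent1.sinks.hdfsSink.channel = memCh
    agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/logs/%Y-%m-%d
    agent1.sinks.hdfsSink.hdfs.fileType = DataStream
    agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
    agent1.sinks.hdfsSink.hdfs.rollInterval = 900

Once the data lands in HDFS (or is written on into HBase), it can be exposed to Impala and queried with plain SQL, which keeps the query side familiar if you are coming from MySQL.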

Upvotes: 1

Arnon Rotem-Gal-Oz

Reputation: 25919

This is probably not a good question for this forum, but I would say that 9 million rows plus 0.5 million per month hardly seems like a good reason to move to NoSQL. This is a very small database, and your best course of action would be to scale up the server a little (more RAM, more disks, a move to SSDs, etc.).

Upvotes: 0
