London guy

Reputation: 28032

Which version of Hadoop API to use

Several versions of the Hadoop API are available as part of the Cloudera and Yahoo distributions. Furthermore, Cloudera alone has versions cdh3u1 through cdh3u4.

I noticed that the API methods also change between versions, both in how they are named and in the parameters they accept.

Which version of Hadoop API, and from where, can I use that is latest and stable?

Upvotes: 1

Views: 437

Answers (1)

Praveen Sripati

Reputation: 33555

Which version of Hadoop API, and from where, can I use that is latest and stable?

The first thing to note is that "latest" and "stable" don't go together. It takes some time for the latest API to become rock solid, with the bugs found and fixed.

If you are interested in packaged software, go to Cloudera, download a stable or an alpha version, and try it out. From Hortonworks you can download HDP 1.0, which is the only version available. Cloudera has been releasing CDH on a regular basis for close to 4 years, so it is more mature than HDP from Hortonworks. CDH includes the next generation MapReduce, while HDP has the legacy MapReduce architecture.

The packages mentioned above (CDH and HDP) bundle a set of frameworks that are well integrated and tested. So it's a matter of learning how to use the frameworks; there is no need to worry about interoperability issues across them.

If you really want to learn about Hadoop, I would suggest downloading the software from Apache Hadoop and going through the installation and configuration yourself. The same applies to Pig, Hive and other software. You might run into some compatibility issues, which have to be resolved as you go.

In the Apache Hadoop space, there is the 1.x track, which has the stable legacy MR architecture, and the 2.x track, which has the next generation MapReduce architecture.
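As a rough illustration of the two tracks, here is a minimal Python sketch that maps a version string to the track it belongs to. The function name `hadoop_track` and the sample version strings are made up for this example; the mapping only encodes the 1.x/2.x split described above and reports anything else as "other" rather than guessing.

```python
import re


def hadoop_track(version: str) -> str:
    """Classify a Hadoop version string by release track (illustrative only)."""
    # Read the leading "major.minor" numbers; suffixes such as "-alpha" are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major = int(match.group(1))
    if major == 1:
        return "1.x: legacy MapReduce"
    if major == 2:
        return "2.x: next generation MapReduce"
    # Anything outside the two tracks named above is not classified here.
    return "other"


# Hypothetical version strings, used only to exercise the function.
for v in ("1.0.3", "2.0.0-alpha", "0.20.2-cdh3u4"):
    print(v, "->", hadoop_track(v))
```

On an actual installation, running `hadoop version` prints the version string you could feed into a check like this.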

Upvotes: 1
