azdatasci

Reputation: 841

Apache Bigtop Installation on RHEL 7

I'm seeking some help: I have been tasked with standing up a Hadoop cluster at work. I have done single-node setups on laptops at home with the open source stack. I am trying to stick with the open source Apache stack to avoid any licensing costs; right now we have no interest in Cloudera or Hortonworks.

I came across the Apache Bigtop stack (1.2.0) and poked around in it. I am still trying to wrap my head around what it provides (I have not found a reference to the Hadoop/Spark versions, etc.). Could I get some help on the following:

  1. What versions of Hadoop/Spark/other tools does the 1.2.0 version provide?

  2. Is there a good reference on installing a full Hadoop/Spark cluster from scratch under RHEL 7? I have 12 servers, and I plan on doing 2 namenodes and 10 datanodes. Is Bigtop appropriate for this, or should I just install each package and configure it manually?

  3. I found the following:

https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+1.2.0

This looks promising, but it's for CentOS 7, which I know is similar but not exactly the same. Can someone suggest how I can modify this to work under RHEL 7? I found repos, but none for RHEL.

  4. The documentation seems pretty slim on the official Apache page, or maybe I'm just not looking in the right spot. Are there good links to references out there for a full cluster install?

Thanks to all who can help, I really appreciate it!

Upvotes: 0

Views: 1799

Answers (1)

Evans Ye

Reputation: 126

What versions of Hadoop/Spark/other tools does the 1.2.0 version provide?

Check out our doc for the 1.2.0 release:

https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+1.2.0+Release

You'll get Hadoop 2.7.3 and Spark 2.1.0 out of the box. We've provided installable artifacts on S3 for you to test out the functionality:

https://www.apache.org/dist/bigtop/bigtop-1.2.0/repos/centos7/bigtop.repo

NOTE: we have an S3 migration taking effect on 10/15/2017, and we'll make the corresponding changes afterwards. If you'd like to try it out right away, please change the baseurl to:

http://repos.bigtop.apache.org/releases/1.2.0/centos/7/x86_64
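If you would rather script the baseurl change than edit the file by hand, something like the following could work. This is only a minimal sketch: the function name `rewrite_baseurl` and the sample repo contents (section name `bigtop`, the `name` and `enabled` fields, the placeholder old URL) are assumptions for illustration, not the actual contents of the published bigtop.repo. The only value taken from this answer is the new baseurl.

```python
import configparser
import io

# New baseurl quoted in the answer above.
NEW_BASEURL = "http://repos.bigtop.apache.org/releases/1.2.0/centos/7/x86_64"


def rewrite_baseurl(repo_text: str, new_baseurl: str = NEW_BASEURL) -> str:
    """Return the repo file text with every section's baseurl replaced."""
    # Disable interpolation so '%' characters in URLs are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(repo_text)
    for section in parser.sections():
        if parser.has_option(section, "baseurl"):
            parser.set(section, "baseurl", new_baseurl)
    out = io.StringIO()
    # yum-style repo files use key=value without spaces around '='.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample; the real bigtop.repo may contain other fields.
    sample = (
        "[bigtop]\n"
        "name=Bigtop\n"
        "enabled=1\n"
        "baseurl=http://example.invalid/old\n"
    )
    print(rewrite_baseurl(sample))
```

You would then write the result to a file under /etc/yum.repos.d/ (the standard yum repo directory) as root. Note that comments in the original repo file are not preserved by this approach.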

Is there a good reference on installing a full Hadoop/Spark cluster from scratch under RHEL 7? I have 12 servers, and I plan on doing 2 namenodes and 10 datanodes. Is Bigtop appropriate for this, or should I just install each package and configure it manually?

RHEL and CentOS should be very similar. I suggest:

  • Try our CentOS packages directly on RHEL and see if that works. I've used Bigtop CentOS 6 packages on RHEL 6 in production and it worked like a charm.
  • If the above doesn't work, Bigtop is a fully open source solution for building your own Hadoop distribution. You can build the entire stack from scratch against your desired distro. We have well-crafted tools and a Dockerized framework to support that. If you want to do so, raise your need on the [email protected] mailing list. We'd be happy to help.

I found the following: https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+1.2.0+Release

Yes, you're looking at the right doc. And this is exactly what I mentioned above: though it's for CentOS 7, you can try the repo on RHEL 7.

Upvotes: 1
