Reputation: 11619
The question is self-contained. I deployed a cluster and now I want to upgrade my Hadoop version. I checked bdutil and gsutil, but I couldn't find how to make it work.
Upvotes: 0
Views: 75
Reputation: 10697
Unfortunately, since the various paths, library dependencies, and daemon processes are fairly different between Hadoop 1 and Hadoop 2, there's no easy way to upgrade in-place right now. In particular, any customizations made to the cluster are likely to break even if the library upgrades and daemon changes are coordinated, so in general, it's much easier and safer to simply delete and recreate the cluster.
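As a sketch rather than an exact recipe (and assuming your bdutil checkout includes the stock hadoop2_env.sh extension and that your project/bucket settings already live in bdutil_env.sh), the delete-and-recreate cycle looks roughly like this:
# Tear down the existing Hadoop 1 cluster VMs.
./bdutil delete
# Re-deploy the cluster from scratch with the Hadoop 2 configuration.
./bdutil -e hadoop2_env.sh deploy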
To help prevent getting overly stuck on a single cluster instance, and to benefit from the agility of being able to re-deploy reproducible clusters from scratch, the best-practice recommendation is to isolate any customizations you might have into custom "_env.sh" files. The spark_env.sh extension is a good example of how to mix in extra customizations on top of an existing bdutil installation. For another example, if you simply want to install openjdk-7-jdk on all machines at the end of a bdutil installation, you'd create the files install_jdk.sh and add_jdk_env.sh:
# file: install_jdk.sh
# Runs on each VM at the end of deployment; -y keeps apt-get non-interactive.
sudo apt-get install -y openjdk-7-jdk
And for add_jdk_env.sh:
# file: add_jdk_env.sh
# Create a command group which references the new install_jdk.sh file.
COMMAND_GROUPS+=(
  "install_jdk:
     install_jdk.sh
  "
)

# Run that command group on master and workers.
COMMAND_STEPS+=(
  'install_jdk,install_jdk'
)
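As an aside, each COMMAND_STEPS entry names the command group to run on the master and the one to run on the workers (in that order, if I remember the format correctly). If, hypothetically, you only wanted the JDK on the workers, and assuming your bdutil version accepts '*' as a no-op placeholder for a role, the step would instead look like:
# Hypothetical variant: run install_jdk on workers only, skip the master.
COMMAND_STEPS+=(
  '*,install_jdk'
)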
And finally, you simply mix it into your bdutil deployment:
./bdutil -e add_jdk_env.sh deploy
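If you're switching Hadoop versions at the same time, multiple env files can, to my knowledge, be combined in a single deployment by listing them comma-separated (check how your bdutil version handles --env_var_files):
# Hypothetical combined deployment: Hadoop 2 plus the custom JDK extension.
./bdutil -e hadoop2_env.sh,add_jdk_env.sh deploy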
Upvotes: 2