code1234

Reputation: 121

Restart hive service on AWS EMR

I am very new to Hive as well as AWS EMR. As per my requirement, I need to create the Hive metastore outside the cluster (from AWS EMR to AWS RDS). I followed the instructions given in

http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-dev-create-metastore-outside.html

I made changes in hive-site.xml and was able to set up the Hive metastore on an Amazon RDS MySQL server. To bring the changes into effect, I am currently rebooting the complete cluster so that Hive starts storing its metastore in AWS RDS. This way it is working.

But I want to avoid rebooting the cluster. Is there any way I can restart just the service?

Upvotes: 6

Views: 22693

Answers (5)

user 923227

Reputation: 2715

For me this approach worked:

  1. Get the pid
  2. Kill the process
  3. Process restarts by itself

Commands for 1 & 2:

ps aux | grep MetaStore
sudo -u hive kill <pid from above>

If you are not familiar with ps, you can use the following command, which shows the header row (including PID) and only the line for the Hive MetaStore process:

ps aux | egrep "MetaStore|PID" | grep -v grep

The Hive MetaStore process restarted by itself. Validate by running ps again; the PID should have changed.

ps aux | grep MetaStore
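The three steps can be rolled into one helper. This is only a rough sketch: metastore_pid and bounce_metastore are names made up here, the 30-second limit is arbitrary, and it assumes the process really does get respawned automatically, as it did for me.

```shell
# Print the PID column of the first `ps aux` line that mentions MetaStore.
# Reads ps output on stdin. The [M] bracket keeps awk from matching its own
# command line if that happens to show up in the listing.
metastore_pid() {
    awk '/[M]etaStore/ { print $2; exit }'
}

# Kill the current MetaStore process and wait for a different PID to appear.
# Assumption: something on the node respawns the process after the kill.
bounce_metastore() {
    old_pid="$(ps aux | metastore_pid)"
    if [ -z "$old_pid" ]; then
        echo "no MetaStore process found" >&2
        return 1
    fi
    sudo -u hive kill "$old_pid"
    tries=0
    while [ "$tries" -lt 30 ]; do
        sleep 1
        new_pid="$(ps aux | metastore_pid)"
        if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ]; then
            echo "MetaStore restarted: $old_pid -> $new_pid"
            return 0
        fi
        tries=$((tries + 1))
    done
    echo "MetaStore did not come back within 30 seconds" >&2
    return 1
}
```

Run bounce_metastore on the master node; it prints the old and new PID if the process came back.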

Upvotes: 1

Ahmed Kamal

Reputation: 1488

Just for those who are gonna come from Google

To restart any EMR service

In order to restart a service in EMR, perform the following actions:

  1. Find the name of the service by running the following command:

    initctl list

For example, the YARN Resource Manager service is named “hadoop-yarn-resourcemanager”.

  2. Stop the service by running the following command:

    sudo stop hadoop-yarn-resourcemanager

  3. Wait a few seconds, then start the service by running the following command:

    sudo start hadoop-yarn-resourcemanager

Note: Stop/start is required; do not use the restart command.

  4. Verify that the process is running by running the following command:

    sudo status hadoop-yarn-resourcemanager

Check for the process using ps, and then check the log file for any errors in the log directory /var/log/.
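If you do this often, the stop / wait / start / status sequence can be wrapped in a small function. A minimal sketch, assuming a release where the stop, start and status commands shown above exist; restart_emr_service, RUNNER and PAUSE_SECONDS are names invented for this example, not anything EMR provides.

```shell
# Stop, pause, start, then report status for one service.
# RUNNER defaults to sudo; set RUNNER=echo for a dry run that only prints
# the commands. PAUSE_SECONDS is the wait between stop and start.
restart_emr_service() {
    svc="$1"
    runner="${RUNNER:-sudo}"
    pause="${PAUSE_SECONDS:-5}"
    if [ -z "$svc" ]; then
        echo "usage: restart_emr_service <service-name>" >&2
        return 2
    fi
    # A failing stop is not fatal: the service may already be stopped.
    $runner stop "$svc"
    sleep "$pause"
    $runner start "$svc" || return 1
    $runner status "$svc"
}
```

For example, RUNNER=echo restart_emr_service hive-metastore prints the three commands without touching the service, and restart_emr_service hive-metastore runs them for real.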

Source : https://aws.amazon.com/premiumsupport/knowledge-center/restart-service-emr/

Upvotes: 15

apeletz

Reputation: 81

On EMR 5.x I have found this to work:

hive --service metastore --stop

hive --service metastore --start

Upvotes: 2

user3294904

Reputation: 454

 sudo stop hive-metastore
 sudo start hive-metastore
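After the start it can take a few seconds before the metastore accepts connections again. A rough sketch of a wait loop, assuming bash (for /dev/tcp) and the default metastore Thrift port 9083; wait_for_port is a made-up helper name, and the port should be whatever hive.metastore.uris says in your hive-site.xml.

```shell
# Poll until a TCP connection to host:port succeeds, or give up.
# $1 = host, $2 = port, $3 = number of one-second attempts (default 30).
# Relies on bash's /dev/tcp pseudo-device.
wait_for_port() {
    host="$1"
    port="$2"
    tries="${3:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

Usage on the master node would look like: wait_for_port localhost 9083 && echo "metastore is accepting connections"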

Upvotes: 3

Amal G Jose

Reputation: 2546

You don't have to restart the entire cluster. While launching the cluster, you can specify a hive-site.xml file with the details of RDS. If you are not following this option and are making the changes manually after launching the cluster, you still don't need to restart the entire cluster. Just restart the hive-metastore service alone. The Hive metastore runs on the master node only.
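Before restarting the service, it is worth confirming that hive-site.xml really points at RDS. A small sketch: javax.jdo.option.ConnectionURL is the standard Hive property for the metastore JDBC URL, while metastore_jdbc_url is a helper name made up here and the file path is an assumption that differs between EMR releases, so pass in the path you actually edited.

```shell
# Print the JDBC URL configured for the metastore in a hive-site.xml.
# $1 = path to hive-site.xml.
# Assumes the <value> element sits on the same line as, or within two lines
# after, the <name> element, as in a conventionally formatted file.
metastore_jdbc_url() {
    grep -A 2 '<name>javax.jdo.option.ConnectionURL</name>' "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
        | head -n 1
}
```

For example, metastore_jdbc_url /etc/hive/conf/hive-site.xml (path assumed) should print a jdbc:mysql:// URL containing your RDS endpoint if the manual change is in place.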

You can launch the cluster in several ways:

1) AWS console
2) Using the API (Java, Python, etc.)
3) Using the AWS CLI

You can keep the hive-site.xml in S3 and apply it as a bootstrap action while launching the cluster. The AWS API provides a way to specify a custom hive-site.xml from S3 rather than the one created by default.

If you are using Hive from the master machine alone, you don't have to make the changes on all the machines.

An example of specifying the hive-site.xml while launching EMR using the AWS CLI is given below:

aws emr create-cluster --name "Test cluster" --ami-version 3.3 --applications Name=Hue Name=Hive Name=Pig \
--use-default-roles --ec2-attributes KeyName=myKey \
--instance-type m3.xlarge --instance-count 3 \
--bootstrap-actions Name="Install Hive Site Configuration",Path="s3://elasticmapreduce/libs/hive/hive-script",\
Args=["--base-path","s3://elasticmapreduce/libs/hive","--install-hive-site","--hive-site=s3://mybucket/hive-site.xml","--hive-versions","latest"]

Upvotes: 0
