Hive job always runs in-process (local Hadoop)

Question

When I set this property in hive-site.xml:

  <property>
    <name>hive.exec.mode.local.auto</name>
    <value>false</value>
  </property>

Hive always runs the Hadoop job locally:

Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 55
Job running in-process (local Hadoop)

Why does this happen?

Vinkal · Accepted Answer

As mentioned in HIVE-2585, going forward Hive will assume that the metastore is operating in local mode if the configuration property hive.metastore.uris is unset, and will assume remote mode otherwise.

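Ensure the following properties are set in hive-site.xml:

    <property>
      <name>hive.metastore.uris</name>
      <value>:9083</value>
    </property>

    <property>
      <name>hive.metastore.local</name>
      <value>false</value>
    </property>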

The hive.metastore.local property is no longer supported as of Hive 0.10; setting hive.metastore.uris is sufficient to indicate that you are using a remote metastore.
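The rule quoted from HIVE-2585 can be sketched as a small check over the contents of a hive-site.xml file. This is only an illustration of the rule as stated above, not Hive's own code: the function name `metastore_mode` is made up for the example, and `metastore.example.com` is a placeholder host.

```python
import xml.etree.ElementTree as ET


def metastore_mode(hive_site_xml: str) -> str:
    """Return 'remote' if hive.metastore.uris has a non-empty value, else 'local'.

    Restates the HIVE-2585 rule; hypothetical helper, not part of Hive.
    """
    root = ET.fromstring(hive_site_xml.strip())
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name == "hive.metastore.uris" and value:
            return "remote"
    # Property missing or empty: Hive would assume a local metastore.
    return "local"


# Hypothetical configuration; metastore.example.com is a placeholder host.
REMOTE_SITE = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
</configuration>
"""

# A configuration that does not set hive.metastore.uris at all.
EMPTY_SITE = "<configuration></configuration>"

if __name__ == "__main__":
    print(metastore_mode(REMOTE_SITE))
    print(metastore_mode(EMPTY_SITE))
```

Under this rule, a configuration that sets only hive.metastore.local and leaves hive.metastore.uris unset would still be treated as local.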

EDIT:

Starting with release 0.7, Hive also supports a mode to run map-reduce jobs in local-mode automatically. The relevant options are hive.exec.mode.local.auto, hive.exec.mode.local.auto.inputbytes.max, and hive.exec.mode.local.auto.tasks.max:

hive> SET hive.exec.mode.local.auto=false;

Note that this feature is disabled by default. If enabled, Hive analyzes the size of each map-reduce job in a query and may run it locally if the following thresholds are satisfied:

1. The total input size of the job is lower than: hive.exec.mode.local.auto.inputbytes.max (128MB by default)

2. The total number of map-tasks is less than: hive.exec.mode.local.auto.tasks.max (4 by default)

3. The total number of reduce tasks required is 1 or 0.

So for queries over small data sets, or for queries with multiple map-reduce jobs where the input to subsequent jobs is substantially smaller (because of reduction/filtering in the prior job), jobs may be run locally.
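The three thresholds can be read as a single predicate. The sketch below is a simplified restatement of the documented conditions, not Hive's implementation: the function name `may_run_locally` and its parameters are invented for the example, and the defaults are the ones quoted in the list above.

```python
# Defaults as quoted in the list above.
DEFAULT_INPUT_BYTES_MAX = 128 * 1024 * 1024  # hive.exec.mode.local.auto.inputbytes.max
DEFAULT_TASKS_MAX = 4                        # hive.exec.mode.local.auto.tasks.max


def may_run_locally(auto_enabled, input_bytes, map_tasks, reduce_tasks,
                    input_bytes_max=DEFAULT_INPUT_BYTES_MAX,
                    tasks_max=DEFAULT_TASKS_MAX):
    """Hypothetical helper: True if a job meets all documented thresholds."""
    if not auto_enabled:
        # hive.exec.mode.local.auto=false: automatic local mode is never chosen.
        return False
    return (input_bytes < input_bytes_max   # condition 1: total input size
            and map_tasks < tasks_max       # condition 2: number of map tasks
            and reduce_tasks <= 1)          # condition 3: 0 or 1 reduce tasks


if __name__ == "__main__":
    # Small job with the feature enabled: qualifies for local execution.
    print(may_run_locally(True, 10 * 1024 * 1024, 2, 1))
    # A job with 55 reduce tasks fails condition 3 regardless of input size.
    print(may_run_locally(True, 10 * 1024 * 1024, 2, 55))
```

Going by these conditions alone, the job in the question (55 estimated reduce tasks) would not qualify for automatic local mode even if hive.exec.mode.local.auto were enabled.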

Reference: Hive Getting Started
