Marek Grzenkowicz
Marek Grzenkowicz

Reputation: 17333

Impala query stuck in status Executing

I have a query CREATE TABLE foobar AS SELECT ... that runs successfully in Hue (the returned status is Inserted 986571 row(s)) and takes a couple seconds to complete. However, in Cloudera Manager its status - after more than 10 minutes - still says Executing.

Is it a bug in Cloudera Manager or is this query actually still running?

Upvotes: 2

Views: 5290

Answers (1)

Matt
Matt

Reputation: 4334

When Hue executes a query, it leaves the query open so that users can page through results at their own pace. (Of course, this behavior isn't very useful for DDL statements.) That means that Impala still considers the query to be executing, even if it is not actively using CPU cycles (keep in mind it is still holding memory!). Hue will close the query if explicitly told to, or when the page/session is closed, e.g. using the hue command:

> build/env/bin/hue close_queries --help

Note that Impala has a query option to automatically 'timeout' queries after a period of time, see query_timeout_s. Hue sets this to 10 minutes by default, but you can override it in the hue.ini settings.

One thing to note is that when queries 'time out', they are cancelled but not closed, i.e. the query will remain "in flight" with a CANCELLED status. The reason for this is so that users (or tools) can continue to observe the query metadata (e.g. query profile, status, etc.), which would not be available if the query is fully closed and thus deregistered from the impalad. Unfortunately these cancelled queries may still hold some non-negligible resources, but this will be fixed with IMPALA-1575.

More information: Hive and Impala queries life cycle

Upvotes: 5

Related Questions