Pash0002
Pash0002

Reputation: 110

Can we extract the queries which was ran on hive through metadata

Basically, I want metadata of queries that ran on the hive in one day. I looked into the metadata which is provided by hive in MySql. But not able to find any table which stores query related information.

Upvotes: 0

Views: 680

Answers (1)

Pash0002
Pash0002

Reputation: 110

After doing some research found that we can extract the MapReduce jobs using the History Server REST API for Hadoop.

And From that you'll the the Job related information.

To get the query you need request for particular job's conf

<history_server_http_address:port>/ws/v1/history/mapreduce/jobs/<JOB_ID>/conf

From this you'll get all configs. For query you need look hive.query.string

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html#History_Server_REST_APIs.

We can also extract the job's JSON and configuration of that job in XML from the hdfs location. For this, you need the value of the mapreduce.jobhistory.done-dir property.

Then you fire hdfs get command to get the data.

hdfs dfs -get <resource-manager-path>/<year-dir>/<month-dir>/<day-dir>  <destination-local-dir>

Upvotes: 1

Related Questions