How to start Spark (with Thrift server) in non-blocking mode so that Hive can update and reload data into Spark (table locking)

Question

We have problems with table locking. We need simultaneous access to the same tables from Hive and Spark (with Thrift server). However, running Spark with the Thrift server results in the tables being locked.

We're running on an Amazon AWS EMR cluster with Hive, Spark, and Thrift Server 2.

We'd like to use Hive to update data in S3 storage and periodically load this aggregated data into Spark in the background. Meanwhile, Spark is always on with the Thrift server running and has the same data loaded from S3, so that it can do real-time aggregations on it. Spark does not need write access to this data.
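To make the reload step concrete, here is a minimal sketch of what we mean, assuming the reload is triggered by sending Spark SQL `REFRESH TABLE` statements to the Thrift server after each Hive update. The helper name `refresh_statements` and the table names are hypothetical, and how the statements are submitted (JDBC, beeline, etc.) is left out.

```python
def refresh_statements(tables):
    """Build one Spark SQL `REFRESH TABLE` statement per table name.

    `REFRESH TABLE` asks Spark to invalidate its cached data and metadata
    for the table, so the next query re-reads the files Hive wrote to S3.
    """
    return [f"REFRESH TABLE {name}" for name in tables]


if __name__ == "__main__":
    # Hypothetical table names; replace them with the real ones.
    for statement in refresh_statements(["default.sales_agg", "default.events_agg"]):
        print(statement)
```

The generated statements would be run against the Thrift server once the Hive job has finished writing.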

The problem is that running the periodic data-loading tasks in Hive causes the job to freeze.

We think the metastore may be locked by Spark / the Thrift server, which blocks Hive from updating the data and reloading it into Spark (but we are not sure about this).
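One way we could check this guess is to ask Hive which locks it currently sees on the affected tables. The sketch below only builds Hive `SHOW LOCKS` statements; the helper name `show_locks_statements` and the table names are hypothetical. As far as we understand, `SHOW LOCKS` only reports anything when Hive's concurrency support (`hive.support.concurrency`) is enabled, and we have not confirmed that this is the case on our cluster.

```python
def show_locks_statements(tables, extended=False):
    """Build Hive `SHOW LOCKS` statements for the given table names.

    With extended=True the `EXTENDED` keyword is appended, which makes
    Hive print additional details about each lock.
    """
    suffix = " EXTENDED" if extended else ""
    return [f"SHOW LOCKS {name}{suffix};" for name in tables]


if __name__ == "__main__":
    # Hypothetical table names; replace them with the real ones.
    for statement in show_locks_statements(["sales_agg", "events_agg"], extended=True):
        print(statement)
```

Running the printed statements in the Hive CLI or beeline while a load job is frozen should show whether a lock is being held on those tables.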

Is it possible to start Spark and the Thrift server in a read-only, non-blocking mode?

What may cause the problem? Has anyone experienced similar problems?

