Reputation: 9522
We are having problems with table locking. We need simultaneous access to tables from Hive and Spark (with Thrift Server). However, running Spark with Thrift Server results in table locking.
We're running on an Amazon AWS EMR cluster with Hive, Spark, and Thrift Server 2.
We'd like to use Hive to update data stored in S3 and periodically load the aggregated data into Spark in the background. Meanwhile, Spark is always on with Thrift Server running and has the same data loaded from S3, so it can do real-time aggregations on it. Spark does not need write access to this data.
The problem is that the periodic data-loading tasks on Hive freeze when they run.
We think the metastore may be locked by Spark / Thrift Server, which blocks Hive from updating the data and reloading it into Spark (but we are not sure about this).
Is it possible to start Spark and Thrift Server in a read-only, non-blocking mode?
What might be causing the problem? Has anyone experienced something similar?
Upvotes: 0
Views: 776
Reputation: 5223
How is your metastore configured? Does it use Derby? With the default Hive configuration the metastore is an embedded Derby database, which does not support multiple concurrent users. If that is your setup, you should switch the metastore to a database such as MySQL, which does support multiple users.
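If it helps, here is a rough Python sketch for checking which database your metastore points at. It reads the `javax.jdo.option.ConnectionURL` property from the text of a `hive-site.xml` file and reports whether it looks like embedded Derby. The function names and the sample XML are made up for illustration, and where `hive-site.xml` lives on your cluster is something you would need to confirm yourself.

```python
import xml.etree.ElementTree as ET

# Standard Hive property holding the metastore JDBC URL.
CONNECTION_URL_KEY = "javax.jdo.option.ConnectionURL"


def metastore_connection_url(hive_site_xml):
    """Return the metastore JDBC URL from hive-site.xml text, or None if unset."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == CONNECTION_URL_KEY:
            return prop.findtext("value", default="").strip()
    return None


def uses_embedded_derby(hive_site_xml):
    """Heuristic (hypothetical helper): True if the metastore looks like embedded Derby."""
    url = metastore_connection_url(hive_site_xml)
    if url is None:
        # Property not set: Hive falls back to its built-in default (embedded Derby).
        return True
    # "jdbc:derby://host:port/..." is a Derby network server, not the embedded mode.
    return url.startswith("jdbc:derby:") and not url.startswith("jdbc:derby://")


# Illustrative sample only, not taken from a real cluster.
SAMPLE = """
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
</configuration>
"""

print(uses_embedded_derby(SAMPLE))
```

To run it against a real cluster, read the contents of your own `hive-site.xml` into a string and pass that in instead of `SAMPLE`. If the URL starts with `jdbc:mysql:` or similar, Derby is probably not the cause and the freeze has to come from somewhere else.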
Upvotes: 0