Reputation: 13753
I am using hive with drill.
Storage Plugin info:
{
"type": "hive",
"enabled": true,
"configProps": {
"hive.metastore.uris": "",
"javax.jdo.option.ConnectionURL": "jdbc:mysql://localhost:3306/metastore_hive",
"javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
"javax.jdo.option.ConnectionUserName": "root",
"javax.jdo.option.ConnectionPassword": "root",
"hive.metastore.warehouse.dir": "/user/hive/warehouse",
"fs.default.name": "file:///",
"hive.metastore.sasl.enabled": "false"
}
}
It's working fine for querying and all.
Then I wanted to check whether it automatically discover newly created tables in hive or not.
I started drill in embedded mode and used a particular database in hive using
use hive.testDB;
Here testDB
is a database in Hive with tables t1
& t2
. Then I queried:
show tables;
It gave me table names
t1
t2
I created a table t3
in hive and again fired show tables;
in Drill.
It's still showing t1 t2
. After 5-10 min I fired show tables;
again and it's showing t1 t2 t3
.
I think it should show t3
immediately after adding t3
in hive.
What can be reason for this behavior and how drill is handling it internally?
Upvotes: 0
Views: 149
Reputation: 13753
I asked the same question on Drill's community & found it's solution:
Drill uses cache in its HiveMetaStoreClient, in order to reduce the overhead to access HiveMetaStore. By default, the cache TTL is 60 seconds.
There is a resolved issue on this. This issue is incorporated in Drill 1.5. Cache TTL can be made shorter by configuring hive storage plugin.
If you want to make the cache TTL shorter, you could configure hive storage plugin.
Sample hive plugin (with cache ttl 5 seconds):
"configProps": {
"hive.metastore.uris": "thrift://hive.ip.address.com:9083",
... // other setting,
"hive.metastore.cache-ttl-seconds": "5",
"hive.metastore.cache-expire-after": "access"
}
Check more details here.
Upvotes: 1