Romee Zhou

Reputation: 41

Interact with HBase running in Docker through Java code

I am quite new to HBase and Java. I managed to run an HBase image in Docker, and I can interact with it smoothly using the HBase shell. I also have access to the UI for monitoring HBase. However, when I tried to write some Java code to interact with this HBase instance running in Docker, I ran into a problem.

My Java code is as follows:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {
    public static void main(String[] args) {
        // HBase configuration
        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "localhost");
        config.set("hbase.zookeeper.property.clientPort", "2181");

        // try-with-resources closes the table and connection even if a call throws
        try (Connection connection = ConnectionFactory.createConnection(config);
             Table table = connection.getTable(TableName.valueOf("my_table"))) {

            // Retrieve data for a specific row
            Get get = new Get(Bytes.toBytes("row1"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("my_cf1"), Bytes.toBytes("col1"));

            System.out.println("Value: " + Bytes.toString(value));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

When I run this program, it doesn't print the desired value from my HBase table. The terminal output is:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.net.SocketTimeoutException: callTimeout=60000, callDuration=68630: can not resolve c47da4f0ee21,16020,1730323430132 row 'my_table,row1,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=c47da4f0ee21,16020,1730323430132, seqNum=-1
        at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:159)
        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.UnknownHostException: can not resolve c47da4f0ee21,16020,1730323430132
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.createAddr(AbstractRpcClient.java:430)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.createBlockingRpcChannel(AbstractRpcClient.java:507)
        at org.apache.hadoop.hbase.client.ConnectionImplementation.lambda$getClient$2(ConnectionImplementation.java:1214)
        at org.apache.hadoop.hbase.util.CollectionUtils.computeIfAbsentEx(CollectionUtils.java:61)
        at org.apache.hadoop.hbase.client.ConnectionImplementation.getClient(ConnectionImplementation.java:1212)
        at org.apache.hadoop.hbase.client.ReversedScannerCallable.prepare(ReversedScannerCallable.java:111)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.prepare(ScannerCallableWithReplicas.java:399)
        at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
        ... 4 more

On the Docker container side, the log shows something like:

hbase-1  | ==> /hbase/logs/zookeeper.log <==
hbase-1  | 2024-10-30 21:00:20,378 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.65.1:52662
hbase-1  | 2024-10-30 21:00:20,383 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.ZooKeeperServer: Client attempting to establish new session at /192.168.65.1:52662
hbase-1  | 2024-10-30 21:00:20,392 INFO  [SyncThread:0] server.ZooKeeperServer: Established session 0x192df1587a8000a with negotiated timeout 90000 for client /192.168.65.1:52662
hbase-1  | 
hbase-1  | ==> /hbase/logs/SecurityAuth.audit <==
hbase-1  | 2024-10-30 21:01:04,285 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 172.18.0.2:54178, version=2.1.3, sasl=false, ugi=root (auth:SIMPLE), service=AdminService

I checked that the container network is in "bridge" mode, and I can successfully telnet to "localhost 2181".

Upvotes: 0

Views: 49

Answers (1)

Jason H.

Reputation: 1

The unresolvable hostname c47da4f0ee21 looks like the container's default hostname (Docker uses the short container ID unless you set one explicitly). I suspect that ZooKeeper is returning c47da4f0ee21 as the region server address, and your Java client, which runs on the host and outside the Docker bridge network, cannot resolve that name. One way to resolve this is to:

  1. Create a Docker network; call it hbasenet.
  2. Attach the HBase container to hbasenet when starting it.
  3. Package the Java code into an image. Once HBase is healthy, start a container from that image, attached to hbasenet.
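
Before changing the setup, it may help to confirm the diagnosis from wherever the client runs. The following is only a minimal sketch: HostResolutionCheck and canResolve are hypothetical names, not part of HBase, and the default hostname is simply the one from the stack trace in the question.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not part of HBase): checks whether this JVM can resolve a hostname.
class HostResolutionCheck {

    // Returns true if the name resolves to an IP address from where this code runs.
    static boolean canResolve(String hostname) {
        try {
            InetAddress.getByName(hostname);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the hostname from the stack trace; pass another name as the first argument.
        String host = args.length > 0 ? args[0] : "c47da4f0ee21";
        System.out.println(host + " resolvable: " + canResolve(host));
    }
}
```

If this reports false on the host machine but true inside a container attached to hbasenet, that would be consistent with the explanation above.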

To further analyze the problem, I'd suggest looking through the HBase read flow. Briefly, it involves the following steps:

Client Initialization:

  • The client sets up the configuration and establishes a connection with ZooKeeper.

Locating the Data:

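  • The client uses ZooKeeper and the hbase:meta table to find the RegionServer responsible for the target data. In the stack trace from the question, this is where the call fails: the client cannot resolve the host that serves hbase:meta.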

Request Processing:

  • The client sends the Get request to the RegionServer.
  • The RegionServer processes the request by checking caches and possibly reading from HDFS.

Data Retrieval:

  • The RegionServer returns the requested data to the client.
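
To connect the flow above to the error message: the string c47da4f0ee21,16020,1730323430132 in the stack trace follows HBase's server name form of hostname, port, and start code. The sketch below splits it to show which address the client tries to dial. ParsedServerName is a hypothetical class written for illustration; HBase has its own ServerName type, which is not used here.

```java
// Hypothetical value class for illustration only; HBase ships its own ServerName type.
final class ParsedServerName {
    final String host;
    final int port;
    final long startCode;

    private ParsedServerName(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    // Splits "hostname,port,startcode" into its three parts.
    static ParsedServerName parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected hostname,port,startcode but got: " + serverName);
        }
        return new ParsedServerName(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    public static void main(String[] args) {
        // Server name copied from the stack trace in the question.
        ParsedServerName sn = parse("c47da4f0ee21,16020,1730323430132");
        // Prints: Client will dial c47da4f0ee21:16020
        System.out.println("Client will dial " + sn.host + ":" + sn.port);
    }
}
```

So even though ZooKeeper is reachable on localhost:2181, the next hop goes to c47da4f0ee21 on port 16020, and that name has to be resolvable from the client.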

Upvotes: 0
