Reputation: 18329
I have access to an edge node of a MapR Hadoop cluster. I have an HBase table named /app/SubscriptionBillingPlatform/Matthew containing some fake data. Scanning it in the hbase shell results in this:
I have a very simple Talend Job that should scan the table and log each row:
Here is the configuration for the tHBaseConnection. I obtained the zookeeper quorum and client port from the /opt/mapr/hbase/hbase-0.94.13/conf/hbase-site.xml file:
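(For reference, the quorum and client port can be grepped straight out of that file. This is a rough, hypothetical helper, not part of the Talend job; it assumes each `<name>`/`<value>` pair sits on adjacent lines, as in the stock hbase-site.xml layout:)

```shell
#!/bin/sh
# Rough sketch: extract a property value from an hbase-site.xml-style file.
# Assumes <name> and <value> appear on adjacent lines (stock layout);
# a real XML parser would be more robust.
get_prop() {
  # $1 = path to xml file, $2 = property name
  grep -A1 "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# On the edge node this would look like:
# get_prop /opt/mapr/hbase/hbase-0.94.13/conf/hbase-site.xml hbase.zookeeper.quorum
# get_prop /opt/mapr/hbase/hbase-0.94.13/conf/hbase-site.xml hbase.zookeeper.property.clientPort
```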
And here is the configuration for the tHBaseInput:
However, when I build/export the job, SCP the resulting jar file to the edge node, and run it there, I get the following error:
14/08/06 15:51:26 INFO mapr.TableMappingRulesFactory: Could not find MapRTableMappingRules class, assuming HBase only cluster.
14/08/06 15:51:26 INFO mapr.TableMappingRulesFactory: If you are trying to access M7 tables, add mapr-hbase jar to your classpath.
14/08/06 15:51:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/08/06 15:51:26 INFO security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
...
Exception in component tHBaseInput_1
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for /app/SubscriptionBillingPlatform/Matthew,,99999999999999 after 10 tries.
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:896)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:998)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:900)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:857)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:257)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:187)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:142)
at poc2.testhbaseoperations_0_1.TestHBaseOperations.tHBaseInput_1Process(TestHBaseOperations.java:752)
at poc2.testhbaseoperations_0_1.TestHBaseOperations.tHBaseConnection_1Process(TestHBaseOperations.java:375)
at poc2.testhbaseoperations_0_1.TestHBaseOperations.runJobInTOS(TestHBaseOperations.java:1104)
at poc2.testhbaseoperations_0_1.TestHBaseOperations.main(TestHBaseOperations.java:993)
When I told the sys admins (who don't know what Talend is) about this, they said that MapR doesn't use HRegionServers the way Cloudera does, and figured my Talend configuration was wrong.
Any ideas?
Upvotes: 3
Views: 2061
Reputation: 18329
The kicker was these two lines:
INFO mapr.TableMappingRulesFactory: Could not find MapRTableMappingRules class, assuming HBase only cluster.
INFO mapr.TableMappingRulesFactory: If you are trying to access M7 tables, add mapr-hbase jar to your classpath.
If the job doesn't have the mapr-hbase jar on its classpath, it tries to connect to a plain HBase cluster instead of MapR-DB, which is why it can never find the table's region.
You can either add the mapr-hbase jar from /opt/mapr/lib to the classpath in the job's launch script, or simply add all the jars from that directory to the classpath:
#!/bin/sh
cd `dirname $0`
ROOT_PATH=`pwd`
java -Xms256M -Xmx1024M -cp "/opt/mapr/lib/*:$ROOT_PATH/.." poc2.testhbaseoperations_0_1.TestHBaseOperations
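If you're not sure the jar is actually installed, a quick pre-flight check helps. This is a hypothetical sketch (the /opt/mapr/lib path is from above; the helper name is made up):

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed if at least one mapr-hbase*.jar
# exists in the given lib directory, fail otherwise.
has_mapr_hbase() {
  for jar in "$1"/mapr-hbase*.jar; do
    # If the glob matched nothing, $jar is the literal pattern and -f fails.
    [ -f "$jar" ] && return 0
  done
  return 1
}

if has_mapr_hbase /opt/mapr/lib; then
  echo "mapr-hbase jar found; MapR-DB table paths should resolve"
else
  echo "WARNING: no mapr-hbase jar in /opt/mapr/lib" >&2
fi
```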
Upvotes: 1
Reputation: 56997
I had a quick go at reproducing this in the Talend Big Data Sandbox but couldn't get your error, I'm afraid.
Plugging the error message (with some variations) into Google suggests this is a semi-common error outside of Talend, so I'd guess that as long as you are properly loading the necessary libraries and drivers, and they are being included in your exported job, it's a configuration issue somewhere on your Hadoop cluster. It also seems to happen on non-MapR distros.
This issue on the Cloudera community boards seems to have a satisfactory resolution, where a misconfigured Oozie was returning the same error as yours. It might be worth trying to add:
<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
</property>
to Oozie service -> Configuration -> Oozie Server (default) -> Advanced -> Oozie Server Configuration Safety Valve for oozie-site.xml, and then restarting the Hive and Oozie services.
Of course, that might be complicated by how your Hadoop cluster is administered, and by whether you have a development cluster or local instance to run against that suffers from the same issue.
I'd strongly recommend installing the aforementioned Talend Big Data Sandbox, or at least the MapR sandbox, if you only have a production or production-like Hadoop cluster to deploy to.
Upvotes: 0