Reputation: 103
Edit: I added the hbase dependencies defined in the top level pom file to the project level pom and now it can find the package.
I have a scala object to read data from an HBase (0.98.4-hadoop2) table within Spark (1.0.1). However, compiling with maven results in an error when I try to import org.apache.hadoop.hbase.mapreduce.TableInputFormat.
error: object mapreduce is not a member of package org.apache.hadoop.hbase
The code and relevant pom are below:
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext
import java.util.Properties
import java.io.FileInputStream
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
object readDataFromHbase {
def main(args: Array[String]): Unit = {
var propFileName = "hbaseConfig.properties"
if(args.size > 0){
propFileName = args(0)
}
/** Load properties **/
val prop = new Properties
val inStream = new FileInputStream(propFileName)
prop.load(inStream)
//set spark context and open input file
val sparkMaster = prop.getProperty("hbase.spark.master")
val sparkJobName = prop.getProperty("hbase.spark.job.name")
val sc = new SparkContext(sparkMaster,sparkJobName )
//set hbase connection
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set("hbase.rootdir", prop.getProperty("hbase.rootdir"))
hbaseConf.set(TableInputFormat.INPUT_TABLE, prop.getProperty("hbase.table.name"))
val hBaseRDD = sc.newAPIHadoopRDD(hbaseConf, classOf[org.apache.hadoop.hbase.mapreduce.TableInputFormat],
classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
classOf[org.apache.hadoop.hbase.client.Result]
)
val hBaseData = hBaseRDD.map(t=>t._2)
.map(res =>res.getColumnLatestCell("cf".getBytes(), "col".getBytes()))
.map(c=>c.getValueArray())
.map(a=> new String(a, "utf8"))
hBaseData.foreach(println)
}
}
The Hbase part of the pom file is (hbase.version = 0.98.4-hadoop2):
<!-- HBase -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop2-compat</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop-compat</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop-compat</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-protocol</artifactId>
<version>${hbase.version}</version>
</dependency>
I have cleaned the package with no luck. The main thing I need from the import is the classOf(TableInputFormat) to be used in setting the RDD. I suspect that I'm missing a dependency in my pom file but can't figure out which one. Any help would be greatly appreciated.
Upvotes: 2
Views: 5041
Reputation: 891
TableInputFormat
is in the org.apache.hadoop.hbase.mapreduce
packacge, which is part of the hbase-server
artifact, so you will need to add that as a dependency, like @xgdgsc commented:
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>${hbase.version}</version>
</dependency>
Upvotes: 1
Reputation: 1
in spark 1.0 and above:
put all your hbase jar into spark/assembly/lib or spark/core/lib directory. Hopefully youhave docker to automate all this.
a)For CDH version, the relate hbase jar is usually under /usr/lib/hbase/*.jar which are symlink to correct jar.
b) good article to read from http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
Upvotes: 0