Reputation: 322
I imported a table into HDFS with Sqoop, using --fields-terminated-by '|':
sqoop import \
--connect jdbc:mysql://connection \
--username \
--password \
--table products \
--as-textfile \
--target-dir /user/username/productsdemo \
--fields-terminated-by '|'
After that, I tried to read it as an RDD using spark-shell (Spark 1.6.2):
var productsRDD = sc.textFile("/user/username/productsdemo")
and then convert it into a DataFrame:
var productsDF = productsRDD.map(product =>{
var o = product.split("|");
products(o(0).toInt,o(1).toInt,o(2),o(3),o(4).toFloat,o(5))
}).toDF("product_id", "product_category_id","product_name","product_description","product_price","product_image" )
But when I try to print the output, it throws the exception below:
java.lang.NumberFormatException: For input string: "|"
Why am I getting this error? Can anyone help me fix it?
Upvotes: 0
Views: 532
Reputation: 13506
split uses a regex to split the string. Since | is a special character in regex (it means OR), you need to escape it and use \\| instead of | when splitting:
var o = product.split("\\|");
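To see the difference outside of Spark, here is a minimal sketch you can paste into a Scala REPL. The sample row and the `SplitDemo` object name are made up for illustration (they are not from the question's data); only `String.split`, `Pattern.quote`, and Scala's `split(Char)` overload are real API. It also shows two alternatives to escaping by hand.

```scala
import java.util.regex.Pattern

object SplitDemo {
  // Hypothetical row in the same pipe-delimited layout as the question's table
  val line = "1|2|Quest Q64|Sample description|59.98|http://example.com/img.png"

  // Unescaped: "|" is read as a regex alternation of two empty patterns,
  // so it matches at every position and the row is cut into single
  // characters, with the '|' characters themselves ending up as elements.
  val wrong: Array[String] = line.split("|")

  // Escaped: the backslash makes the regex match a literal pipe.
  val escaped: Array[String] = line.split("\\|")

  // Alternative 1: let Pattern.quote build the literal pattern.
  val quoted: Array[String] = line.split(Pattern.quote("|"))

  // Alternative 2: Scala's split(Char) overload treats the separator literally.
  val byChar: Array[String] = line.split('|')

  def main(args: Array[String]): Unit = {
    println(s"unescaped split produced ${wrong.length} pieces")
    println(s"escaped split produced ${escaped.length} fields: ${escaped.mkString(", ")}")
  }
}
```

With the unescaped version, one of the first array elements is the "|" character itself, which is what `toInt` fails on in the question.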
Upvotes: 2