Reputation: 843
I have a hive table column (Json_String String) it has some 1000 rows, Where each row is a Json of same structure. I am trying read the json in to Dataframe as below
val df = sqlContext.read.json("select Json_String from json_table")
but it is throwing up the below exception
java.io.IOException: No input paths specified in job
is there any way to read all the rows in to dataframe as we do with Json files using wild card
val df = sqlContext.read.json("file:///home/*.json")
Upvotes: 0
Views: 2241
Reputation: 74779
I think what you're asking for is to read the Hive table as usual and transform the JSON column using from_json function.
from_json(e: Column, schema: StructType): Column Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
Given you use sqlContext
in your code, I'm afraid that you use Spark < 2.1.0 which then does not offer from_json
(which was added in 2.1.0).
The solution then is to use a custom user-defined function (UDF) to do the parsing yourself.
val df = sqlContext.read.json("select Json_String from json_table")
The above won't work since json operator expects a path or paths to JSON files on disk (not as a result of executing a query against a Hive table).
json(paths: String*): DataFrame Loads a JSON file (JSON Lines text format or newline-delimited JSON) and returns the result as a DataFrame.
Upvotes: 1