Sebastian Loeb Sucre

Reputation: 71

Spark HiveContext: issues working with queries

I'm trying to extract information from JSON files to create tables in Hive.

This is my JSON schema:

root
|-- info: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- stations: array (nullable = true)
|    |    |    |-- element: struct (containsNull = true)
|    |    |    |    |-- bikes: string (nullable = true)
|    |    |    |    |-- id: string (nullable = true)
|    |    |    |    |-- slots: string (nullable = true)
|    |    |    |    |-- streetName: string (nullable = true)
|    |    |    |    |-- type: string (nullable = true)
|    |    |-- updateTime: long (nullable = true)
|-- date: string (nullable = true)
|-- numRecords: string (nullable = true)

I'm using this query:

sqlContext.sql("SELECT info.updateTime FROM STATIONS").foreach(println)

This is what I get:

[WrappedArray(1449098169, 1449108553, 1449098468)]

But I don't know how to put this information into a table so that I can use it later from the Hive console.

I used this:

query.write.save("/home/cloudera/Desktop/select")

And it creates something, but I don't know how to use it.

Thanks

Upvotes: 0

Views: 197

Answers (1)

Roberto Congiu

Reputation: 5213

You can do this in several ways; which one fits best depends on your setup. Here are two.

First way: create the table directly in the query (CREATE TABLE ... AS SELECT)

sqlContext.sql("create table mytable AS SELECT info.updateTime FROM STATIONS")
// now you can query mytable

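Second way: write the DataFrame with write.saveAsTable()

// DataFrame.saveAsTable() is deprecated since Spark 1.4; go through the DataFrameWriter instead
sqlContext.sql("SELECT info.updateTime FROM STATIONS").write.saveAsTable("othertable")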
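For context, here is a minimal end-to-end sketch of the second way as a standalone application. It assumes Spark 1.4 to 1.6 built with Hive support and launched through spark-submit; the input path /path/to/stations.json and the object name SaveUpdateTimes are hypothetical placeholders, not something taken from the question. In spark-shell, sc and sqlContext already exist, so only the middle part would be needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SaveUpdateTimes {
  // The query from the question, kept in one place
  val UpdateTimeQuery = "SELECT info.updateTime FROM STATIONS"

  def main(args: Array[String]): Unit = {
    // Assumption: the master URL is supplied by spark-submit
    val sc = new SparkContext(new SparkConf().setAppName("SaveUpdateTimes"))
    // A HiveContext is needed so that tables end up in the Hive metastore
    val sqlContext = new HiveContext(sc)

    // Hypothetical input path; replace with the real JSON location
    val stations = sqlContext.read.json("/path/to/stations.json")
    // Expose the DataFrame to SQL under the name used in the question
    stations.registerTempTable("STATIONS")

    val query = sqlContext.sql(UpdateTimeQuery)
    // Persist the result as a metastore table; by default this fails
    // if a table named "othertable" already exists
    query.write.saveAsTable("othertable")

    // Read the table back to check that it was registered
    sqlContext.table("othertable").show()

    sc.stop()
  }
}
```

As a side note, the query.write.save(...) call from the question writes plain files (Parquet by default) without registering anything in the metastore, which is probably why nothing shows up in the Hive console afterwards.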

Upvotes: 1
