thenakulchawla

Reputation: 5254

Using pyspark to connect to hive tables

I am trying to query a Hive table from pyspark.

I am using the below statements:

from pyspark.sql import HiveContext    
HiveContext(sc).sql('from `dbname.tableName` select `*`')

I am very new to Hadoop systems. What is the correct way to load data from a Hive table into a DataFrame so that I can process it further in my program?

Answers (1)

Sagar Shah

Reputation: 118

sqlCtx.sql can query Hive tables when sqlCtx is a HiveContext. You can use it as follows:

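from pyspark.sql import HiveContext
sqlCtx = HiveContext(sc)  # sc is the SparkContext, as in the question
my_dataframe = sqlCtx.sql("SELECT * FROM employees")  # returns a DataFrame
my_dataframe.show()  # prints the first rows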
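
For the database-qualified table in the question, here is a minimal sketch of one way to build the query. Quoting the whole of dbname.tableName in a single pair of backticks can make the parser read it as one identifier instead of a database plus a table, so this sketch quotes each part separately and leaves the * unquoted. The helper names (quote_identifier, select_all_query, load_hive_table) are hypothetical and not part of pyspark. The last function assumes a Spark build with Hive support and an existing SparkContext sc.

```python
def quote_identifier(name):
    """Wrap one identifier in backticks; reject names that contain a backtick."""
    if "`" in name:
        raise ValueError("identifier must not contain a backtick: %r" % name)
    return "`" + name + "`"


def select_all_query(database, table):
    """Build a SELECT * query with the database and table quoted separately."""
    return "SELECT * FROM {}.{}".format(
        quote_identifier(database), quote_identifier(table)
    )


def load_hive_table(sc, database, table):
    """Return a DataFrame for the table (assumes Spark was built with Hive support)."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import HiveContext

    return HiveContext(sc).sql(select_all_query(database, table))


# Names taken from the question; printing the query needs no cluster.
query = select_all_query("dbname", "tableName")
print(query)
```

In the pyspark shell this would be called as load_hive_table(sc, "dbname", "tableName"), and the result can be inspected with .show() as above.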
