PicxyB

Reputation: 847

Reading HIVE table in PySpark: Mismatched input '-' expecting EOF

I am reading some HIVE tables using a spark session:

from pyspark.sql import SparkSession

spark = (
    SparkSession
    .builder
    .appName("Test")
    .enableHiveSupport()
    .getOrCreate()
)

def read(table_path):
    return spark.read.table(table_path)

read("aaaa.bbbb")
read("aaaa.cccc")
read("dddd.eeee")

Most of the time I have no issues, but sometimes I get this error:

Mismatched input '-' expecting <EOF>

Do you know if there is an option to avoid this error? Also, can you point me to the relevant documentation? I searched but found nothing.

Thank you:)

Upvotes: 0

Views: 164

Answers (1)

Gaurang Shah

Reputation: 12960

When you want to read a table, you need to provide the table name, not a table path. You provide a path only when you are reading files directly, so I'm not sure exactly what your requirement is.

However, there are two ways you can read the table:

from os.path import abspath
from pyspark.sql import SparkSession
from pyspark.sql import Row

# warehouse_location points to the default location for managed databases and tables
warehouse_location = abspath('spark-warehouse')

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()

Method 1: Use a query

df = spark.sql("select * from database.table_name") 

Method 2: Use the table API

df = spark.table("database.table_name")

Documentation: https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
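One possible cause of the error, offered as an assumption because the failing table name is not shown in the question: if a database or table name contains a hyphen (for example `my-db.my-table`), the Spark SQL parser stops at the `-` and reports `mismatched input '-' expecting <EOF>`. Wrapping each part of the name in backticks usually avoids this. The sketch below uses two hypothetical helpers, `quote_identifier` and `read_table`, and assumes the names are passed in unquoted and contain no dots of their own.

```python
def quote_identifier(name: str) -> str:
    """Wrap each dot-separated part of a table name in backticks.

    Assumption: `name` is an unquoted identifier such as "db.table",
    and no individual part contains a dot.
    """
    parts = name.split(".")
    # A literal backtick inside a quoted identifier is escaped by doubling it.
    return ".".join("`" + part.replace("`", "``") + "`" for part in parts)


def read_table(spark, name: str):
    """Read a Hive table through an existing SparkSession using a quoted name."""
    return spark.read.table(quote_identifier(name))
```

With the session from the question, `read_table(spark, "my-db.my-table")` would call `spark.read.table("`my-db`.`my-table`")`. Names without special characters, such as `aaaa.bbbb`, are unaffected by the extra quoting.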

Upvotes: 0
