Samuel

Reputation: 225

'SparkSession' object has no attribute 'textFile'

I am currently using SparkSession and was told that SparkContext is available within SparkSession. However, when I run my code, I get an error suggesting that SparkContext's functionality does not exist in SparkSession.

Below is the code I have written:

import findspark
findspark.init()
from pyspark.sql import SparkSession, Row
import collections

spark = SparkSession.builder.config("spark.sql.warehouse.dir", "file://C:/temp").appName("SparkSQL").getOrCreate()

lines = spark.textFile('C:/Users/file.xslx')

The error is as follows:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_59944/722806425.py in <module>
----> 1 lines = spark.textFile('C:/Users/samue/bt4221_spark/exercise/week5/customer-orders.xslx')

AttributeError: 'SparkSession' object has no attribute 'textFile'

My current versions are findspark 1.4.2 and pyspark 3.0.3.

I don't think it's related to a version issue. Any help is greatly appreciated! :)

Upvotes: 2

Views: 3390

Answers (1)

Mohana B C

Reputation: 5487

textFile is a method of the SparkContext class, not of SparkSession. You can reach the SparkContext through the session's sparkContext attribute:

spark.sparkContext.textFile('filepath')
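
To put that one-liner in context, here is a minimal sketch of two ways to read a text file starting from a SparkSession. The helper names (read_lines_as_rdd, read_lines_as_dataframe, main) and the path argument are hypothetical, chosen only for illustration; the Spark calls themselves (sparkContext.textFile and read.text) are the standard PySpark ones.

```python
def read_lines_as_rdd(spark, path):
    """Return an RDD of lines (hypothetical helper name)."""
    # textFile is defined on SparkContext; the session exposes it as .sparkContext.
    return spark.sparkContext.textFile(path)


def read_lines_as_dataframe(spark, path):
    """Return a DataFrame of lines (hypothetical helper name)."""
    # DataFrameReader.text gives a DataFrame with a single string column, "value".
    return spark.read.text(path)


def main(path):
    # Imported here so the helpers above do not require Spark at import time.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("SparkSQL").getOrCreate()
    lines = read_lines_as_rdd(spark, path)
    print(lines.take(5))  # show the first few lines
    spark.stop()
```

Calling main with the path to a plain-text file should print its first few lines. Note that textFile reads input line by line as text, so a spreadsheet would probably need to be exported to CSV first.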

Upvotes: 7
