Reputation: 1394
I googled this problem, yet found no direct answer related to spark-2.2.0-bin-hadoop2.7. I am trying to read a text file from a local directory, but I always get a TypeError saying that the name argument is missing. This is the code in a Jupyter notebook with Python 3:
from pyspark import SparkContext as sc
data = sc.textFile("/home/bigdata/test.txt")
When I run the cell, I get this error:
TypeError Traceback (most recent call last)
<ipython-input-7-2a326e5b8f8c> in <module>()
1 from pyspark import SparkContext as sc
----> 2 data = sc.textFile("/home/bigdata/test.txt")
TypeError: textFile() missing 1 required positional argument: 'name'
Your help is appreciated.
Upvotes: 6
Views: 25106
Reputation: 4603
from pyspark import SparkConf
from pyspark.context import SparkContext
sc = SparkContext.getOrCreate(SparkConf())
data = sc.textFile("my_file.txt")
Displaying some content:
['this is text file and sc is working fine']
Upvotes: 6
Reputation: 474161
You are calling the textFile()
instance method
def textFile(self, name, minPartitions=None, use_unicode=True):
as if it were a static method, which results in the string "/home/bigdata/test.txt"
being used as the self
value, leaving the name
argument unspecified; hence the error.
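The mechanism can be reproduced without Spark at all; a minimal sketch using a hypothetical Reader class (not part of PySpark) shows how calling an instance method through the class consumes the first positional argument as self:

```python
# Hypothetical class illustrating instance-method binding; not a Spark API.
class Reader:
    def text_file(self, name, min_partitions=None):
        return f"reading {name}"

# Correct: call on an instance, so `name` is bound as expected.
r = Reader()
print(r.text_file("/home/bigdata/test.txt"))  # reading /home/bigdata/test.txt

# Incorrect: call on the class itself; the path string is used as `self`,
# so `name` goes missing and Python raises the same kind of TypeError.
try:
    Reader.text_file("/home/bigdata/test.txt")
except TypeError as e:
    print(e)  # text_file() missing 1 required positional argument: 'name'
```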
Create an instance of the SparkContext
class:
from pyspark import SparkConf
from pyspark.context import SparkContext
sc = SparkContext.getOrCreate(SparkConf().setMaster("local[*]"))
data = sc.textFile("/home/bigdata/test.txt")
Upvotes: 14