Reputation: 747
I am learning PySpark and have a question that I think is a fundamental one, yet I am unable to crack it.
Let's assume I have the following code:
lettersDF = sqlContext.createDataFrame([('A',), ('B',), ('C',), ('D',), ('E', )], ['word'])
Now I want to print the third row of the column 'word':
print lettersDF.head(3)[2]
Row(word=u'C')
I just want to print 'C'. How do I do it? I do not want this "dict"-style output; I want a plain, "list"-like output.
Can someone please explain how head(), tail(), take() and first(), or similar "action" methods, work? I think I am missing something fundamental.
Upvotes: 0
Views: 358
Reputation: 800
Yes, head() returns Row objects (pyspark.sql.types.Row). You can convert one to a dict with asDict():
print lettersDF.head(3)[2].asDict()
{'word': u'C'}
Upvotes: 1