CARTman

Reputation: 747

Extract only the value (not the named value) of a field from any identified row of a dataframe

I am learning PySpark and have a question that I think is fundamental, yet I have been unable to solve it.

Let's assume I have the following code:

lettersDF = sqlContext.createDataFrame([('A',), ('B',), ('C',), ('D',), ('E', )], ['word'])

Now I want to print the 3rd row of the column 'word':

print lettersDF.head(3)[2] 
Row(word=u'C')

I just want to print 'C'. How do I do it? I do not want this "dict"-style output; I want "list"-like output instead.

Can someone please explain how head(), tail(), take(), first(), and similar "action" methods work? I think I am missing something fundamental.

Upvotes: 0

Views: 358

Answers (1)

Alexis Benichoux

Reputation: 800

Yes, it comes back as a Row object (pyspark.sql.types.Row), which you can convert to a dict:

print lettersDF.head(3)[2].asDict()
{'word': u'C'}

Upvotes: 1
