Get specific row by using SparkR

Question

I have a dataset "data" in SparkR of type DataFrame. I want to get entry number 50 for example. In R I simply type data[50,] but when I do this in sparkR I get this message

"Error: object of type 'S4' is not subsettable"

What can I do to solve this ?

Furthermore: How can I add a column (of the same column-size) to the data?

Wannes Rosiers · Accepted Answer

The only thing you can do is

all50 <- take(data,50)
row50 <- tail(all50,1)

SparkR has no row.names, hence you can not subset on an index. This approach works, but you do not want to use it on big datasets.

Also the second part of your question is not possible yet. You can only add columns based on numbers (e.g. a constant column) or by making transformations of columns that belong to your DataFrame. This was actually already asked in How to do bind two dataframe columns in sparkR?.

Get specific row by using SparkR

Answers (2)

Related Questions