Reputation: 345
I know that it is possible to convert a dataframe column into a list using something like:
dataFrame.select("ColumnName").rdd.map(r => r(0)).collect()
Let's say I already know the schema of the dataframe and have created a matching case class such as:
case class Synonym(URI: String, similarity: Double, FURI: String)
Is there an efficient way to get a list of Synonym objects from the data in the dataframe?
In other words, I am trying to create a mapper that converts each row of the dataframe into an object of my case class, so that I end up with a list of these objects at the end of the operation. Is this possible in an efficient, clean way?
Upvotes: 2
Views: 14314
Reputation: 31
Use a typed Dataset:
df.select("URI", "similarity", "FURI").as[Synonym].collect
Upvotes: 3
Reputation: 37822
Use as[Synonym] to get a Dataset[Synonym], which you can then collect to get an Array[Synonym]:
val result = dataframe.as[Synonym].collect()
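For completeness, here is a minimal self-contained sketch of that approach, assuming Spark 2.x or later with a local SparkSession. The object name SynonymExample, the helper collectSynonyms, and the sample rows are made up for illustration and are not from the question. Note that as[Synonym] needs the implicit encoders from the session's implicits in scope, and that the dataframe's column names have to match the case class field names.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Defined at the top level (not inside a method) so Spark can derive an Encoder for it.
case class Synonym(URI: String, similarity: Double, FURI: String)

object SynonymExample {

  // Hypothetical helper: converts a dataframe with columns
  // URI, similarity, FURI into a local List[Synonym].
  def collectSynonyms(df: DataFrame): List[Synonym] = {
    // Brings the implicit Encoder[Synonym] required by .as[Synonym] into scope.
    import df.sparkSession.implicits._
    df.as[Synonym] // Dataset[Synonym]; columns are matched to fields by name
      .collect()   // Array[Synonym], pulled to the driver
      .toList
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("synonym-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the real dataframe.
    val dataFrame = Seq(
      ("uri:a", 0.9, "furi:a"),
      ("uri:b", 0.4, "furi:b")
    ).toDF("URI", "similarity", "FURI")

    collectSynonyms(dataFrame).foreach(println)
    spark.stop()
  }
}
```

Keep in mind that collect() brings every row to the driver, so this only makes sense when the result fits in driver memory. Otherwise it is usually better to keep working with the Dataset[Synonym] itself.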
Upvotes: 9