Reputation: 157
The schema is given below:
root
|-- reviewText: string (nullable = true)
I selected the column to perform the operation on:
val extracted_reviews = sql("select reviewText from book").collect
I have already loaded AFINN here.
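For context, AFINN would have been loaded as a pair RDD keyed by word, something along these lines (the file name and tab-separated format are assumptions, inferred from the lookup calls below):

// Assumed loading of AFINN as a pair RDD: word -> sentiment score
val AFINN = sc.textFile("AFINN-111.txt")
  .map(line => {
    val fields = line.split("\t")
    (fields(0), fields(1).toInt)
  })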
val reviewSenti = extracted_reviews.map(reviewText => {
  val reviewWordsSentiment = reviewText(1).toString.split(" ").map(word => {
    var senti: Int = 0
    if (AFINN.lookup(word.toLowerCase()).length > 0) {
      senti = AFINN.lookup(word.toLowerCase())(0)
    }
    senti
  })
  val reviewSentiment = reviewWordsSentiment.sum
  (reviewSentiment, reviewText.toString)
})
reviewText is already nullable in the schema, so why is it giving this error:
java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.spark.sql.catalyst.expressions.GenericRow.get(rows.scala:200)
at org.apache.spark.sql.Row$class.apply(Row.scala:157)
at
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
... 52 elided
Upvotes: 1
Views: 122
Reputation: 23109
collect() returns Array[Row]. Row fields are zero-indexed, and your select returns only one column, so reviewText(1) is out of bounds; to get the value, use reviewText.getString(0):
val reviewSenti = extracted_reviews.map(reviewText => {
  val reviewWordsSentiment = reviewText.getString(0).split(" ").map(...)
  ...
})
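For reference, a minimal corrected sketch of your original snippet, assuming AFINN is a pair RDD of (word, score) as above and guarding against rows where reviewText itself is null:

val reviewSenti = extracted_reviews.map(reviewText => {
  // The select returns a single column, so it sits at index 0, not 1
  val text = if (reviewText.isNullAt(0)) "" else reviewText.getString(0)
  val reviewWordsSentiment = text.split(" ").map(word => {
    var senti: Int = 0
    if (AFINN.lookup(word.toLowerCase()).length > 0) {
      senti = AFINN.lookup(word.toLowerCase())(0)
    }
    senti
  })
  (reviewWordsSentiment.sum, text)
})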
Upvotes: 1