Parv bali

Reputation: 157

My schema is nullable, but I still get ArrayIndexOutOfBoundsException: 1

The schema is:

root
 |-- reviewText: string (nullable = true)

I select the column I want to process:

val extracted_reviews = sql("select reviewText from book").collect

The AFINN lexicon is already loaded at this point.

val reviewSenti = extracted_reviews.map(reviewText => {
  val reviewWordsSentiment = reviewText(1).toString.split(" ").map(word => {
    var senti: Int = 0
    if (AFINN.lookup(word.toLowerCase()).length > 0) {
      senti = AFINN.lookup(word.toLowerCase())(0)
    }
    senti
  })
  val reviewSentiment = reviewWordsSentiment.sum
  (reviewSentiment, reviewText.toString)
})

The schema already marks reviewText as nullable, so why do I get this error?

java.lang.ArrayIndexOutOfBoundsException: 1
  at org.apache.spark.sql.catalyst.expressions.GenericRow.get(rows.scala:200)
  at org.apache.spark.sql.Row$class.apply(Row.scala:157)
  at 
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
  ... 52 elided

Upvotes: 1

Views: 122

Answers (1)

koiralo

Reputation: 23109

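collect() returns an Array[Row], and each Row here has only one column (reviewText), so the only valid index is 0. reviewText(1) asks for a second column that does not exist, which is what throws ArrayIndexOutOfBoundsException: 1. The nullable flag only means the value itself may be null; it has nothing to do with the index. To get the value, use reviewText.getString(0):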

val reviewSenti = extracted_reviews.map { reviewText =>
  // Each Row holds a single column, so its value is at index 0
  val reviewWordsSentiment = reviewText.getString(0).split(" ").map(...)
}
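
Since the column is nullable, getString(0) can also return null, and calling split on that would throw a NullPointerException. Below is a minimal sketch of a null-safe scoring helper in plain Scala. The afinn map and the scoreReview function are hypothetical stand-ins (the AFINN object in the question has its own lookup API, which is not shown), and the reviews sequence stands in for the strings you would read with getString(0) from each collected Row.

```scala
// Hypothetical stand-in for the AFINN lexicon: word -> sentiment score.
// The real AFINN object in the question exposes a different lookup API.
val afinn: Map[String, Int] = Map("good" -> 3, "bad" -> -3, "love" -> 3)

// Sum the scores of the words in one review. Unknown words count as 0,
// and a null review (possible because the column is nullable) scores 0.
def scoreReview(text: String, lexicon: Map[String, Int]): Int =
  Option(text)
    .map(_.split("\\s+").map(w => lexicon.getOrElse(w.toLowerCase, 0)).sum)
    .getOrElse(0)

// Stand-ins for the values read with row.getString(0) from each Row.
val reviews: Seq[String] = Seq("Good book love it", "bad", null)

// Pair each review with its total sentiment, as in the question.
val scored: Seq[(Int, String)] = reviews.map(r => (scoreReview(r, afinn), r))
```

With real data you would map over extracted_reviews and pass reviewText.getString(0) to the helper instead of using the sample sequence.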

Upvotes: 1
