Reputation: 105
I'm trying to load and parse JSON files (tweets), but I get the error below:
error: not found: value mapper
Some(mapper.readValue(record, classOf[Tweet]))
And this is the Scala script:
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature
case class Tweet(tweet_id: Int, created_unixtime: Long, created_time: String, lang: String, displayname: String, time_zone: String, msg: String)
val input = sc.textFile("hdfs://localhost:54310/tmp/data_staging/tweets*") // tweets load fine
// Parsing them
val result = input.flatMap(record => {
  try {
    Some(mapper.readValue(record, classOf[Tweet]))
  } catch {
    case e: Exception => None
  }
})
Upvotes: 0
Views: 1236
Reputation: 4207
So the question is really how to load JSON and map it to a case class.
In that case, just use Spark's built-in JSON reader and then convert to a Dataset of your case class:
case class Tweet(tweet_id: Int, created_unixtime: Long, created_time: String, lang: String, displayname: String, time_zone: String, msg: String)

import spark.implicits._ // provides the Encoder needed by .as[Tweet]
val input = spark.read.json("hdfs://localhost:54310/tmp/data_staging/tweets*").as[Tweet]
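Note that by default spark.read.json expects one JSON document per line (the same line-delimited layout your sc.textFile approach assumes), so tweet-per-line files should work as-is.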
The assumption here is that the field names in the JSON documents match your case class. If they don't, you can simply map each Row to your case class, as sketched below.
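A minimal sketch of that fallback, assuming (hypothetically) that the JSON carries an id field instead of tweet_id; every other field name here is taken from your case class:

import spark.implicits._ // Encoder for the Tweet case class

val tweets = spark.read.json("hdfs://localhost:54310/tmp/data_staging/tweets*")
  .map { row =>
    Tweet(
      tweet_id = row.getAs[Long]("id").toInt, // "id" is a hypothetical source field; Spark reads JSON integers as Long
      created_unixtime = row.getAs[Long]("created_unixtime"),
      created_time = row.getAs[String]("created_time"),
      lang = row.getAs[String]("lang"),
      displayname = row.getAs[String]("displayname"),
      time_zone = row.getAs[String]("time_zone"),
      msg = row.getAs[String]("msg")
    )
  }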
Upvotes: 1