Reputation: 113
I want to run this program. I am new to Scala and Spark, and I get the error "there were compilation error". Can anyone help me with this?
package main.scala.com.matthewrathbone.spark

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import scala.collection.Map

class ExampleJob(sc: SparkContext) {
  // reads data from text files and computes the results. This is what you test
  def run(t: String, u: String) : RDD[(String, String)] = {
    val transactions = sc.textFile(t)
    val newTransactionsPair = transactions.map{t =>
      val p = t.split(" ")
      (p(2).toInt, p(1).toInt)
    }
    val users = sc.textFile(u)
    val newUsersPair = users.map{t =>
      val p = t.split(" ")
      (p(0).toInt, p(3))
    }
    val result = processData(newTransactionsPair, newUsersPair)
    return sc.parallelize(result.toSeq).map(t => (t._1.toString, t._2.toString))
  }

  def processData (t: RDD[(Int, Int)], u: RDD[(Int, String)]) : Map[Int,Long] = {
    var jn = t.leftOuterJoin(u).values.distinct
    return jn.countByKey
  }
}

object ExampleJob {
  def main(args: Array[String]) {
    val transactionsIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt")
    val usersIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt")
    //val transactionsIn = args(1)
    // val usersIn = args(0)
    val conf = new SparkConf().setAppName("SparkJoins").setMaster("local")
    val context = new SparkContext(conf)
    val job = new ExampleJob(context)
    val results = job.run(transactionsIn, usersIn)
    //val output = args(2)
    val output = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/out.txt")
    results.saveAsTextFile(output)
    context.stop()
  }
}
I also tried taking the input from args, but the error was just the same. This code performs some operations on two text files. In spark-shell I also sometimes get an error on the first line, the package definition.
Thanks in advance.
Upvotes: 0
Views: 1894
Reputation: 113
I found my problem. There was of course a mismatch in the parameters, but changing them to String did not solve the problem by itself. After that I used sbt for packaging and compiling; sbt added the libraries automatically and the program now runs correctly. Thank you for the answers.
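For anyone hitting the same error, a minimal build.sbt along the following lines lets sbt fetch the Spark libraries. This is only a sketch: the project name and the Scala and Spark version numbers are assumptions, not taken from the post, and should be matched to the Spark installation in use.

```scala
// build.sbt (sketch; the name and all version numbers are assumptions)
name := "spark-joins-example"
version := "0.1.0"
scalaVersion := "2.12.18"

// %% appends the Scala binary version to the artifact name (spark-core_2.12)
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8"
```

With the sources under src/main/scala, `sbt compile` and `sbt package` build the project, and `sbt run` starts the main method.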
Upvotes: 0
Reputation: 39
Your run method takes two String parameters (t: String, u: String), but in your main method you are invoking it with two Resource values. Change transactionsIn and usersIn to String, like this:
val transactionsIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt"
val usersIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt"
...//context initialization
val results = job.run(transactionsIn, usersIn)
I'm also new to Scala, but I don't think you should use return in the code; see this SO.
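On the point about return: in Scala the value of the last expression in a method body is the method's result, so the keyword can usually be dropped. Below is a minimal sketch in plain Scala (no Spark needed). The names `ReturnFree` and `countByKey` are hypothetical and only stand in for the counting step of processData; the distinct and join steps are left out.

```scala
object ReturnFree {
  // Hypothetical stand-in for the counting step in processData:
  // counts how many pairs share each key.
  // The final expression is the method's result, so no `return` is written.
  def countByKey(pairs: Seq[(Int, String)]): Map[Int, Long] =
    pairs.groupBy(_._1).map { case (key, group) => (key, group.size.toLong) }

  def main(args: Array[String]): Unit = {
    val counts = countByKey(Seq((1, "US"), (1, "GB"), (2, "FR")))
    println(counts(1)) // 2
    println(counts(2)) // 1
  }
}
```

The same applies to run and processData in the question: removing the return keyword from their last lines leaves the behaviour unchanged.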
Upvotes: 1