Reputation: 645
I am a newbie to Spark and Scala. I wanted to execute some Spark code from inside a bash script, so I wrote the following code.
The Scala code was written in a separate .scala file as follows.
Scala Code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    println("x=" + args(0) + ", y=" + args(1))
  }
}
This is the bash script that invokes the Apache Spark/Scala code.
Bash Code:
#!/usr/bin/env bash
ABsize=File_size1
ADsize=File_size2
for i in $(seq 2 "$ABsize")
do
    for j in $(seq 2 "$ADsize")
    do
        Abi=$(sed -n "${i}p" < File_Path1)
        Adj=$(sed -n "${j}p" < File_Path2)
        scala SimpleApp.scala "$Abi" "$Adj"
    done
done
But then I get the following errors.
Errors:
error: object apache is not a member of package org
import org.apache.spark.SparkContext
           ^
error: object apache is not a member of package org
import org.apache.spark.SparkContext._
           ^
error: object apache is not a member of package org
import org.apache.spark.SparkConf
           ^
error: not found: type SparkConf
val conf = new SparkConf().setAppName("Simple Application")
               ^
error: not found: type SparkContext
The above code works perfectly if the Scala file is written without any Spark functions (that is, a pure Scala file), but fails when there are Apache Spark imports.
What would be a good way to run and execute this from a bash script? Will I have to call the Spark shell to execute the code?
Upvotes: 0
Views: 2605
Reputation: 2281
Set up Spark with the environment variables and run it, as @puhlen said, with spark-submit --class SimpleApp simple-project_2.11-1.0.jar $Abi $Adj
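For reference, here is a minimal sketch of how that could look, assuming the application is packaged with sbt and run against a local master (the Spark/Scala versions and --master local[*] are assumptions; adjust them to your installation). The jar name matches the simple-project_2.11-1.0.jar above.
A minimal build.sbt:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
// Spark version is an assumption; match it to your installation
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
Running sbt package then produces target/scala-2.11/simple-project_2.11-1.0.jar, and the scala invocation inside the loop becomes:
# spark-submit puts the Spark jars on the classpath,
# which is what the plain "scala" command was missing
spark-submit \
    --class SimpleApp \
    --master local[*] \
    target/scala-2.11/simple-project_2.11-1.0.jar "$Abi" "$Adj"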
Upvotes: 1