Reputation: 31
I am trying to run a simple "hello world" Spark application.
This is my code:
package com.sd.proj.executables

import org.apache.spark.SparkConf
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.{DataFrame, SparkSession}

class SparkConn {
  def getSparkConn(caller: String): SparkSession = {
    val conf = new SparkConf().setAppName(caller)
    val spark: SparkSession = SparkSession.builder.config(conf).getOrCreate()
    spark
  }
}

object HelloSpark {
  def sparkDF()(implicit spark: SparkSession): DataFrame = {
    spark.emptyDataFrame
      .withColumn("Title", lit("Hello World!!!"))
  }

  def main(args: Array[String]): Unit = {
    val sparkConn = new SparkConn()
    implicit val spark: SparkSession = sparkConn.getSparkConn(this.getClass.getName)
    val df = sparkDF()
    df.show(false)
    spark.stop()
  }
}
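(For reference, the SparkConn wrapper is not strictly needed here; the builder takes the app name directly, and the master URL comes from spark-submit. A minimal equivalent sketch, with the app name chosen for illustration:)

import org.apache.spark.sql.SparkSession

// equivalent session setup without an explicit SparkConf;
// the master URL is supplied by spark-submit rather than hard-coded
val spark: SparkSession = SparkSession.builder
  .appName("HelloSpark") // illustrative name, not from the original code
  .getOrCreate()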
This is my build.gradle:
plugins {
    id 'scala'
    id 'idea'
    id 'org.scoverage' version '7.0.0'
}

repositories {
    mavenCentral()
}

sourceSets {
    main {
        scala.srcDirs = ['src/main/scala']
        resources.srcDirs = ['src/main/resources']
    }
    test {
        scala.srcDirs = ['src/test/scala']
        resources.srcDirs = ['src/test/resources']
    }
}

dependencies {
    // scala
    implementation 'org.scala-lang:scala-library:2.12.15'
    implementation 'org.scala-lang:scala-reflect:2.12.15'
    implementation 'org.scala-lang:scala-compiler:2.12.15'
    // spark
    implementation 'org.apache.spark:spark-core_2.12:3.2.0'
    implementation 'org.apache.spark:spark-sql_2.12:3.2.0'
    // junit
    testImplementation 'junit:junit:4.12'
    testImplementation 'org.scalatestplus:scalatestplus-junit_2.12:1.0.0-M2'
}

scoverage {
    scoverageVersion = "1.4.11"
    minimumRate = 0.01
}

task fatJar(type: Jar) {
    zip64 true
    manifest {
        attributes 'Implementation-Title': 'Gradle Fat Jar',
                'Implementation-Version': '0.1',
                'Main-Class': 'com.sd.proj.executables.HelloSpark'
    }
    duplicatesStrategy = DuplicatesStrategy.EXCLUDE
    baseName = project.name + '-fat'
    from {
        configurations.runtimeClasspath.collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
    with jar
}
and this is the project structure:
.
├── README.md
├── build
│   ├── classes
│   │   └── scala
│   │       └── main
│   │           └── com
│   │               └── sd
│   │                   └── proj
│   │                       └── executables
│   │                           ├── HelloSpark$.class
│   │                           └── HelloSpark.class
│   ├── generated
│   │   └── sources
│   │       └── annotationProcessor
│   │           └── scala
│   │               └── main
│   ├── libs
│   │   ├── HelloSpark-fat.jar
│   │   └── HelloSpark.jar
│   └── tmp
│       ├── compileScala
│       ├── fatJar
│       │   └── MANIFEST.MF
│       ├── jar
│       │   └── MANIFEST.MF
│       └── scala
│           ├── classfileBackup
│           └── compilerAnalysis
│               ├── compileScala.analysis
│               └── compileScala.mapping
├── build.gradle
├── gradle
│   └── wrapper
│       ├── gradle-wrapper.jar
│       └── gradle-wrapper.properties
├── gradlew
├── gradlew.bat
├── settings.gradle
├── spark_submit.sh
└── src
    ├── main
    │   ├── resources
    │   └── scala
    │       └── com
    │           └── sd
    │               └── proj
    │                   └── executables
    │                       └── HelloSpark.scala
    └── test
        ├── resources
        └── scala
My spark-submit script is:
#!/bin/bash

echo "Running spark-submit..."

SPARK_HOME=/opt/homebrew/Cellar/apache-spark/3.2.1
export PATH="$SPARK_HOME/bin:$PATH"

JARFILE="$(pwd)/build/libs/HelloSpark-fat.jar"

# Run it locally
echo "cmd : ${SPARK_HOME}/bin/spark-submit --class \"com.sd.proj.executables.HelloSpark\" --master local $JARFILE"
${SPARK_HOME}/bin/spark-submit --class "com.sd.proj.executables.HelloSpark" --master local "$JARFILE"
Both Scala and Spark are installed on my Mac:
% type spark-submit
spark-submit is /opt/homebrew/bin/spark-submit
% type scala
scala is /opt/homebrew/opt/scala@2.12/bin/scala
When I run the above spark-submit, it fails with **Error: Failed to load class com.sd.proj.executables.HelloSpark.**
% bash spark_submit.sh
Running spark-submit...
cmd : /opt/homebrew/Cellar/apache-spark/3.2.1/bin/spark-submit --class "com.sd.proj.executables.HelloSpark" --master local /Users/dsam05/IdeaProjects/HelloSpark/build/libs/HelloSpark-fat.jar
22/11/12 14:35:14 WARN Utils: Your hostname, Soumyajits-MacBook-Air.local resolves to a loopback address: 127.0.0.1; using 192.168.2.21 instead (on interface en0)
22/11/12 14:35:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/homebrew/Cellar/apache-spark/3.2.1/libexec/jars/spark-unsafe_2.12-3.2.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Error: Failed to load class com.sd.proj.executables.HelloSpark.
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
I have never run Spark on a Mac before. Can someone please guide me on what I am doing incorrectly here? This is on an M1 Mac, macOS 13.
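(For reference, a quick way to check the packaging, assuming the paths above: list the jar's classes and dump its manifest.)

# list the application classes packed into the fat jar;
# if nothing prints, spark-submit cannot load the class either
jar tf build/libs/HelloSpark-fat.jar | grep 'com/sd/proj/executables'

# the Main-Class attribute should also be present in the merged manifest
unzip -p build/libs/HelloSpark-fat.jar META-INF/MANIFEST.MF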
Upvotes: 1
Views: 105
Reputation: 31
Solved this problem; posting as it might help someone else.
I removed the fatJar task from my build.gradle and added this config instead:
plugins {
    // added this entry on top of what is already present
    id 'com.github.johnrengelman.shadow' version '7.1.2'
}

// ...
// same config as in the question above
// ...

shadowJar {
    zip64 true
}
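With that in place, the fat jar comes from the shadowJar task instead of the hand-rolled fatJar. A minimal usage sketch (the -all classifier is the shadow plugin's default):

# rebuild the fat jar via the shadow plugin
./gradlew clean shadowJar

# point spark_submit.sh at the new artifact
JARFILE="$(pwd)/build/libs/HelloSpark-all.jar"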
Now the jar is created as HelloSpark-all.jar and runs perfectly:
22/11/19 23:25:12 INFO CodeGenerator: Code generated in 88.62025 ms
22/11/19 23:25:12 INFO CodeGenerator: Code generated in 8.248958 ms
+--------------+
|Title |
+--------------+
|Hello World!!!|
+--------------+
22/11/19 23:25:12 INFO SparkUI: Stopped Spark web UI at http://localhost:4040
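For completeness, the submit command itself is unchanged apart from the jar name (assuming spark-submit is on the PATH as in the question):

spark-submit \
  --class "com.sd.proj.executables.HelloSpark" \
  --master local \
  build/libs/HelloSpark-all.jar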
Upvotes: 0