Reputation: 145
Can't find the jar that has org.apache.spark.sql.Row class
I opened spark-sql_2.11-2.4.3.jar, but the org.apache.spark.sql.Row class is not in it, even though the Spark documentation suggests it should be there: https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/sql/Row.html
import org.apache.spark.sql.SparkSession
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._
object BulkCopy extends App{
val spark = SparkSession
.builder()
.appName("Spark SQL data sources example")
.config("spark.some.config.option", "some-value")
.getOrCreate()
var df = spark.read.parquet("parquet")
val bulkCopyConfig = com.microsoft.azure.sqldb.spark.config.Config(Map(
"url" -> jdbcHostname,
"databaseName" -> jdbcDatabase,
"user" -> jdbcUsername,
"password" -> jdbcPassword,
"dbTable" -> "dbo.RAWLOG_3_1_TEST1",
"bulkCopyBatchSize" -> "2500",
"bulkCopyTableLock" -> "true",
"bulkCopyTimeout" -> "600"
))
df.bulkCopyToSqlDB(bulkCopyConfig)
}
Error:(17, 13) Symbol 'type org.apache.spark.sql.Row' is missing from the classpath.
This symbol is required by 'type org.apache.spark.sql.DataFrame'.
Make sure that type Row is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.apache.spark.sql.
var df = spark.read.parquet("parquet")
Upvotes: 4
Views: 1533
Reputation: 145
I was able to download all the jars from this site. https://jar-download.com/?search_box=org.apache.spark%20spark-core https://jar-download.com/?search_box=org.apache.spark%20spark.sql
Upvotes: 0
Reputation: 974
The org.apache.spark.sql.Row class is not part of spark-sql_2.11-2.4.3.jar. You can find it in spark-catalyst_2.11-2.4.3.jar instead. The spark-sql dependency below depends on spark-catalyst, so your build tool (Maven or sbt) should resolve it automatically:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.3</version>
</dependency>
Or, with sbt:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
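If you want to confirm which jar a class is actually loaded from, a small check like the one below may help. This is only a sketch: the FindJar object and the locationOf helper are names made up for illustration and are not part of Spark. It relies only on the standard JVM calls Class.forName, getProtectionDomain and getCodeSource.

```scala
import scala.util.Try

// Hypothetical helper (not part of Spark): reports where a class was loaded from.
object FindJar {
  // Returns the jar or directory URL the class came from.
  // Yields None if the class is not on the classpath, or if it has no
  // code source (as is typical for JDK classes from the bootstrap loader).
  def locationOf(className: String): Option[String] =
    for {
      cls    <- Try(Class.forName(className)).toOption
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      url    <- Option(source.getLocation)
    } yield url.toString

  def main(args: Array[String]): Unit = {
    // Defaults to the class from the question; pass another name as an argument.
    val name = args.headOption.getOrElse("org.apache.spark.sql.Row")
    val where = locationOf(name).getOrElse("no code source (class missing or bootstrap-loaded)")
    println(s"$name -> $where")
  }
}
```

Run it with the same classpath as your project. If the dependencies resolved correctly, the printed location for org.apache.spark.sql.Row should point at a spark-catalyst jar rather than spark-sql.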
spark-catalyst is listed among the dependencies of the spark-sql lib.
Upvotes: 2