RJK

Reputation: 239

Which version of the Spark library supports SparkSession?

Spark code using SparkSession:

    // SparkSession lives in the spark-sql artifact (Spark 2.0+)
    import org.apache.spark.sql.SparkSession

    val conf = SparkSession.builder
      .master("local")
      .appName("testing")
      .enableHiveSupport()  // <- enable Hive support.
      .getOrCreate()

pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.cms.spark</groupId>
    <artifactId>cms-spark</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>cms-spark</name>

    <pluginRepositories>
        <pluginRepository>
            <id>scala-tools.org</id>
            <name>Scala-tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>

        <dependency>
            <groupId>com.databricks</groupId>
            <artifactId>spark-csv_2.10</artifactId>
            <version>1.4.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.5.2</version>
        </dependency>

        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.8.3</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.5.3</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>install</phase> <!-- bind to the packaging phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>

    </build>
</project>

I have a problem. I wrote Spark code that uses SparkSession, but SparkSession cannot be found in the Spark SQL library, so I can't run the code. Which version of Spark provides SparkSession? My pom.xml is shown above.

Thanks.

Upvotes: 10

Views: 31541

Answers (2)

mat77

Reputation: 436

You need both the core and SQL artifacts, at a 2.0 version:

<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
</dependencies> 
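
With both artifacts at 2.0 on the classpath, the import below should resolve. This is only a minimal sketch, assuming Spark 2.x built for Scala 2.11; the object name `SparkSessionSketch` and the column name `n` are made up for illustration. Hive support is left out here because `enableHiveSupport()` additionally needs a matching spark-hive_2.11 artifact.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this sketch.
object SparkSessionSketch {
  def main(args: Array[String]): Unit = {
    // SparkSession is the unified entry point introduced in Spark 2.0.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("testing")
      .getOrCreate()

    // Print the Spark version actually resolved at runtime.
    println(spark.version)

    // Build a tiny DataFrame to confirm the session works.
    val df = spark.range(5).toDF("n")
    df.show()

    spark.stop()
  }
}
```

If this compiles but the version printed is not 2.x, an older Spark jar is probably still being picked up from the dependency tree.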

Upvotes: 22

Vitalii Kotliarenko

Reputation: 2967

You need Spark 2.0 to use SparkSession. For now, it is available in the Maven Central snapshot repository:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-SNAPSHOT

The same version has to be specified for the other Spark artifacts. Note that 2.0 is still in beta and is expected to become stable in about a month, AFAIK.

Update: alternatively, you can use the Cloudera fork of Spark 2.0:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-cloudera1-SNAPSHOT

The Cloudera repository has to be added to your Maven repositories list:

<repository>
   <id>cloudera</id>
   <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
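
If you have to stay on 1.6.0 for now, the pre-2.0 entry point for Hive-backed SQL is `HiveContext`. A rough sketch, assuming spark-hive_2.10 is aligned to 1.6.0 like the other Spark artifacts (the question's pom.xml has it at 1.5.2); the object name `HiveContextSketch` is made up.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical object name, used only for this sketch.
object HiveContextSketch {
  def main(args: Array[String]): Unit = {
    // Spark 1.x: configure and create the SparkContext first.
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("testing")
    val sc = new SparkContext(conf)

    // HiveContext plays the role that SparkSession with
    // enableHiveSupport() plays in Spark 2.0.
    val hiveContext = new HiveContext(sc)

    // List the tables visible through the Hive metastore.
    hiveContext.sql("SHOW TABLES").show()

    sc.stop()
  }
}
```

Without Hive, `new org.apache.spark.sql.SQLContext(sc)` from spark-sql fills the same role on 1.6.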

Upvotes: 2
