RLT

Reputation: 191

No Module Named 'delta.tables'

I am getting the following error from the code below. Please help:

   from delta.tables import *
   ModuleNotFoundError: No module named 'delta.tables'
   INFO SparkContext: Invoking stop() from shutdown hook

Here is the code:

   from pyspark.sql import *

   if __name__ == "__main__":
       spark = SparkSession \
           .builder \
           .appName("DeltaLake") \
           .config("spark.jars", "delta-core_2.12-0.7.0") \
           .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
           .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
           .getOrCreate()

       from delta.tables import *

       data = spark.range(0, 5)

       data.printSchema()


An online search suggested verifying that the Scala version matches the Delta core JAR version. Here are the Scala and JAR versions:

"delta-core_2.12-0.7.0"

"Using Scala version 2.12.10, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_221"

Upvotes: 11

Views: 20678

Answers (3)

Stephen

Reputation: 309

Alternatively, you can install Delta Lake as a pip package:

pip install delta-spark

See the delta-spark page on PyPI.
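With the package installed, delta-spark ships a helper that attaches the matching Delta JAR to the session for you. A minimal sketch, assuming a delta-spark release that provides configure_spark_with_delta_pip:

   from pyspark.sql import SparkSession
   from delta import configure_spark_with_delta_pip

   # Build the session as usual; the helper injects the io.delta
   # Maven coordinate matching the installed delta-spark version.
   builder = SparkSession.builder \
       .appName("DeltaLake") \
       .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
       .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")

   spark = configure_spark_with_delta_pip(builder).getOrCreate()

   from delta.tables import *  # now resolves without ModuleNotFoundError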

Upvotes: 14

Or b

Reputation: 816

According to the delta package documentation, there is a Python module named tables. Clone the repository and copy the delta folder under python/delta into your site-packages path (e.g. ..\python37\Lib\site-packages). Then restart Python and your code runs without the error.
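As a quick sanity check that the copy landed in the right place, you could import the package and print where Python found it (a sketch, nothing Delta-specific):

   import delta
   import delta.tables

   # If this prints a path inside site-packages, the copy worked.
   print(delta.__file__)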

I am using Python 3.5.3 and pyspark==3.0.1.

Upvotes: 5

Bram

Reputation: 406

There is a difference between spark.jars and spark.jars.packages: spark.jars takes a comma-separated list of local JAR paths, while spark.jars.packages takes Maven coordinates (groupId:artifactId:version) and downloads the artifact and its dependencies for you. Since you are following the Quick Start, try replacing

.config("spark.jars", "delta-core_2.12-0.7.0")

with

.config("spark.jars.packages", "io.delta:delta-core_2.12:0.7.0")

Upvotes: 4
