Ayan Biswas

Reputation: 1645

Reading an Excel file in Spark with an integer column

I have a group of Excel sheets that I am trying to read via Spark through the com.crealytics.spark.excel package. In my Excel sheet I have a column, Survey ID, that contains integer IDs. When I read the data through Spark, I see the values are converted to doubles.

How can I retain the integer format of the values while reading from the Excel sheet?

This is what I tried:

val df = spark.read.format("com.crealytics.spark.excel")
      .option("location", <somelocation>)
      .option("useHeader", "true")
      .option("treatEmptyValuesAsNulls", "true")
      .option("inferSchema", "true")
      .option("addColorColumns", "false")
      .load()

Actual Value

(screenshot: the Survey ID column in Excel shows plain integers, e.g. 17632889)

Value read via Spark

+-----------+
|  Survey ID|
+-----------+
|1.7632889E7|
|1.7632889E7|
|1.7632934E7|
|1.7633233E7|
|1.7633534E7|
|1.7655812E7|
|1.7656079E7|
|1.7930478E7|
|1.7944498E7|
|1.8071246E7|
+-----------+

If I cast the column to integer, I get the data in the required format. But is there a better way to do this?

val finalDf = df.withColumn("Survey ID", col("Survey ID").cast(sql.types.IntegerType))
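
One alternative I am considering is to skip inferSchema and pass an explicit schema to the reader instead, so the column is never inferred as double in the first place. A sketch, assuming the data source honours the standard DataFrameReader.schema API (the schema below only lists Survey ID; any other columns in the sheet would need their own StructFields, and dfTyped is just a name I chose here):

import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Declare "Survey ID" as an integer up front so no type inference happens.
// Every other column in the sheet would need a StructField here as well.
val surveySchema = StructType(Seq(
  StructField("Survey ID", IntegerType, nullable = true)
))

val dfTyped = spark.read.format("com.crealytics.spark.excel")
      .option("location", <somelocation>)
      .option("useHeader", "true")
      .option("treatEmptyValuesAsNulls", "true")
      .schema(surveySchema) // replaces .option("inferSchema", "true")
      .load()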

Upvotes: 1

Views: 1921

Answers (1)

D3V

Reputation: 1593

There is a bug (or rather a missing setting) in the Excel library which renders columns with large numbers in scientific notation. See https://github.com/crealytics/spark-excel/issues/126
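
Until that is addressed, the usual workaround is exactly the cast you already found: convert the inferred double column back to an integral type yourself. A sketch (I picked LongType defensively in case the IDs ever exceed Int.MaxValue; fixedDf is just an illustrative name):

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.LongType

// The column comes back as DoubleType from schema inference; casting to a
// long removes the scientific-notation rendering and cannot overflow for
// IDs of this magnitude.
val fixedDf = df.withColumn("Survey ID", col("Survey ID").cast(LongType))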

Upvotes: 1
