Deepali

Reputation: 25

Spark-scala : withColumn is not a member of Unit

I'm trying to read a CSV file into a Spark DataFrame. The file doesn't have a header row, but I want the DataFrame to have column headers. How can I do that? I'm not sure whether this is correct, but I wrote this command -> val df = spark.read.format("csv").load("/path/genchan1.txt").show()

and got _c0 and _c1 as the column names. I then tried to change the column names to the desired ones using: val df1 = df.withColumnRenamed("_c0","Series"), but I'm getting the error "value withColumnRenamed is not a member of Unit".

PS: I have imported spark.implicits._ and spark.sql.functions already.

Please help me understand whether there is a way to add column headers to the dataset, and why I'm getting this error.

Upvotes: 1

Views: 3408

Answers (2)

Shantanu Kher

Reputation: 1054

If you know the structure of the CSV file beforehand, the better solution is to define a schema and attach it to the DataFrame while loading the data.

Sample code for quick reference -

import org.apache.spark.sql.types._

val customSchema = StructType(Array(
  StructField("Series", StringType, true),
  StructField("Column2", StringType, true),
  StructField("Column3", IntegerType, true),
  StructField("Column4", DoubleType, true)
))

val df = spark.read.format("csv")
  .option("header", "false") // since your file does not have a header
  .schema(customSchema)
  .load("/path/genchan1.txt")

df.show()
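If you would rather not spell out a full schema, a shorter alternative is to rename all of the auto-generated columns in one call with toDF after loading. A minimal sketch (the column names here are placeholders, not taken from the question's actual file):

```scala
// Sketch: rename _c0, _c1, ... in a single call.
// Column names are illustrative - adjust to match your file.
val df = spark.read.format("csv")
  .option("header", "false")
  .load("/path/genchan1.txt")
  .toDF("Series", "Column2", "Column3", "Column4")
```

Note that with this approach every column keeps the default StringType unless you also set .option("inferSchema", "true"), whereas an explicit schema gives you exact control over the types.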

Upvotes: 1

Som

Reputation: 6323

The return type of show() is Unit, so your df is being assigned Unit instead of a DataFrame. Remove show() from the end:

val df = spark.read.format("csv").load("/path/genchan1.txt")
df.show()

You can then use all of the DataFrame functionality -

val df1 = df.withColumnRenamed("_c0","Series") 
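For more than one column, withColumnRenamed calls can simply be chained, since each call returns a new DataFrame. A small sketch (the second column name is illustrative):

```scala
// Chain renames: each withColumnRenamed returns a new DataFrame.
val df2 = df
  .withColumnRenamed("_c0", "Series")
  .withColumnRenamed("_c1", "Column2")

// Verify the new column names and types.
df2.printSchema()
```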

Upvotes: 3
