Scala Spark - Cannot resolve a column name

Question

This should be pretty straightforward, but I'm having an issue with the following code:

val test = spark.read
    .option("header", "true")
    .option("delimiter", ",")
    .csv("sample.csv")

test.select("Type").show()
test.select("Provider Id").show()

test is a dataframe like so:

Type	Provider Id
A	asd
A	bsd
A	csd
B	rrr

Exception in thread "main" org.apache.spark.sql.AnalysisException: 
cannot resolve '`Provider Id`' given input columns: [Type, Provider Id];;
'Project ['Provider Id]

It selects and shows the Type column just fine, but I couldn't get it to work for Provider Id. I wondered whether the space in the column name was the problem, so I tried using backticks, and also removing or replacing the space, but nothing seemed to work. The same code runs fine with the Spark 3.x libraries but fails with Spark 2.1.x, and I need to use 2.1.x.

Additional: I tried changing the CSV column order from Type, Provider Id to Provider Id, Type. The error flipped: Provider Id now shows, but selecting Type throws the exception. So it seems that whichever column comes last in the header is the one that fails.

Any suggestions?
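
One thing that may be worth checking, since the failing column follows the last position in the header: the last column name could carry an invisible trailing character, for example a carriage return from Windows-style line endings, that the error message does not display. This is a guess, not a confirmed diagnosis. The sketch below prints each column name with non-printable characters made visible, then renames the columns to their trimmed form before selecting. The object name ColumnNameCheck, the helper visible, and the local[*] master are made up for illustration; the file name and reader options are taken from the question.

```scala
import org.apache.spark.sql.SparkSession

object ColumnNameCheck {
  // Show control and non-ASCII characters as <U+XXXX> so they can be seen.
  def visible(name: String): String =
    name.map { c =>
      if (c < ' ' || c > '~') "<U+%04X>".format(c.toInt) else c.toString
    }.mkString

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for testing only.
    val spark = SparkSession.builder()
      .appName("column-name-check")
      .master("local[*]")
      .getOrCreate()

    val test = spark.read
      .option("header", "true")
      .option("delimiter", ",")
      .csv("sample.csv")

    // Inspect what the parser actually produced as column names.
    test.columns.foreach { c =>
      println(s"[${visible(c)}] length=${c.length}")
    }

    // String.trim drops leading/trailing characters up to U+0020,
    // which includes a trailing carriage return.
    val fixed = test.toDF(test.columns.map(_.trim): _*)
    fixed.select("Provider Id").show()

    spark.stop()
  }
}
```

If the printed length of the last column is one more than expected, or a <U+000D> marker appears, the trimmed rename should make the select work. If the names print cleanly, the cause lies elsewhere and this sketch does not apply.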
