Reputation: 141
I created the following Spark session with Hive support:
from pyspark.sql import SparkSession
# creating Spark context and connection
spark = (SparkSession.builder.appName("appName").enableHiveSupport().getOrCreate())
and I am able to see the results of the following query:
spark.sql("select year(plt_date) as Year, month(plt_date) as Mounth, count(build) as B_Count, count(product) as P_Count from first_table full outer join second_table on key1=CONCAT('SS',key_2) group by year(plt_date), month(plt_date)").show()
However, when I try to write the resulting dataframe from this query to hdfs, I get the following error:
I am able to save the resulting dataframe of a simpler version of this query to the same path. The problem appears when I add functions such as count() and year().
What is the problem, and how can I save the results to hdfs?
Upvotes: 2
Views: 449
Reputation: 174
The error is caused by the '(' characters in the column name 'year(CAST(plt_date AS DATE))'.
Rename the column with an alias, for example with selectExpr:
data = data.selectExpr("year(CAST(plt_date AS DATE)) as nameofcolumn")
Upvote if this works.
See also: Rename Spark Column
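If several columns are affected, renaming them one by one gets tedious. Below is a minimal sketch of a generic cleanup, assuming the failure is Spark's invalid-character check on column names (the set ` ,;{}()\n\t=` that is rejected for formats such as Parquet). The helper names sanitize_column_name and sanitize_columns are hypothetical, not part of Spark; only df.columns and df.toDF are real DataFrame members.

```python
import re

# Characters assumed to be rejected in column names when writing
# (space , ; { } ( ) newline tab =). Adjust if your format differs.
_INVALID = re.compile(r"[ ,;{}()\n\t=]+")


def sanitize_column_name(name):
    """Replace each run of invalid characters with '_' and trim the ends."""
    return _INVALID.sub("_", name).strip("_")


def sanitize_columns(df):
    """Return df with every column renamed via sanitize_column_name.

    Hypothetical helper: relies only on df.columns and df.toDF(*names).
    Note that two different names could collapse to the same cleaned name.
    """
    return df.toDF(*[sanitize_column_name(c) for c in df.columns])


print(sanitize_column_name("year(CAST(plt_date AS DATE))"))
# -> year_CAST_plt_date_AS_DATE
```

With a real dataframe, the call would look like sanitize_columns(data) before writing. Explicit aliases in the query itself remain the clearer option when there are only a few columns.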
Upvotes: 3