min heo

Reputation: 151

How to solve this Spark Scala SQL error message

To remove duplicate rows, I tried this SQL:

val characters = MongoSpark.load[sparkSQL.Character](sparkSession)
characters.createOrReplaceTempView("characters")
val testsql = sparkSession.sql("SELECT * FROM characters GROUP BY title")
testsql.show()

But this SQL produces the error message below. If you know what the problem is, please answer this question.

Thank you.

Parsing command: SELECT * FROM characters GROUP BY title
Exception in thread "main" org.apache.spark.sql.AnalysisException: 
expression 'characters.`url`' is neither present in the group by, nor is it an aggregate function
Add to group by or wrap in first() if you don't care which value you get.;;

Then I tried the following, but I don't know whether it is the right solution.

Please answer this question. Thank you!

val characters = MongoSpark.load[sparkSQL.Character](sparkSession)
characters.createOrReplaceTempView("characters")
val testsql = sparkSession.sql("SELECT * FROM characters")
val testgrsql = testsql.groupBy("title")
testgrsql.show()

Upvotes: 1

Views: 69

Answers (1)

mrsrinivas

Reputation: 35404

The error message explains everything:

Parsing command: SELECT * FROM characters GROUP BY title

Exception in thread "main" org.apache.spark.sql.AnalysisException: expression 'characters.url' is neither present in the group by, nor is it an aggregate function

Add to group by or wrap in first() if you don't care which value you get.;;

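Every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function, which is why SELECT * fails here. So if you want the first url value for each title, use first(url):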

characters.createOrReplaceTempView("characters")
val testsql = sparkSession.sql("SELECT title, first(url) FROM characters GROUP BY title")
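
For reference, here is a minimal, self-contained sketch of the same idea. It assumes a local SparkSession and a few made-up rows in place of the MongoDB collection; the object name, helper names, and sample URLs are hypothetical and not from the question. It also shows the DataFrame equivalent of the groupBy attempt (groupBy returns a RelationalGroupedDataset, so it needs an agg(...) before show()) and dropDuplicates, which may be closer to the goal of removing duplicate rows. Which url survives for each title is not guaranteed in any of the three variants.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object DedupByTitle {

  // DataFrame form of "SELECT title, first(url) ... GROUP BY title".
  // groupBy alone yields a RelationalGroupedDataset; agg(...) turns it back into a DataFrame.
  def firstUrlPerTitle(characters: DataFrame): DataFrame =
    characters.groupBy("title").agg(first("url").as("url"))

  // Keeps one full row per title without naming every other column.
  def onePerTitle(characters: DataFrame): DataFrame =
    characters.dropDuplicates("title")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the question builds its session for MongoSpark.
    val spark = SparkSession.builder()
      .appName("dedup-by-title")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows standing in for MongoSpark.load(...).
    val characters = Seq(
      ("Alice", "http://example.com/a1"),
      ("Alice", "http://example.com/a2"),
      ("Bob", "http://example.com/b1")
    ).toDF("title", "url")
    characters.createOrReplaceTempView("characters")

    // SQL form: every selected column is either grouped or aggregated.
    val bySql = spark.sql(
      "SELECT title, first(url) AS url FROM characters GROUP BY title")

    bySql.show(false)
    firstUrlPerTitle(characters).show(false)
    onePerTitle(characters).show(false)

    spark.stop()
  }
}
```

If the collection has many columns, dropDuplicates("title") avoids wrapping each one in first(); the GROUP BY form is the better fit when you want a specific aggregate per column.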

Upvotes: 1
