PolarBear10

Reputation: 2305

Scala Spark read last row under specific column only

How can I modify the code below to fetch only the last row of the table, specifically the value in the key column? The table is huge, and I only need the last key value to know how much has been loaded so far; I do not care about the rest of the contents.

Line 1:

val df = spark.sqlContext.read.format("datasource")
  .option("project", "character")
  .option("apiKey", "xx")
  .option("type", "tables")
  .option("batchSize", "10000")
  .option("database", "humans")
  .option("table", "healthGamma")
  .option("inferSchema", "true")
  .option("inferSchemaLimit", "1")
  .load()

Line 2:

df.createTempView("tables")

Line 3:

spark.sqlContext.sql("select * from tables")
  .repartition(1)
  .write.option("header", "true")
  .parquet("lifes_remaining")

Upvotes: 0

Views: 1862

Answers (1)

Yash Shah

Reputation: 125

You can use orderBy on a DataFrame like this; hope it helps:

df.orderBy($"value".desc).show(1) 

Upvotes: 1
