Reputation: 21
I'm using Zeppelin 0.6.2 and Spark 2.0.
I'm trying to execute a query inside a loop, and it's not very efficient.
I need to loop over each row of a DataFrame (around 5000 rows) and execute a query that increments a value in another DataFrame.
Here's my try at it:
val t2 = time
t2.registerTempTable("t2")
u.collect().foreach { r =>
  println(r(0))
  val c = r(1)
  val start = "\"" + r(2) + "\""
  val end = "\"" + r(3) + "\""
  sql("INSERT INTO TABLE t2 SELECT time, recordings + " + c + " AS recordings FROM time WHERE time >= " + start + " AND time < " + end)
}
I tried taking a tiny portion of the two DataFrames, but it is still really slow. I feel like I'm not doing this right.
Any idea how I can update a DataFrame quickly?
Upvotes: 1
Views: 2578
Reputation: 74669
I need to loop for each row of a dataframe, around 5000 rows and execute a query which will increment a value in another dataframe.
I could spot the u, time and t2 tables. t2 is an alias for time so you can later use it in the INSERT query. Right?
PROTIP: It'd be nice to have their schemas.
Let's assume you've got a 5000-row DataFrame called df5k:
// it's a fake 5k = a mere 5 rows for the sake of simplicity
// I think `u` is your 5k table (that you unnecessarily `collect` to `foreach`)
import spark.implicits._ // for toDF and the 'col / $"col" syntax

val u = Seq(
  (0, 0, 0, 3),
  (1, 3, 4, 5),
  (2, 6, 6, 8),
  (3, 9, 9, 17)).toDF("id", "c", "start", "end")

// I think `t2` is an alias for `time` and you want to update `t2`
val time = Seq(
  (1, 10),
  (4, 40),
  (9, 90)).toDF("time", "recordings")

// this is the calculation of the new records
// (a condition-less join is a cartesian product; depending on your
// Spark version you may need spark.sql.crossJoin.enabled=true)
val new_t2 = u.join(time)
  .where('time >= 'start)
  .where('time < 'end)
  .withColumn("recordings + c", 'recordings + 'c)
  .select('time, $"recordings + c" as 'recordings)

// the following is an equivalent of INSERT INTO using the Dataset API
val solution = time.union(new_t2)
Note: you are not updating a DataFrame, but creating a new DataFrame with the new values. Like INSERT INTO, union appends the new rows alongside the original ones.
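If you want the incremented values to replace the originals rather than be appended, you can aggregate the increments per time and left-join them back. This is a hedged sketch built on the same u and time DataFrames as above; increments, inc and updated are names I made up, and I haven't run it against your real data:

```scala
import org.apache.spark.sql.functions.{coalesce, lit, sum}
import spark.implicits._

// total increment to apply to each matching time value
// (a row of `time` may match several rows of `u`, hence the sum)
val increments = u.join(time, 'time >= 'start && 'time < 'end)
  .groupBy('time)
  .agg(sum('c) as "inc")

// left join keeps every original row; coalesce treats "no match" as +0
val updated = time.join(increments, Seq("time"), "left_outer")
  .withColumn("recordings", 'recordings + coalesce('inc, lit(0)))
  .drop("inc")
```

Joining on Seq("time") rather than an expression keeps a single time column in the result, so no de-duplication of columns is needed afterwards.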
Upvotes: 1