Reputation: 21
First array: var keyColumns = "A,B".split(",")
Second array: var colValues = DataFrameTest.select("Y","Z").collect.map(row => row.toString)
colValues: Array[String] = Array([1,2], [3,4], [5,6])
I want a result like: Array([A=1,B=2], [A=3,B=4], [A=5,B=6])
so that I can later iterate over this array and build my where clause, like: where (A=1 AND B=2) OR (A=3 AND B=4) OR (A=5 AND B=6)
Upvotes: 0
Views: 168
Reputation: 40510
First, don't convert structured data to a string. Do .map(_.toSeq) after collect, not toString.
Then, something like this should work:
colValues
.map { _ zip keyColumns }
.map { _.map { case (v,k) => s"$k=$v" } }
.map { _.mkString("(", " AND ", ")") }
.mkString(" OR ")
You may find it helpful to run this step by step in the REPL and see what each line does.
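For trying this outside Spark, here is a minimal self-contained sketch of the same chain. The object name WhereClauseSketch and the helper buildWhere are hypothetical, and the hard-coded Seq(Seq(1, 2), ...) is an assumed stand-in for what DataFrameTest.select("Y","Z").collect.map(_.toSeq) would return.

```scala
object WhereClauseSketch {
  // Hypothetical helper: builds "(k1=v1 AND k2=v2) OR (...)" from
  // column names and rows of values.
  def buildWhere(keyColumns: Seq[String], rows: Seq[Seq[Any]]): String =
    rows
      .map(_ zip keyColumns)                    // pair each value with its column name
      .map(_.map { case (v, k) => s"$k=$v" })   // render each pair as "k=v"
      .map(_.mkString("(", " AND ", ")"))       // join the conditions of one row
      .mkString(" OR ")                         // join all rows

  def main(args: Array[String]): Unit = {
    val keyColumns = "A,B".split(",").toSeq
    // Assumed stand-in for the collected DataFrame rows.
    val colValues: Seq[Seq[Any]] = Seq(Seq(1, 2), Seq(3, 4), Seq(5, 6))
    println(buildWhere(keyColumns, colValues))
    // prints: (A=1 AND B=2) OR (A=3 AND B=4) OR (A=5 AND B=6)
  }
}
```

Note that the values are interpolated as-is, so string values would still need quoting (and escaping) before the result is used in a real where clause.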
Upvotes: 1
Reputation: 591
You can use a regular expression, like:
scala> val keyColumns = "A,B".split(",")
keyColumns: Array[String] = Array(A, B)
scala> val colValues = "[1,2] [3,4] [5,6]".split(" ")
colValues: Array[String] = Array([1,2], [3,4], [5,6])
scala> val pattern = """^\[(.{1}),(.{1})\]$""".r // here, (.{1}) is a capture group matching exactly one character (any character), so this pattern only handles single-character values
pattern: scala.util.matching.Regex = ^\[(.{1}),(.{1})\]$
scala> colValues.map { e => pattern.findFirstMatchIn(e).map { m => s"(${keyColumns(0)}=${m.group(1)} AND ${keyColumns(1)}=${m.group(2)})" }.getOrElse(e) }.mkString(" OR ")
res0: String = (A=1 AND B=2) OR (A=3 AND B=4) OR (A=5 AND B=6)
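The pattern above captures exactly one character per value and hard-codes two columns. A possible variation, sketched under the assumption that each element still looks like "[v1,v2,...]" and that the values themselves contain no commas, captures everything between the brackets and splits it. The names RegexWhereSketch, Bracketed and buildWhere are hypothetical.

```scala
object RegexWhereSketch {
  // Matches "[...]" and captures everything between the brackets.
  private val Bracketed = """^\[(.*)\]$""".r

  // Hypothetical helper: pairs each captured value with its column name.
  def buildWhere(keyColumns: Seq[String], colValues: Seq[String]): String =
    colValues.map {
      case Bracketed(inner) =>
        keyColumns.zip(inner.split(",").toSeq)     // assumes no commas inside values
          .map { case (k, v) => s"$k=$v" }
          .mkString("(", " AND ", ")")
      case other => other                          // left unchanged, like getOrElse(e) above
    }.mkString(" OR ")

  def main(args: Array[String]): Unit = {
    val keyColumns = "A,B".split(",").toSeq
    val colValues = Seq("[10,20]", "[3,4]", "[5,6]")
    println(buildWhere(keyColumns, colValues))
    // prints: (A=10 AND B=20) OR (A=3 AND B=4) OR (A=5 AND B=6)
  }
}
```

This still parses values back out of strings, so keeping the data structured (as the other answer suggests) avoids the parsing step entirely.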
Upvotes: 0