Reputation: 11
I have two DataFrames, df_one and df_two, like below:
df_one.show()
-------------
|Column_Name|
-------------
|NAME       |
|ID         |
|COUNTRY    |
-------------
df_two.show()
-------------
|_c0|_c1|_c2|
-------------
|AAA|001|US |
|BBB|002|UK |
|CCC|003|IN |
|DDD|004|FR |
-------------
I am trying to rename the columns of df_two like below:
------------------
|NAME|ID |COUNTRY|
------------------
|AAA |001|US     |
|BBB |002|UK     |
|CCC |003|IN     |
|DDD |004|FR     |
------------------
For the time being, I created a Seq by hand and got the above result:
val newColumn = Seq("NAME", "ID", "COUNTRY")
val df = df_two.toDF(newColumn:_*)
But now I have to read the names from the Column_Name column of df_one and rename the columns of df_two accordingly.
I also tried to read the column values from df_one, but that returns Seq[Any] and I need Seq[String].
Please guide me with some code here.
Upvotes: 1
Views: 327
Reputation: 22439
Here's a solution in Scala.
Since df_one is a small dataset (even if the total number of columns is in the thousands), one can collect the DataFrame as an Array. Collecting the DataFrame results in an Array of Rows:
df_one.collect
// res1: Array[org.apache.spark.sql.Row] = Array([NAME], [ID], [COUNTRY])
To unwrap the Rows (each holding a single String), simply apply the Row method getString:
df_one.collect.map(_.getString(0))
// res2: Array[String] = Array(NAME, ID, COUNTRY)
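As a side note on where Seq[Any] comes from: the untyped Row accessors (get, or apply) return Any, so mapping over them yields a collection of Any. A minimal sketch of the difference, and of a typed alternative via as[String], is below. The local SparkSession setup and the names untyped and typed are assumptions for illustration; in spark-shell, spark and its implicits are already available.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("seq-any-vs-seq-string-sketch")
  .getOrCreate()
import spark.implicits._

val df_one = Seq("NAME", "ID", "COUNTRY").toDF("Column_Name")

// Row.get(i) is untyped and returns Any, which is how Seq[Any] arises.
val untyped: Seq[Any] = df_one.collect().map(_.get(0)).toSeq

// Converting to a Dataset[String] first gives typed values directly.
val typed: Seq[String] = df_one.select("Column_Name").as[String].collect().toSeq
```

Either getString(0) or as[String] should give a Seq[String] that can be passed to toDF; which one reads better is mostly a matter of taste.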
Putting it all together:
val df_one = Seq(
"NAME", "ID", "COUNTRY"
).toDF("Column_Name")
val df_two = Seq(
("AAA", "001", "US"),
("BBB", "002", "UK"),
("CCC", "003", "IN"),
("DDD", "004", "FR")
).toDF("_c0", "_c1", "_c2")
val colNames = df_one.collect.map(_.getString(0))
df_two.toDF(colNames: _*).show
// +----+---+-------+
// |NAME| ID|COUNTRY|
// +----+---+-------+
// | AAA|001| US|
// | BBB|002| UK|
// | CCC|003| IN|
// | DDD|004| FR|
// +----+---+-------+
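One thing worth guarding against is a mismatch between the number of names in df_one and the number of columns in df_two. As far as I know, toDF fails on its own in that case, but an explicit check can give a clearer message. The sketch below wraps the steps above in a hypothetical helper, renameFrom, which is not a Spark API. It also assumes the rows of df_one come back from collect in the intended column order, which holds for this small local example but is not something Spark guarantees in general.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("rename-guard-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical helper (not a Spark API): rename every column of `target`
// using the first (String) column of `names`, in collected row order.
def renameFrom(names: DataFrame, target: DataFrame): DataFrame = {
  val colNames = names.collect().map(_.getString(0))
  // Fail early with a readable message if the counts do not line up.
  require(
    colNames.length == target.columns.length,
    s"Expected ${target.columns.length} column names but found ${colNames.length}"
  )
  target.toDF(colNames: _*)
}

val df_one = Seq("NAME", "ID", "COUNTRY").toDF("Column_Name")
val df_two = Seq(("AAA", "001", "US"), ("BBB", "002", "UK")).toDF("_c0", "_c1", "_c2")

val renamed = renameFrom(df_one, df_two)
```

If df_one is read from a file and the order matters, carrying an explicit position column and sorting on it before collecting may be safer.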
Upvotes: 2