Reputation: 265
I have data like the following in a dataframe. Note that Contents is the only column, and the dataframe has a single record that holds all of the data. Within that record, the first line is the header and lines are separated by LF.
How can I generate a new dataframe with 3 columns and the corresponding data?
display(df)
Contents
============================
"DateNum","MonthNum","DayName"
"19910101","1","Tue"
"19910102","1","Wed"
"19910103","1","Thu"
Upvotes: 1
Views: 377
Reputation: 42422
You can split the string on newlines to get an RDD[String], convert that to a Dataset[String] with toDS, and let the CSV reader parse it into a dataframe:
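import spark.implicits._  // provides .toDS (already in scope in spark-shell)
val lines = df.rdd.flatMap(_.getString(0).split("\n")).toDS  // one element per CSV line
val df2 = spark.read.option("header", true).csv(lines)  // first line becomes the header
df2.show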
+--------+--------+-------+
| DateNum|MonthNum|DayName|
+--------+--------+-------+
|19910101|       1|    Tue|
|19910102|       1|    Wed|
|19910103|       1|    Thu|
+--------+--------+-------+
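For anyone who wants to try this outside spark-shell, here is a minimal self-contained sketch of the same approach. The object name SplitContents, the helper splitContents, and the local[*] master are assumptions made for illustration, not part of the original answer. It also splits on CRLF as well as LF and drops blank lines, which goes slightly beyond what the question asked for.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SplitContents {
  // Hypothetical helper: turns a one-column string DataFrame whose cells
  // hold whole CSV documents into a parsed, multi-column DataFrame.
  def splitContents(spark: SparkSession, df: DataFrame): DataFrame = {
    import spark.implicits._
    val lines = df.as[String]                      // requires exactly one string column
      .flatMap(text => text.split("\r?\n"))        // accept LF or CRLF line endings
      .filter(line => line.trim.nonEmpty)          // ignore blank lines
    // csv(Dataset[String]) parses each element as one CSV line.
    spark.read.option("header", true).csv(lines)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to make the sketch runnable.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-contents")
      .getOrCreate()
    import spark.implicits._

    // Rebuild the single-record dataframe described in the question.
    val contents = Seq(
      "\"DateNum\",\"MonthNum\",\"DayName\"",
      "\"19910101\",\"1\",\"Tue\"",
      "\"19910102\",\"1\",\"Wed\"",
      "\"19910103\",\"1\",\"Thu\""
    ).mkString("\n")
    val df = Seq(contents).toDF("Contents")

    val df2 = splitContents(spark, df)
    df2.printSchema()
    df2.show()

    spark.stop()
  }
}
```

All three columns come back as strings here. If typed columns are needed, casting them afterwards or passing an explicit schema to the reader are the usual options.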
Upvotes: 2