Reputation: 565
I've the Hive table data, i would need some help on transforming the data into a below shown "Expected data form".
Two things on the note:
Is there a way i can achieve this using Spark-sql, or do i need write some scala UDF. I need the solution in spark-scala. Any help would be much appreciated.
Hive Table data:
_______________________________________________________________________________________________________
| vin | tt |msg_type | abs |ada | adw | alt |
|___________________|_____|_________|_______|________________________|_______|__________________________|
| FU7XXXXXXXXXXXXXX | 0 |SIGNAL | (null)|(null) | (null)|[{"E":15XXXXXXXX,"V":0.0}]|
|__________________ |_____|_________|______ |________________________|_______|__________________________|
| FSXXXXXXXXXXXXXXX | 0 |SIGNAL | (null)|[{"E":15XXXXXXXX,"V":1}]| (null)| (null) |
|___________________|_____|_________|_______|________________________|_______|__________________________|
Expected data:
_______________________________________________________________________
| vin | tt |msg_type | EVENTS |
|___________________|_____|_________|______________________________________|
| FU7XXXXXXXXXXXXXX | 0 |SIGNAL | [{"SN":"alt","E":15XXXXXXXX,"V":0.0}]|
|__________________ |_____|_________|______ _______________________________|
| FSXXXXXXXXXXXXXXX | 0 |SIGNAL | [{"SN":"ada","E":15XXXXXXXX,"V":1}] |
|___________________|_____|_________|______________________________________|
Upvotes: 0
Views: 243
Reputation: 2431
I have updated your input data and also declaring one String variable to make UDF generic.
scala> df.show(false)
+-----------------+---+--------+--------------------------+--------------------------+--------------------------+--------------------------+
|vin |tt |msg_type|abs |ada |adw |alt |
+-----------------+---+--------+--------------------------+--------------------------+--------------------------+--------------------------+
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |null |null |null |[{"E":15XXXXXXXX,"V":0.0}]|
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |null |[{"E":15XXXXXXXX,"V":1}] |null |null |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"E":15XXXXXXXX,"V":2}] |null |null |null |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |null |null |[{"E":15XXXXXXXX,"V":3}] |null |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |null |null |[{"E":15XXXXXXXX,"V":4.1}]|[{"E":15XXXXXXXX,"V":4.2}]|
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"E":15XXXXXXXX,"V":5.1}]|null |[{"E":15XXXXXXXX,"V":5.2}]|null |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"E":15XXXXXXXX,"V":6.1}]|[{"E":15XXXXXXXX,"V":6.2}]|[{"E":15XXXXXXXX,"V":6.3}]|null |
+-----------------+---+--------+--------------------------+--------------------------+--------------------------+--------------------------+
//String of column name that we need to check for null in Event
scala> val SingalColumns = "abs,ada, adw,alt"
SingColumns: String = abs,ada, adw,alt
//UDF declaration
scala> def EventUDF:UserDefinedFunction = udf((flagCol:String, r:Row) => {var signal = ""
| val flagColList:List[String] = flagCol.reverse.split(s""",""").map(x => x.trim).mkString(",").reverse.split(s",").toList
| flagColList.foreach{ x =>
| if (r.getAs(x) != null){
| signal= signal + "," + """{"SN":""" + x.toString +"," + r.getAs(x).toString.replaceFirst("\\[\\{","").replaceFirst("\\]","")
| }
| }
| signal.replaceFirst(s""",""", """[""").concat("]")
| })
//final DataFrame
scala> df.withColumn("Event", EventUDF(lit(SingalColumns),struct(df.columns map col: _*))).select("vin","tt","msg_type","Event").show(false)
+-----------------+---+--------+-------------------------------------------------------------------------------------------------------+
|vin |tt |msg_type|Event |
+-----------------+---+--------+-------------------------------------------------------------------------------------------------------+
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":alt,"E":15XXXXXXXX,"V":0.0}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":ada,"E":15XXXXXXXX,"V":1}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":abs,"E":15XXXXXXXX,"V":2}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":adw,"E":15XXXXXXXX,"V":3}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":adw,"E":15XXXXXXXX,"V":4.1},{"SN":alt,"E":15XXXXXXXX,"V":4.2}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":abs,"E":15XXXXXXXX,"V":5.1},{"SN":adw,"E":15XXXXXXXX,"V":5.2}] |
|FU7XXXXXXXXXXXXXX|0 |SIGNAL |[{"SN":abs,"E":15XXXXXXXX,"V":6.1},{"SN":ada,"E":15XXXXXXXX,"V":6.2},{"SN":adw,"E":15XXXXXXXX,"V":6.3}]|
+-----------------+---+--------+-------------------------------------------------------------------------------------------------------+
Upvotes: 1