user2201536
user2201536

Reputation: 109

Consolidate each row of dataframe returning a dataframe into ouput dataframe

I am looking for help in a scenario where I have a scala dataframe PARENT. I need to

Final output required :
        id  parentname  queryid   name
        1   X           23       lobo
        1   X           45       sobo
        1   X           56       aobo
        2   Y           53       lama
        2   Y           67       dama
        2   Y           56       pama

Update1:

I tried using foreachpartition and use foreach internally to loop through each record and got below error.

error: Unable to find encoder for type org.apache.spark.sql.DataFrame. An implicit Encoder[org.apache.spark.sql.DataFrame] is needed to store org.apache.spark.sql.DataFrame instances in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
             falttenedData.map(row=>{


I need to do this with scalability plz. Any help is really appreciated. 

Upvotes: 1

Views: 139

Answers (2)

Fragan
Fragan

Reputation: 892

The solution is pretty straightforward, you just need to join your parentDF and your other one.

parentDF.join(
    otherDF,
    Seq("id"),
    "left"
)

As you're caring about scalability, In case your "otherDF" is quite small (it has less than 10K rows for example with 2-3 cols), you should consider using broadcast join : parentDF.join(broadcast(otherDF), Seq("id"), "left).

Upvotes: 2

Koedlt
Koedlt

Reputation: 5963

You can use the .join method on a dataframe for this one.

Some example code would be something like this:

val df = Seq((1, "X"), (2, "Y")).toDF("id", "parentname")
df.show
+---+----------+                                                                                                                                                                                                                                                                
| id|parentname|                                                                                                                                                                                                                                                                
+---+----------+                                                                                                                                                                                                                                                                
|  1|         X|                                                                                                                                                                                                                                                                
|  2|         Y|                                                                                                                                                                                                                                                                
+---+----------+

val df2 = Seq((1, 23, "lobo"), (1, 45, "sobo"), (1, 56, "aobo"), (2, 53, "lama"), (2, 67, "dama"), (2, 56, "pama")).toDF("id", "queryid",  "name")
df2.show
+---+-------+----+                                                                                                                                                                                                                                                              
| id|queryid|name|                                                                                                                                                                                                                                                              
+---+-------+----+                                                                                                                                                                                                                                                              
|  1|     23|lobo|                                                                                                                                                                                                                                                              
|  1|     45|sobo|                                                                                                                                                                                                                                                              
|  1|     56|aobo|                                                                                                                                                                                                                                                              
|  2|     53|lama|                                                                                                                                                                                                                                                              
|  2|     67|dama|                                                                                                                                                                                                                                                              
|  2|     56|pama|                                                                                                                                                                                                                                                              
+---+-------+----+

val output=df.join(df2, Seq("id"))
output.show
+---+----------+-------+----+                                                                                                                                                                                                                                                   
| id|parentname|queryid|name|                                                                                                                                                                                                                                                   
+---+----------+-------+----+                                                                                                                                                                                                                                                   
|  1|         X|     23|lobo|                                                                                                                                                                                                                                                   
|  1|         X|     45|sobo|                                                                                                                                                                                                                                                   
|  1|         X|     56|aobo|                                                                                                                                                                                                                                                   
|  2|         Y|     53|lama|                                                                                                                                                                                                                                                   
|  2|         Y|     67|dama|                                                                                                                                                                                                                                                   
|  2|         Y|     56|pama|                                                                                                                                                                                                                                                   
+---+----------+-------+----+

Hope this helps! :)

Upvotes: 1

Related Questions