Reputation: 71
I am trying to split each tuple of ints into two rows in an RDD.
vertices=edges.map(lambda x:(x[0],)).union(edges.map(lambda x:(x[1],))).distinct()
This code works, but I would like something that runs faster, without using the GraphFrames package.
Upvotes: 0
Views: 414
Reputation: 3276
You can use flatMap:
edges.flatMap(lambda x: x).distinct()
In Scala, you would simply call .flatMap(identity) instead.
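To see why flatMap gives the same set of vertices in a single pass, here is a rough plain-Python sketch that mimics the semantics on a local list, with no Spark required. The sample `edges` list is made up for illustration, and `itertools.chain.from_iterable` stands in for `flatMap`. Note that the flatMap version yields bare ints, whereas the code in the question wraps each vertex in a 1-tuple.

```python
from itertools import chain

# Hypothetical sample data standing in for the `edges` RDD.
edges = [(1, 2), (2, 3), (3, 1), (4, 2)]

# Approach from the question: two passes over the edges, then a union.
# (Bare ints are used here instead of 1-tuples to make the comparison direct.)
union_vertices = {e[0] for e in edges} | {e[1] for e in edges}

# flatMap-style approach: flatten every tuple in one pass, then deduplicate.
flat_vertices = set(chain.from_iterable(edges))

print(sorted(flat_vertices))
```

If downstream code expects 1-tuples, as in the question, something like `edges.flatMap(lambda x: x).distinct().map(lambda v: (v,))` should restore that shape.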
If you use the DataFrame API, you can just use explode on your only column, e.g. df.select(explode("edge"))
Upvotes: 1