NKijak

Reputation: 1172

Spark dataset unzip function

I have a Dataset[(A, B)]. I'm looking for something like unzip: Dataset[(A, B)] => (Dataset[A], Dataset[B]). What are my options? I can't find anything in the Dataset API. Do I need to drop down to RDDs and convert back?

The Dataset[(A, B)] comes from a join. Are joins 'cheap' enough to do the join twice, just in reverse? That seems excessive, since the two sets are there already.

Upvotes: 2

Views: 633

Answers (1)

NKijak

Reputation: 1172

One solution, which should have been obvious I guess, is simply to map over the dataset twice:

val a = ds.map(_._1)
val b = ds.map(_._2)
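For completeness, here is a minimal sketch of the two maps wrapped in a helper. The names `UnzipSketch` and `unzip` are hypothetical and not part of the Dataset API. The `cache()` call is an addition that is not in the answer above: without it, each projection would recompute the upstream join when an action runs, so caching may help if the joined data fits comfortably in memory or on disk. Whether it pays off depends on the workload.

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

object UnzipSketch {
  // Hypothetical helper: splits a Dataset of pairs into two Datasets.
  // Encoders for A and B must be in implicit scope (e.g. via spark.implicits._).
  def unzip[A: Encoder, B: Encoder](ds: Dataset[(A, B)]): (Dataset[A], Dataset[B]) = {
    // Assumption: caching is worthwhile so the upstream plan (e.g. the join)
    // is evaluated once and reused by both projections.
    val cached = ds.cache()
    (cached.map(_._1), cached.map(_._2))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unzip-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the joined Dataset[(A, B)] from the question.
    val pairs = Seq((1, "a"), (2, "b"), (3, "c")).toDS()
    val (as, bs) = unzip(pairs)

    as.show()
    bs.show()

    spark.stop()
  }
}
```

If the cached data is no longer needed after both sides have been consumed, calling `unpersist()` on the original dataset releases it.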

Upvotes: 2
