Abir Chokraborty

Reputation: 1765

What is an efficient way to get the unique List[String] values from an Array[List[String]]?

I want to find the unique List[String] values in an Array[List[String]].

For example, suppose we have an Array containing the following List[String] values:

[a, b, c]
[a, b]
[a, b]
[a, c]

The expected result would be

[a, b, c]
[a, b]
[a, c]

Upvotes: 0

Views: 106

Answers (1)

USB

Reputation: 6139

Yes. You can call .distinct on an Array[List[String]].

def distinct: Array[List[String]]

Builds a new mutable indexed sequence from this mutable indexed sequence without any duplicate elements.

Returns

A new mutable indexed sequence which contains the first occurrence of every element of this mutable indexed sequence.
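As a minimal sketch of what that documentation describes, here is the same call in plain Scala with no other dependencies. The object name DistinctSketch and the helper uniqueLists are made up for illustration. On efficiency: in the standard library versions I have looked at, distinct keeps a hash set of the elements already seen, so it should be a single pass over the array rather than a pairwise comparison, but treat that as an implementation detail rather than a guarantee.

```scala
object DistinctSketch {
  // Hypothetical helper (not part of any library): returns the unique lists,
  // keeping the first occurrence of each one in its original position.
  def uniqueLists(lists: Array[List[String]]): Array[List[String]] =
    lists.distinct

  def main(args: Array[String]): Unit = {
    val input = Array(List("a", "b", "c"), List("a", "b"), List("a", "b"), List("a", "c"))
    val unique = uniqueLists(input)
    // Prints: Array(List(a, b, c), List(a, b), List(a, c))
    println(unique.mkString("Array(", ", ", ")"))
  }
}
```

The order of the result follows the order of first occurrences in the input, which matches the expected result in the question.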

Try the snippet below:

import org.apache.spark.sql.SparkSession

object StackTest {
  def main(args: Array[String]): Unit = {

    System.setProperty("hadoop.home.dir", "C:\\hadoop")
    val spark = SparkSession
      .builder()
      .config("spark.master", "local[1]")
      .appName("StackOverFlow")
      .getOrCreate()

    spark.sparkContext.setLogLevel("WARN")

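    // Define an Array[List[String]]. distinct is a plain Scala collections
    // method, so the SparkSession above is not needed for this step.
    val myArrList = Array(List("a","b","c"),List("a","b"),List("a","b"),List("a","c"))
    // mkString prints the same text as .deep, which was removed in Scala 2.13
    println("ArrayList: " + myArrList.mkString("Array(", ", ", ")"))

    val distinctMyArrList = myArrList.distinct
    println("Distinct ArrayList: " + distinctMyArrList.mkString("Array(", ", ", ")"))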

  }
}

OUTPUT

ArrayList: Array(List(a, b, c), List(a, b), List(a, b), List(a, c))
Distinct ArrayList: Array(List(a, b, c), List(a, b), List(a, c))
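One caveat worth knowing: this works because List compares its elements for equality. If the inner collections were Arrays instead, distinct would compare them by reference and keep the duplicates. The sketch below illustrates the difference; the names ArrayEqualitySketch and uniqueByContent are hypothetical, and converting to List first is just one possible workaround.

```scala
object ArrayEqualitySketch {
  // Hypothetical helper: converts each inner array to a List so that
  // distinct compares contents instead of object identity.
  def uniqueByContent(rows: Array[Array[String]]): Array[List[String]] =
    rows.map(_.toList).distinct

  def main(args: Array[String]): Unit = {
    val rows = Array(Array("a", "b"), Array("a", "b"), Array("a", "c"))
    // Arrays use reference equality, so no row is dropped here.
    println(rows.distinct.length)         // prints 3
    // Lists use element-wise equality, so the duplicate row is dropped.
    println(uniqueByContent(rows).length) // prints 2
  }
}
```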

Upvotes: 1
