BdEngineer

Reputation: 3179

Constructing Java hashmap from Spark dataframe

I am using spark-sql 2.4.1 with Java 8 in my project.

I need to construct a lookup hashmap from a given dataframe, built as below (sketched in Scala-style syntax):

    val ll = Seq(
      ("aaaa", 11),
      ("aaa", 12),
      ("aa", 13),
      ("a", 14)
    )

    val codeValudeDf = ll.toDF("code", "value")

Given the above dataframe, I need to create a hashmap,

i.e.

Map<String, String> lookUpHm = new HashMap<>();

lookUpHm => aaaa->11, aaa->12, aa->13, a->14

How can it be done in Java?

Upvotes: 0

Views: 4303

Answers (2)

Som

Reputation: 6323

Try this:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.Metadata;
    import org.apache.spark.sql.types.StructType;

    // Build the sample dataframe with an explicit schema ("spark" is an existing SparkSession)
    List<Row> rows = Arrays.asList(
            RowFactory.create("aaaa", 11),
            RowFactory.create("aaa", 12),
            RowFactory.create("aa", 13),
            RowFactory.create("a", 14)
    );

    Dataset<Row> codeValudeDf = spark.createDataFrame(rows, new StructType()
            .add("code", DataTypes.StringType, true, Metadata.empty())
            .add("value", DataTypes.IntegerType, true, Metadata.empty()));

    // Collect the small lookup dataframe to the driver and fill the map
    Map<String, Integer> map = new HashMap<>();
    codeValudeDf.collectAsList().forEach(row -> map.put(row.getString(0), row.getInt(1)));

    System.out.println(map.entrySet().stream().map(e -> e.getKey() + "->" + e.getValue())
            .collect(Collectors.joining(", ", "[ ", " ]")));
    // [ aaa->12, aa->13, a->14, aaaa->11 ]
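
If a one-liner is preferred, the same collect-and-fill step can also be written with a stream collector. This is just a minimal sketch over the codeValudeDf built above; note that Collectors.toMap throws on duplicate codes, which is usually what you want for a lookup table:

    Map<String, Integer> lookUpHm = codeValudeDf.collectAsList().stream()
            .collect(Collectors.toMap(r -> r.getString(0), r -> r.getInt(1)));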

Upvotes: 2

code.gsoni

Reputation: 695

Simply add a new column of map type using withColumn (with map and col statically imported from org.apache.spark.sql.functions) and do a collect on your dataframe.

    codeValudeDf.withColumn("some_map", map(col("code"), col("value")))
            .select("some_map")
            .distinct()
            .collect();
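
One caveat: each collected row then holds a one-entry map rather than a ready-made java.util.Map, and Spark may refuse distinct() on a map-typed column, so a small merging step on the driver is still needed. A minimal sketch along those lines, reusing codeValudeDf and the static imports above and applying distinct() before the map column is built:

    Map<String, Integer> lookUpHm = new HashMap<>();
    codeValudeDf
            .distinct()                                             // de-duplicate the plain columns first
            .withColumn("some_map", map(col("code"), col("value")))
            .select("some_map")
            .collectAsList()
            .forEach(r -> lookUpHm.putAll(r.<String, Integer>getJavaMap(0)));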

Upvotes: 1
