Reputation: 3179
I am using spark-sql 2.4.1 with Java 8 in my project.
I need to construct a lookup hashmap from a given dataframe as below:
List ll = Arrays.asList(
("aaaa", 11),
("aaa", 12),
("aa", 13),
("a", 14)
)
Dataset<Row> codeValudeDf = ll.toDF( "code", "value")
Given the above dataframe, I need to create a hashmap, i.e.
Map<String, String> lookUpHm = new HashMap<>();
lookUpHm => aaaa->11 , aaa->12 , aa->13, a->14
How can it be done in Java?
Upvotes: 0
Views: 4303
Reputation: 6323
Try this-
import java.util.*;
import java.util.stream.Collectors;
import org.apache.spark.sql.*;
import org.apache.spark.sql.types.*;

// build the input dataframe with the "code" and "value" columns
// (spark is an existing SparkSession)
List<Row> rows = Arrays.asList(
        RowFactory.create("aaaa", 11),
        RowFactory.create("aaa", 12),
        RowFactory.create("aa", 13),
        RowFactory.create("a", 14)
);
Dataset<Row> codeValudeDf = spark.createDataFrame(rows, new StructType()
        .add("code", DataTypes.StringType, true, Metadata.empty())
        .add("value", DataTypes.IntegerType, true, Metadata.empty()));

// collect the rows to the driver and put each (code, value) pair into the map
Map<String, Integer> map = new HashMap<>();
codeValudeDf.collectAsList().forEach(row -> map.put(row.getString(0), row.getInt(1)));

System.out.println(map.entrySet().stream().map(e -> e.getKey() + "->" + e.getValue())
        .collect(Collectors.joining(", ", "[ ", " ]")));
// [ aaa->12, aa->13, a->14, aaaa->11 ]
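If you would rather not mutate the map in a loop, the same lookup can be assembled with a stream collector (a minimal sketch, reusing the codeValudeDf defined above):

// Collectors.toMap builds the lookup map directly from the collected rows
Map<String, Integer> lookUpHm = codeValudeDf.collectAsList().stream()
        .collect(Collectors.toMap(
                row -> row.getString(0),   // "code" column
                row -> row.getInt(1)));    // "value" column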
Upvotes: 2
Reputation: 695
Simply add a new column of type map using withColumn and collect the dataframe:
codeValudeDf.withColumn("some_map",
map(col("code"), col("value"))).select("some_map").distinct().collect()
Upvotes: 1