Reputation: 57
I have written a UDF that converts Map[String,String] values to String:
udf("mapToString", (input: Map[String,String]) => input.mkString(","))
but spark-shell gives me this error:
<console>:24: error: overloaded method value udf with alternatives:
(f: AnyRef,dataType: org.apache.spark.sql.types.DataType)org.apache.spark.sql.expressions.UserDefinedFunction <and>
...
cannot be applied to (String, Map[String,String] => String)
udf("mapToString", (input: Map[String,String]) => input.mkString(","))
Is there any method to convert a column of Map[String,String] values to String values? I need this conversion because I need to save the DataFrame as a CSV file.
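For reference, the overload that takes a name is spark.udf.register, while org.apache.spark.sql.functions.udf expects only the function itself. A minimal sketch of both forms, assuming the spark session that spark-shell provides:
import org.apache.spark.sql.functions.udf
// For the DataFrame API: pass only the function, no name.
val mapToString = udf((input: Map[String, String]) => input.mkString(","))
// To attach a name (e.g. for use in Spark SQL), register it instead.
spark.udf.register("mapToString", (input: Map[String, String]) => input.mkString(","))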
Upvotes: 2
Views: 2181
Reputation: 41957
Assuming that you have a DataFrame as
+---+--------------+
|id |map |
+---+--------------+
|1 |Map(200 -> DS)|
|2 |Map(300 -> CP)|
+---+--------------+
with the following schema
root
|-- id: integer (nullable = false)
|-- map: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
You can write a udf which looks like:
def mapToString = udf((map: collection.immutable.Map[String, String]) =>
  map.mkString.replace(" -> ", ","))
and use the udf function with the withColumn API as
df.withColumn("map", mapToString($"map"))
You should have the final DataFrame where the Map is changed to a String:
+---+------+
|id |map |
+---+------+
|1 |200,DS|
|2 |300,CP|
+---+------+
root
|-- id: integer (nullable = false)
|-- map: string (nullable = true)
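A caveat not in the original answer: mkString with no separator runs multiple entries together, so if a map can hold more than one key, a variant with explicit separators may be safer (the name mapToStringMulti and the ";" delimiter between pairs are my own choices):
def mapToStringMulti = udf((map: Map[String, String]) =>
  // join each key/value pair with "," and separate pairs with ";"
  map.map { case (k, v) => s"$k,$v" }.mkString(";"))
Since the end goal is a CSV file, once the map column is a plain string the DataFrame can be written out with the standard CSV writer (the output path below is only a placeholder):
df.withColumn("map", mapToString($"map"))
  .write
  .option("header", "true")
  .csv("/path/to/output")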
Upvotes: 4