Reputation: 329
What I want is to map column names into keys. For example, this table:
#+-------+----------+
#|key1 |key2 |
#+-------+----------+
#|value1 |value2 |
#|value3 |value4 |
#+-------+----------+
should be transformed into:
#+-------+----------+
#| keys|values |
#+-------+----------+
#|key1 |value1 |
#|key1 |value2 |
#|key2 |value3 |
#|key2 |value4 |
#+-------+----------+
In HiveQL I can write something like:
select distinct key, value
from xxx
lateral view explode(map(
'key1', key1,
'key2', key2)) tab as key, value
But how do I write this in PySpark? I could use createTempView, but I don't think that's the best solution.
Upvotes: 1
Views: 2715
Reputation: 32680
Use the create_map function to create a Map column and then explode it. create_map needs a list of column expressions that are grouped as key-value pairs. Such a list can be built with a comprehension over the DataFrame's columns:
from itertools import chain
from pyspark.sql.functions import col, lit, create_map, explode

data = [("value1", "value2"), ("value3", "value4")]
df = spark.createDataFrame(data, ["key1", "key2"])

# Build alternating (lit(name), col(name)) arguments for create_map:
# each column contributes a key (its name as a literal) and a value (the column itself)
key_values = create_map(*chain(*[(lit(name), col(name)) for name in df.columns]))

df.select(explode(key_values)).show()
+----+------+
| key| value|
+----+------+
|key1|value1|
|key2|value2|
|key1|value3|
|key2|value4|
+----+------+
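To see what the comprehension feeds to create_map, the flattening done by chain can be checked in plain Python. The strings below are stand-ins for the lit(...)/col(...) expressions, not Spark API calls:

```python
from itertools import chain

# Placeholder column names standing in for df.columns
columns = ["key1", "key2"]

# Each column yields a (key, value) pair; chain(*pairs) flattens the
# list of pairs into one alternating sequence: key, value, key, value, ...
pairs = [(name, f"col({name})") for name in columns]
flat = list(chain(*pairs))
print(flat)  # ['key1', 'col(key1)', 'key2', 'col(key2)']
```

create_map receives this flattened sequence as positional arguments, which is exactly the key1, value1, key2, value2, ... order it expects.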
Upvotes: 6
Reputation: 15273
Something like this?
select 'key1' as keys,
key1 as values
from xxx
union all
select 'key2' as keys,
key2 as values
from xxx
Put that in spark.sql().
Upvotes: 1