Converting values in each row into new columns with Spark

Question

I am creating a DataFrame from an XML file using Spark in Python. I want to turn each value in the comma-separated "classes" column into its own column, filled with a dummy variable (1 if the row contains that value, 0 otherwise).

Here is an example.

Input:

 id  | classes
-----+-----------------------
 132 | economics,engineering
 201 | engineering
 123 | sociology,philosophy
 222 | philosophy

Output:

 id  | economics | engineering | sociology | philosophy
-----+-----------+-------------+-----------+------------
 132 |         1 |           1 |         0 |          0
 201 |         0 |           1 |         0 |          0
 123 |         0 |           0 |         1 |          1
 222 |         0 |           0 |         0 |          1
