Abhinav Bhardwaj

Reputation: 167

Convert RDD of Lists to Dataframe

I am trying to convert an RDD of lists to a Dataframe in Spark.

RDD:

['ABC', 'AA', 'SSS', 'color-0-value', 'AAAAA_VVVV0-value_1', '1', 'WARNING', 'No test data for negative population! Re-using negative population for non-backtest.']
['ABC', 'SS', 'AA', 'color-0-SS', 'GG0-value_1', '1', 'Temp', 'After, date differences are outside tolerance (10 days) 95.1% of the time']

This is the content of the RDD: each element is a list.

How can I convert this to a DataFrame? Currently it is converted into a single column, but I need multiple columns.

Dataframe
+--------------+
|            _1|
+--------------+
|['ABC', 'AA...|
|['ABC', 'SS...|

Upvotes: 0

Views: 1953

Answers (1)

6fe2070c

Reputation: 61

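Use Row.fromSeq to turn each list into a Row, then pass the rows to createDataFrame together with a schema (toDF is not available on an RDD[Row], because Row has no implicit encoder):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Eight string columns, as in the sample lists; spark is the active SparkSession
val schema = StructType((1 to 8).map(i => StructField(s"_$i", StringType)))
spark.createDataFrame(rdd.map(x => Row.fromSeq(x)), schema)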
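If every list has the same, known length, a different route that skips Row entirely is to map each list to a tuple and name the columns in toDF. The following is only a sketch under assumptions: the RDD is taken to be an RDD[Seq[String]] with exactly eight elements per list, the session runs in local mode, and the object name and the column names c1 to c8 are placeholders rather than anything from the question.

```scala
import org.apache.spark.sql.SparkSession

object ListsToColumns {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("lists-to-columns")
      .getOrCreate()
    import spark.implicits._ // brings toDF for RDDs of tuples into scope

    // The two sample lists from the question, as an RDD[Seq[String]]
    val rdd = spark.sparkContext.parallelize(Seq(
      Seq("ABC", "AA", "SSS", "color-0-value", "AAAAA_VVVV0-value_1", "1", "WARNING",
        "No test data for negative population! Re-using negative population for non-backtest."),
      Seq("ABC", "SS", "AA", "color-0-SS", "GG0-value_1", "1", "Temp",
        "After, date differences are outside tolerance (10 days) 95.1% of the time")
    ))

    // Destructure each 8-element list into a tuple; a list of any other
    // length would fail here with a MatchError
    val df = rdd.map {
      case Seq(a, b, c, d, e, f, g, h) => (a, b, c, d, e, f, g, h)
    }.toDF("c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8") // placeholder column names

    df.show(truncate = false)
    spark.stop()
  }
}
```

The tuple form is convenient for a small, fixed number of columns; when the list length varies or is only known at runtime, building Rows with an explicit schema is the more flexible choice.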

Upvotes: 6
