Pedro Alves

Reputation: 1054

Convert Scala RDD Map Function to Pyspark

I'm trying to convert the following function from Scala to PySpark:

DF.rdd.map(args => (args(0).toString, args.mkString("|"))).take(5)

For that, I am making the following map function:

DF.rdd.map(lambda line: ",".join([str(x) for x in line])).take(5)

But the Scala code gives me an Array structure, while in Python I am getting a single delimited string.

How do I convert the above Scala code to Python?

Upvotes: 0

Views: 255

Answers (1)

OneCricketeer

Reputation: 191963

Your Scala code returns a two-element tuple from args: the first field as a string, and all fields joined with "|".

Your Python code returns a single comma-joined string.

This would return the same thing:

lambda args: [str(args[0]), "|".join(map(str, args))]
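For context, here is a minimal runnable sketch of that mapping. The DataFrame and its columns below are made up for illustration; the original DF is not shown in the question.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustrative stand-in for DF (hypothetical columns)
DF = spark.createDataFrame(
    [(1, "a", 10.0), (2, "b", 20.0)],
    ["id", "name", "value"],
)

# First field as a string, plus all fields joined with "|",
# mirroring the Scala (args(0).toString, args.mkString("|"))
result = DF.rdd.map(lambda args: (str(args[0]), "|".join(map(str, args)))).take(5)

print(result)
# [('1', '1|a|10.0'), ('2', '2|b|20.0')]

Returning a tuple here mirrors the Scala pair exactly; a two-element list as in the lambda above works the same way for most purposes.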

Upvotes: 1
