wheels

Reputation: 109

Pyspark RDD: convert to string

Using rddfloat = rdd.map(lambda x: (float(x[0]), float(x[1]))), I converted the columns of an RDD into floats so that I could do math with them. Now that I'm finished with the math, I want to convert them back to their original StringType.

I've tried rddstr = rddfloat.map(lambda x: (str(x[0]), str(x[1]), str(x[2]))), and it does return a string, '40.745555', but that's not the same as the value in the original RDD, u'40.745555'. What is the difference between these, and how can I convert it back to how it was originally?

Upvotes: 1

Views: 10345

Answers (1)

Markon

Reputation: 4600

I assume you are using Python 2.x. There, the u prefix marks a unicode object, while str() returns a byte string, so to produce a unicode string you need to call unicode on every column, like

rddstr = rddfloat.map(lambda x: (unicode(x[0]), unicode(x[1]), unicode(x[2])))
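
To make the conversion concrete, here is a minimal sketch that runs without a Spark cluster. The sample rows, the text_type alias and the to_text helper are hypothetical names made up for illustration, and a plain list stands in for the RDD. With a real RDD the same helper could presumably be passed to map, as noted in the final comment.

```python
import sys

# On Python 2 the text type is `unicode`; on Python 3 `str` is already unicode.
# The name `unicode` is only looked up when running under Python 2.
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821

# Hypothetical sample data standing in for the contents of rddfloat.
rows = [(40.745555, -73.98), (40.7, -74.0)]

def to_text(row):
    """Convert every column of a row to the text (unicode) type."""
    return tuple(text_type(value) for value in row)

# Plain-Python stand-in for the RDD transformation.
converted = [to_text(row) for row in rows]
print(converted[0])

# With PySpark (assumed to be available), the equivalent would be:
# rddstr = rddfloat.map(to_text)
```

Under Python 2 the printed values carry the u prefix again. Under Python 3 there is no prefix, because every str is already unicode.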

However, to get a better understanding of the differences, I would suggest searching online, because it's a pretty common question. In particular, this answer might help you: https://stackoverflow.com/a/18034409/126125

Upvotes: 3
