Reputation: 775
I have a column in my DataFrame which contains values like 99 230/256, i.e. a String containing a fraction. Its double representation is 99.8984375.
How do I apply a transformation that converts such Strings to Double in Spark? I'm using Spark 1.6.2.
Upvotes: 0
Views: 418
Reputation: 191874
Note: you have to define your own function to apply to the data. Spark simply applies it; there is no built-in feature that does what you are asking.
Since you didn't specify which API you are using, here's a Python answer over a simple collection.
Also, you can run and test this completely outside of Spark.
def convertFrac(frac):
    parts = frac.split()
    whole = numer = 0
    denom = 1
    if len(parts) == 2:
        # Mixed number, e.g. "99 230/256"
        whole = float(parts[0])
        numer, denom = map(float, parts[1].split('/'))
    elif len(parts) == 1:
        if '/' in parts[0]:
            # Bare fraction, e.g. "1/100"
            numer, denom = map(float, parts[0].split('/'))
        else:
            # Plain number, e.g. "5"
            return float(parts[0])
    return whole + (numer / denom)
Here's a sample run:
>>> sc.parallelize(["99 230/256", "1/100"]).map(convertFrac).collect()
[99.8984375, 0.01]
Warning: this doesn't work on all inputs (in particular negatives: "-2 3/5" would have to be written as "-2 -3/5"); it is only an example of what you need to do.
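If you want to handle negative mixed numbers without that rewrite, a sketch using the standard library's fractions.Fraction could look like this (convert_frac is a hypothetical name, not part of the original answer):

```python
from fractions import Fraction

def convert_frac(s):
    """Convert strings like "99 230/256", "1/100", "5", or "-2 3/5" to float."""
    parts = s.split()
    if len(parts) == 2:
        whole = Fraction(parts[0])
        frac = Fraction(parts[1])
        # The fractional part carries the same sign as the whole part,
        # so subtract it when the whole part is negative.
        return float(whole - frac if whole < 0 else whole + frac)
    # Fraction parses both bare fractions ("1/100") and plain numbers ("5").
    return float(Fraction(parts[0]))
```

This parses exactly, then converts to float once at the end, so "-2 3/5" comes out as -2.6 as expected.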
Upvotes: 2