van_d39

Reputation: 775

String (with fraction) to Double in Spark

I have a column in my DataFrame which contains values like 99 230/256. It's a String with a fraction. Its double representation is 99.8984375.

How do I apply a transformation that converts such Strings to Double in Spark? I'm using Spark 1.6.2.

Upvotes: 0

Views: 418

Answers (1)

OneCricketeer

Reputation: 191874

Note: you have to define your own function to apply to the data; Spark just applies that function. There is no built-in feature that does what you are asking.

Since you didn't specify which API you are using, here's a Python answer over a simple collection.

Also, you can run and test this completely outside of Spark.

def convertFrac(frac):
    # Split on whitespace: "99 230/256" -> ["99", "230/256"]
    parts = frac.split()
    whole = numer = 0
    denom = 1
    if len(parts) == 2:
        # Whole number followed by a fraction
        whole = float(parts[0])
        numer, denom = map(float, parts[1].split('/'))
    elif len(parts) == 1:
        if '/' in parts[0]:
            # Fraction only, e.g. "1/100"
            numer, denom = map(float, parts[0].split('/'))
        else:
            # Plain number with no fraction part
            return float(parts[0])
    return whole + (numer / denom)
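For example, you can test it directly in a plain Python interpreter, no Spark required:

>>> convertFrac("99 230/256")
99.8984375
>>> convertFrac("1/100")
0.01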

Here's a sample run in a PySpark shell:

>>> sc.parallelize(["99 230/256", "1/100"]).map(convertFrac).collect()
[99.8984375, 0.01]

Warning: this doesn't work on all inputs (in particular, negatives such as "-2 3/5" need to be written as "-2 -3/5"); it is only an example of what you need to do.
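Since your data is in a DataFrame column, you would typically wrap a function like this in a UDF. Here's a minimal sketch that should work on Spark 1.6 (the DataFrame df and the column name "frac" are assumptions, substitute your own):

from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType

# Register the Python function as a UDF that returns a Double
convert_frac_udf = udf(convertFrac, DoubleType())

# Add a new column with the converted values (column name "frac" is assumed)
df = df.withColumn("frac_double", convert_frac_udf(df["frac"]))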

Upvotes: 2
