data_coder

Reputation: 97

How to use a min-max scaler on a NumPy array in a PySpark environment?

Here is how I currently do it with sklearn's minmax_scale; however, sklearn does not integrate with PySpark. Is there an alternative way to do min-max scaling on an array in Spark? Thanks.

import numpy as np
from sklearn.preprocessing import minmax_scale

results = []
for i, a in enumerate(np.array_split(target, count)):
    start = q_l[i]
    if i == (count - 1):
        # the final chunk scales up to 1.0
        end = 1.0
    else:
        end = q_l[i + 1]
    target_scaled = minmax_scale(a, feature_range=(start, end))
    results.append(target_scaled)
results = np.concatenate(results)
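One workaround, since the scaling itself is simple arithmetic, is to drop the sklearn dependency entirely and reimplement minmax_scale in pure NumPy; such a function can then be applied inside a PySpark UDF (or to each chunk, as above) without needing sklearn on the executors. A minimal sketch — the helper name minmax_scale_np and the constant-array behavior (mapping everything to the lower bound) are my own choices, not from the question:

```python
import numpy as np

def minmax_scale_np(a, feature_range=(0.0, 1.0)):
    """Pure-NumPy stand-in for sklearn.preprocessing.minmax_scale.

    Linearly rescales a 1-D array so its min maps to feature_range[0]
    and its max maps to feature_range[1].
    """
    lo, hi = feature_range
    a = np.asarray(a, dtype=float)
    a_min, a_max = a.min(), a.max()
    span = a_max - a_min
    if span == 0.0:
        # constant input: every value maps to the lower bound
        return np.full_like(a, lo)
    return lo + (a - a_min) * (hi - lo) / span

# Example: scale a small array into the range (0.2, 0.8)
scaled = minmax_scale_np(np.array([1.0, 2.0, 3.0]), feature_range=(0.2, 0.8))
```

For scaling a DataFrame column rather than a raw array, PySpark also ships its own pyspark.ml.feature.MinMaxScaler, which operates on a Vector column produced by VectorAssembler; the NumPy route above is mainly useful when you need to stay in plain arrays.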

Upvotes: 1

Views: 219

Answers (0)
