data_coder

Reputation: 97

How to use a min-max scaler on a NumPy array in a PySpark environment?

Here is how I currently do it with sklearn's minmax_scale; however, sklearn does not integrate with PySpark. Is there an alternative way to do min-max scaling on an array in Spark? Thanks.

import numpy as np
from sklearn.preprocessing import minmax_scale

results = []
for i, a in enumerate(np.array_split(target, count)):
    start = q_l[i]
    if i == (count - 1):
        # the final chunk scales up to 1.0
        end = 1.0
    else:
        end = q_l[i + 1]
    target_scaled = minmax_scale(a, feature_range=(start, end))
    results.append(target_scaled)
results = np.concatenate(results)
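One workaround, since the scaling itself is simple arithmetic, is to drop the sklearn dependency entirely and reimplement minmax_scale in pure NumPy; such a function can then be applied inside a PySpark UDF (or to each chunk, as above) without needing sklearn on the executors. A minimal sketch — the helper name minmax_scale_np and the constant-array behavior (mapping everything to the lower bound) are my own choices, not from the question:

```python
import numpy as np

def minmax_scale_np(a, feature_range=(0.0, 1.0)):
    """Pure-NumPy stand-in for sklearn.preprocessing.minmax_scale.

    Linearly rescales a 1-D array so its min maps to feature_range[0]
    and its max maps to feature_range[1].
    """
    lo, hi = feature_range
    a = np.asarray(a, dtype=float)
    a_min, a_max = a.min(), a.max()
    span = a_max - a_min
    if span == 0.0:
        # constant input: every value maps to the lower bound
        return np.full_like(a, lo)
    return lo + (a - a_min) * (hi - lo) / span

# Example: scale a small array into the range (0.2, 0.8)
scaled = minmax_scale_np(np.array([1.0, 2.0, 3.0]), feature_range=(0.2, 0.8))
```

For scaling a DataFrame column rather than a raw array, PySpark also ships its own pyspark.ml.feature.MinMaxScaler, which operates on a Vector column produced by VectorAssembler; the NumPy route above is mainly useful when you need to stay in plain arrays.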

Upvotes: 1

Views: 219

Answers (0)
