Reputation: 2476
I think this is a pretty simple question but I wasn't able to find an answer.
I have an array:
array([ 62519, 261500, 1004836, ... , 0, 0])
I would like to convert it to a normal distribution with a min of 0 and a max of 1.
Any suggestions? I was looking at sklearn.preprocessing.normalize, but was unable to get it to work for me.
The purpose is that I am creating a scatterplot with numpy, and want to use this third variable to color each point. However, the colors have to be between 0 and 1, and because I have some weird outliers I figured a normal distribution would be a good start.
Let me know if this doesn't make any sense. Thanks & Cheers.
Upvotes: 2
Views: 16445
Reputation: 101
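I do not recommend using the standard normal distribution for normalization; please consider using the Frobenius/L2 norm instead. Any one of these works:
normalized_z = z / np.linalg.norm(z)  # NumPy: divide by the L2 (Euclidean) norm
normalized_z = z / math.sqrt(max(sum(z**2), 1e-12))  # same L2 norm, floored at 1e-12 to avoid dividing by zero
normalized_z = tf.nn.l2_normalize(z, 0)  # TensorFlow: L2-normalize along axis 0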
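For reference, here is a minimal self-contained sketch of the NumPy variant. The helper name l2_normalize, the eps default, and the sample values are my own assumptions for illustration, not part of any library:

```python
import numpy as np

def l2_normalize(z, eps=1e-12):
    """Scale z to unit L2 norm; eps (hypothetical default) guards against an all-zero input."""
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z)  # Euclidean length of the vector
    return z / max(norm, eps)

# Made-up sample values, chosen so the norm is easy to check by hand.
z = np.array([3.0, 4.0, 0.0])
normalized_z = l2_normalize(z)
print(normalized_z)
```

Note that this rescales the vector to length 1. For non-negative data the results land in [0, 1], but the largest value is generally not mapped to exactly 1.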
Upvotes: 1
Reputation: 2476
Oh I'm an idiot, I just wanted to standardize it and can just do z = (x - mean) / std. Sorry.
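In case it helps anyone else, a rough sketch of that in NumPy. The sample values are illustrative only, not the full array from the question. One caveat: standardizing gives zero mean and unit standard deviation, but the result is not confined to [0, 1], so I've also included min-max scaling as a possible option for the color values:

```python
import numpy as np

# Illustrative values only; the real array is much longer.
x = np.array([62519, 261500, 1004836, 0, 0], dtype=float)

# Standardization (z-score): zero mean, unit standard deviation.
# Assumes x is not constant, otherwise x.std() is 0.
z = (x - x.mean()) / x.std()

# Min-max scaling: maps the smallest value to 0 and the largest to 1.
span = x.max() - x.min()
scaled = (x - x.min()) / span if span > 0 else np.zeros_like(x)

print(z)
print(scaled)
```

With strong outliers, min-max scaling will squeeze most points toward one end of the range, so something like clipping or a log transform beforehand may be worth trying.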
Upvotes: 10