Elie Asmar

Reputation: 3165

Iterate and change values of python numpy matrix columns

I have a numpy matrix containing numbers.

1,0,1,1
0,1,1,1
0,0,1,0
1,1,1,1

I would like to perform z-score normalization on each column: z_score[y] = (y - mean(column)) / sqrt(var), where y is each element in the column, mean is the mean function, sqrt is the square root function, and var is the variance of the column.

My approach was the following:

x_trainT = x_train.T #transpose the matrix to iterate over columns
for item in x_trainT:
    m = item.mean()
    var = np.sqrt(item.var())
    item = (item - m)/var
x_train = x_trainT.T

I thought that during iteration each row is accessed by reference (as with C# lists, for instance), which would let me change the matrix values by changing the row values.
However, I was wrong: the matrix keeps its original values.

Your help is appreciated.

Upvotes: 0

Views: 1164

Answers (2)

Guillem

Reputation: 2647

I'd recommend avoiding iteration where possible. You can compute the mean and standard deviation column-wise by passing axis=0.

>>> import numpy as np
>>> x_train = np.random.random((5, 8))
>>> norm_x_train = (x_train  - x_train.mean(axis=0)) / x_train.std(axis=0)
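
As a rough sketch, here is the same idea applied to the 4x4 matrix from the question, assuming it is stored as a float array. The third column there is constant, so its standard deviation is 0; the `safe_std` guard is my own assumption about how to handle that case (it leaves such columns at 0), not something the question asks for.

```python
import numpy as np

# The matrix from the question, stored as floats (an assumption).
x_train = np.array([[1, 0, 1, 1],
                    [0, 1, 1, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], dtype=float)

mean = x_train.mean(axis=0)  # one mean per column
std = x_train.std(axis=0)    # one standard deviation per column

# Hypothetical guard: replace a zero std with 1 so constant columns
# become all zeros instead of NaN.
safe_std = np.where(std == 0, 1.0, std)

# Broadcasting subtracts and divides column by column.
norm_x_train = (x_train - mean) / safe_std
print(norm_x_train)
```

Without the guard, dividing by a zero standard deviation would fill that column with NaN and emit a runtime warning.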

Upvotes: 2

Daniel Nguyen

Reputation: 429

Assigning to item inside the loop only rebinds the loop variable, so you'll likely have to index by row number and assign back into the matrix:

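import numpy as np

x_trainT = x_train.T  # a view: writes to x_trainT also change x_train
for i in range(x_trainT.shape[0]):
    item = x_trainT[i]
    m = item.mean()
    sd = np.sqrt(item.var())
    x_trainT[i] = (item - m)/sd  # assign by index to write into the matrix
x_train = x_trainT.T  # transpose back, as in the question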
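
To illustrate why the original loop left the matrix unchanged, here is a minimal sketch contrasting rebinding the loop variable with writing through it. It assumes a float array (in-place writes into an integer array would be truncated) and, as my own assumption, skips constant columns to avoid dividing by zero.

```python
import numpy as np

# The matrix from the question, stored as floats (an assumption).
x_train = np.array([[1, 0, 1, 1],
                    [0, 1, 1, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], dtype=float)
original = x_train.copy()

# Rebinding: `item = ...` points the name at a new array; x_train is untouched.
for item in x_train.T:
    item = item - item.mean()
rebinding_changed = not np.array_equal(x_train, original)
print(rebinding_changed)

# In-place: `item[:] = ...` writes through the view into x_train.
for item in x_train.T:
    sd = item.std()
    if sd > 0:  # hypothetical guard: leave constant columns as they are
        item[:] = (item - item.mean()) / sd
print(x_train)
```

Each item yielded by iterating over x_train.T is a view of one column, which is why slice assignment reaches the original array while plain assignment does not.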

Upvotes: 1
