Reputation:
My goal is to write a function that (1) makes a copy of a numpy array, (2) modifies this copy, and (3) returns the modified copy. However, this doesn't work as I thought it would...
To show a simple example, let's assume I have a simple function for z-score normalization:
import numpy as np

def standardizing1(array, columns, ddof=0):
    ary_new = array.copy()
    if len(ary_new.shape) == 1:
        ary_new = ary_new[:, np.newaxis]
    return (ary_new[:, columns] - ary_new[:, columns].mean(axis=0)) /\
           ary_new[:, columns].std(axis=0, ddof=ddof)
And the results are what I expect:
>>> ary = np.array([[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]])
>>> standardizing1(ary, [0, 1])
array([[-1.46385011,  1.46385011],
       [-0.87831007,  0.87831007],
       [-0.29277002,  0.29277002],
       [ 0.29277002, -0.29277002],
       [ 0.87831007, -0.87831007],
       [ 1.46385011, -1.46385011]])
However, let's say I want to write the result back into the copy and return the modified copy. This doesn't work, and I am wondering why. For example:
def standardizing2(array, columns, ddof=0):
    ary_new = array.copy()
    if len(ary_new.shape) == 1:
        ary_new = ary_new[:, np.newaxis]
    ary_new[:, columns] = (ary_new[:, columns] - ary_new[:, columns].mean(axis=0)) /\
                          ary_new[:, columns].std(axis=0, ddof=ddof)
    # some more processing steps with ary_new
    return ary_new
>>> ary = np.array([[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]])
>>> standardizing2(ary, [0, 1])
array([[-1,  1],
       [ 0,  0],
       [ 0,  0],
       [ 0,  0],
       [ 0,  0],
       [ 1, -1]])
But if I assign the result to a new array, without "slicing", it works again:
def standardizing3(array, columns, ddof=0):
    ary_new = array.copy()
    if len(ary_new.shape) == 1:
        ary_new = ary_new[:, np.newaxis]
    some_ary = (ary_new[:, columns] - ary_new[:, columns].mean(axis=0)) /\
               ary_new[:, columns].std(axis=0, ddof=ddof)
    return some_ary
>>> ary = np.array([[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]])
>>> standardizing3(ary, [0, 1])
array([[-1.46385011,  1.46385011],
       [-0.87831007,  0.87831007],
       [-0.29277002,  0.29277002],
       [ 0.29277002, -0.29277002],
       [ 0.87831007, -0.87831007],
       [ 1.46385011, -1.46385011]])
Upvotes: 1
Views: 107
Reputation: 280301
When you do
ary = np.array([[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]])
you create an array of integer dtype. That means that
ary_new = array.copy()
is also an array of integer dtype. It cannot hold floating-point numbers; when you try to put floats into it:
ary_new[:, columns] = ...
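they are automatically cast to integers, with the fractional part truncated toward zero. That is why values such as -0.878 and 0.293 all show up as 0 in your output.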
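The cast can be seen in isolation with a minimal sketch. The arrays `ints` and `floats` are made-up illustration values, not taken from the question:

```python
import numpy as np

ints = np.array([1, 2, 3])               # integer dtype inferred from the literals
floats = np.array([-1.46, 0.29, 1.46])   # float values to write into it

ints[:] = floats         # slice assignment keeps the target's integer dtype
print(ints.dtype.kind)   # 'i' (still an integer array)
print(ints)              # [-1  0  1]
```

The assignment raises no error; the fractional parts are simply dropped.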
If you want an array of floats, you have to specify that when you create the copy:
ary_new = array.astype(float)
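Applied to the function from the question, the change might look like the sketch below. The name `standardizing2_fixed` and the local variable `cols` are made up for illustration. It relies on `astype(float)` returning a new array by default, so it can take the place of the separate `copy()` call:

```python
import numpy as np

def standardizing2_fixed(array, columns, ddof=0):
    # astype(float) returns a new float array, so the input is left untouched
    ary_new = array.astype(float)
    if ary_new.ndim == 1:
        ary_new = ary_new[:, np.newaxis]
    cols = ary_new[:, columns]
    # the target is now a float array, so the z-scores are stored unchanged
    ary_new[:, columns] = (cols - cols.mean(axis=0)) / cols.std(axis=0, ddof=ddof)
    return ary_new

ary = np.array([[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]])
result = standardizing2_fixed(ary, [0, 1])
print(result)
```

With this change the returned array should hold the same values as the output of `standardizing1`, and `ary` itself keeps its integer dtype.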
Upvotes: 2