Reputation: 349
I decided to get a little more into numpy, but my original data comes from a dataframe. Let's say for now this is the dataframe:
df = pd.DataFrame({
'col1': [101, 200, 306, 402, 500, 600],
'col2': [100, 200, 300, 400, 500, 600]})
I want to perform some basic column-based calculations and save them in the same order inside a numpy array, so I turn the result into a 2D numpy array like this:
arr = np.array(df['col1'] - df['col2']).reshape((-1, 1))
# out
[[1]
[0]
[6]
[2]
[0]
[0]]
But then let's say my dataframe updates: the values of col1 become col2, the new values are added to col1, and zeroes are added to col2 if the value didn't exist before:
df = pd.DataFrame({
'col1': [103, 220, 316, 406, 501, 606, 348],
'col2': [101, 200, 306, 402, 500, 600, 0]})
So now instead of 6 values I have 7, which is where the complications start, since I want to calculate this difference as well and append it, in the same order, to the array as a new column. So I tried this:
arr1 = np.array(df['col1'] - df['col2']).reshape((-1, 1))
arr = np.append(arr, np.zeros((len(arr1 - arr), arr.shape[0])), axis=1)
This was meant to fill the missing values and allow for concatenation of both arrays, but it throws:
ValueError: operands could not be broadcast together with shapes (7,1) (6,1)
I appreciate any help!
Full code and expected output
df = pd.DataFrame({
'col1': [101, 200, 306, 402, 500, 600],
'col2': [100, 200, 300, 400, 500, 600]})
arr = np.array(df['col1'] - df['col2']).reshape((-1, 1))
df = pd.DataFrame({
'col1': [103, 220, 316, 406, 501, 606, 348],
'col2': [101, 200, 306, 402, 500, 600, 0]})
arr1 = np.array(df['col1'] - df['col2']).reshape((-1, 1))
arr = np.append(arr, np.zeros((len(arr1 - arr), arr.shape[1])), axis=0)
arr = np.concatenate((arr, arr1), axis=1)
##EXPECTED##
[[1 2]
[0 20]
[6 10]
[2 4]
[0 1]
[0 6]
[0 348]]
Upvotes: 0
Views: 299
Reputation: 19332
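Try this instead of the np.append: create np.zeros((difference in shape[0], arr.shape[1])), then np.vstack the arr and the zeros. Passing dtype=arr.dtype keeps the result integer, since np.zeros defaults to float64.
arr = np.vstack([arr, np.zeros((arr1.shape[0] - arr.shape[0], arr.shape[1]), dtype=arr.dtype)]) #<--------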
arr = np.concatenate((arr, arr1), axis=1)
print(arr)
# [[1 2]
# [0 20]
# [6 10]
# [2 4]
# [0 1]
# [0 6]
# [0 348]]
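As for why the original attempt failed: len(arr1 - arr) evaluates the element-wise subtraction arr1 - arr first, and shapes (7,1) and (6,1) cannot be broadcast against each other, so the error is raised before len is ever called. Subtracting the row counts (arr1.shape[0] - arr.shape[0]) avoids that. If you prefer, the same padding can be sketched with np.pad, which keeps the integer dtype. This is only a minimal sketch: pad_rows is a hypothetical helper name, df_old and df_new are names I made up for the two versions of the dataframe, and it assumes the new array is at least as long as the old one.

```python
import numpy as np
import pandas as pd


def pad_rows(a, n_rows):
    """Append zero rows to a 2-D array until it has n_rows rows (hypothetical helper)."""
    missing = n_rows - a.shape[0]  # integer row difference, not an array subtraction
    # np.pad keeps the input dtype, so integer data stays integer
    return np.pad(a, ((0, missing), (0, 0)), mode="constant", constant_values=0)


# df_old / df_new are illustrative names for the dataframe before and after the update
df_old = pd.DataFrame({
    'col1': [101, 200, 306, 402, 500, 600],
    'col2': [100, 200, 300, 400, 500, 600]})
df_new = pd.DataFrame({
    'col1': [103, 220, 316, 406, 501, 606, 348],
    'col2': [101, 200, 306, 402, 500, 600, 0]})

arr = (df_old['col1'] - df_old['col2']).to_numpy().reshape(-1, 1)
arr1 = (df_new['col1'] - df_new['col2']).to_numpy().reshape(-1, 1)

# Pad the shorter (old) column with zero rows, then place the columns side by side
result = np.concatenate((pad_rows(arr, arr1.shape[0]), arr1), axis=1)
print(result)
```

Note that np.pad raises a ValueError for a negative pad width, so if the old array could ever be longer than the new one you would need to handle that case separately.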
Upvotes: 1