arkadiy

Reputation: 766

Concatenating two NumPy arrays gives "ValueError: all the input arrays must have same number of dimensions"

header output:

array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 
      'sample_ID','cortisol_value', 'Group'], dtype='<U14')

body output:

array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
       ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)

testing = np.concatenate((header, body), axis=0)

ValueError                                Traceback (most recent call last)
<ipython-input-302-efb002602b4b> in <module>()
      1 # Merge names and the rest of the data in np array
      2 
----> 3 testing = np.concatenate((header, body), axis=0)

ValueError: all the input arrays must have same number of dimensions

Could someone help troubleshoot this? I have tried other functions to merge the two (including stack) and get the same error, even though both arrays appear to have the same number of columns.

Upvotes: 2

Views: 2153

Answers (3)

jpp

Reputation: 164623

You need to align array dimensions first. You are currently trying to combine 1-dimensional and 2-dimensional arrays. After alignment, you can use numpy.vstack.

Note that np.array([A]).shape is (1, 7), while B.shape is (2, 7). A more efficient alternative is A[None, :], which adds the leading axis as a view instead of copying A.

Also note that the result will have dtype object, since it has to hold mixed types.

A = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 
              'sample_ID','cortisol_value', 'Group'], dtype='<U14')

B = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
              ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)

res = np.vstack((np.array([A]), B))

print(repr(res))

array([['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 'sample_ID',
        'cortisol_value', 'Group'],
       ['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
       ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)
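The A[None, :] alternative mentioned above can be sketched as follows. This is only an illustration using the arrays from the question; the name A_row is introduced here and is not part of the original answer.

```python
import numpy as np

A = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
              'sample_ID', 'cortisol_value', 'Group'], dtype='<U14')

B = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
              ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)

# A_row is a hypothetical name: indexing with None adds a leading axis,
# giving a (1, 7) view of A rather than a copy.
A_row = A[None, :]

res = np.vstack((A_row, B))

print(A_row.shape)                 # (1, 7)
print(np.shares_memory(A, A_row))  # True, because A_row is a view of A
print(res.shape, res.dtype)        # (3, 7) object
```

The stacked result is still a new array; only the intermediate 2D header avoids a copy.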

Upvotes: 0

kmario23

Reputation: 61325

You're right to use numpy.concatenate(), but you have to promote the first array to 2D before concatenating. Here's a simple example:

In [1]: import numpy as np

In [2]: arr1 = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 
   ...:       'sample_ID','cortisol_value', 'Group'], dtype='<U14')
   ...:       

In [3]: arr2 = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
   ...:        ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)
   ...:        

In [4]: arr1.shape
Out[4]: (7,)

In [5]: arr2.shape
Out[5]: (2, 7)

In [8]: concatenated = np.concatenate((arr1[None, :], arr2), axis=0)

In [9]: concatenated.shape
Out[9]: (3, 7)

And the resultant concatenated array would look like:

In [10]: concatenated
Out[10]: 
array([['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 'sample_ID',
        'cortisol_value', 'Group'],
       ['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
       ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)

Explanation:

You were getting the ValueError because one of the arrays is 1D while the other is 2D, and numpy.concatenate expects all input arrays to have the same number of dimensions. That's why we promoted arr1 to 2D by indexing it with None. You can also use numpy.newaxis in place of None.
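As a small sketch of the explanation above, the snippet below first reproduces the failure and then repeats the concatenation with numpy.newaxis. The exact wording of the error message may differ between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

arr1 = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
                 'sample_ID', 'cortisol_value', 'Group'], dtype='<U14')

arr2 = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
                 ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)

# Mixing a 1D and a 2D array fails, as in the question.
try:
    np.concatenate((arr1, arr2), axis=0)
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)

# numpy.newaxis is just an alias for None, so both spellings behave the same.
print(np.newaxis is None)

concatenated = np.concatenate((arr1[np.newaxis, :], arr2), axis=0)
print(concatenated.shape)  # (3, 7)
```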

Upvotes: 3

JoshuaF

Reputation: 1216

Look at numpy.vstack and numpy.hstack, as well as the axis argument of np.append. Here it looks like you want vstack (i.e. the output array will have 3 rows, each with the same number of columns). You can also use numpy.reshape to change the shape of the input arrays so that they can be concatenated.
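A minimal sketch of these options, using the arrays from the question (the names header_2d, via_vstack and via_append are made up for illustration). Note that np.vstack promotes 1D inputs to 2D itself, whereas np.append with an axis argument needs the header reshaped first.

```python
import numpy as np

header = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
                   'sample_ID', 'cortisol_value', 'Group'], dtype='<U14')

body = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
                 ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)

# vstack treats the 1D header as a single row of shape (1, 7).
via_vstack = np.vstack((header, body))

# reshape(1, -1) makes the header an explicit (1, 7) row; -1 lets NumPy
# infer the number of columns.
header_2d = header.reshape(1, -1)

# With axis=0, np.append requires both inputs to have the same number
# of dimensions, so it is given the reshaped header.
via_append = np.append(header_2d, body, axis=0)

print(via_vstack.shape)  # (3, 7)
print(via_append.shape)  # (3, 7)
```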

Upvotes: 0
