Reputation: 8345
I've labeled an image, which gives me a NumPy array of labels, e.g.:
array([[0, 1, 0, ..., 0, 0, 0],
[0, 1, 0, ..., 0, 0, 0],
[0, 1, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[2, 2, 0, ..., 0, 0, 0],
[2, 2, 0, ..., 0, 0, 0]], dtype=uint8)
What is the most efficient way to turn this into a dataset like the following?
x-coord | y-coord | label
-------------------------
0 | 0 | 0
0 | 1 | 1
0 | 2 | 0
...
1024 | 0 | 2
1024 | 1 | 2
etc.
I don't mind what the output format is, but I expect a dictionary would be most convenient.
This is my current slow code that iterates through coordinates of the image:
data = {}
(x, y) = img.shape
for x1, x2 in np.ndindex((x, y)):
    data[(x1, x2)] = img[x1, x2]
The reason I'm doing this is that I would like to add other features to an array for each pixel.
Upvotes: 0
Views: 1028
Reputation: 221594
You could use np.meshgrid and np.vstack to create an Nx3 NumPy array in the format you describe, in a vectorized manner, like so -
In [103]: img
Out[103]:
array([[0, 1, 1, 0, 0],
[0, 1, 0, 0, 1],
[1, 1, 1, 1, 2],
[2, 1, 1, 0, 2]])
In [104]: M,N = img.shape
In [105]: Y,X = np.meshgrid(np.arange(N),np.arange(M))
In [106]: np.vstack((X,Y,img)).reshape(3,-1).T
Out[106]:
array([[0, 0, 0],
[0, 1, 1],
[0, 2, 1],
[0, 3, 0],
[0, 4, 0],
[1, 0, 0],
[1, 1, 1],
[1, 2, 0],
[1, 3, 0],
[1, 4, 1],
[2, 0, 1],
[2, 1, 1],
[2, 2, 1],
[2, 3, 1],
[2, 4, 2],
[3, 0, 2],
[3, 1, 1],
[3, 2, 1],
[3, 3, 0],
[3, 4, 2]])
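If you'd rather not build the grids by hand, np.indices should give the same row and column arrays in one call, and np.ndenumerate can produce the dictionary form mentioned in the question. Below is a minimal sketch, assuming img is a 2-D array; the names label_table and label_dict are made up here for illustration and are not part of NumPy. Note that the dictionary is still filled by a Python-level loop, so for large images the array form is likely the faster one.

```python
import numpy as np


def label_table(img):
    """Return an (M*N, 3) array of rows [row, col, label] in row-major order."""
    # np.indices yields one index grid per axis, each shaped like img.
    rows, cols = np.indices(img.shape)
    return np.column_stack((rows.ravel(), cols.ravel(), img.ravel()))


def label_dict(img):
    """Return {(row, col): label}, the dictionary form asked about."""
    # np.ndenumerate iterates over ((row, col), value) pairs.
    return {idx: int(label) for idx, label in np.ndenumerate(img)}


# Same sample image as in the session above.
img = np.array([[0, 1, 1, 0, 0],
                [0, 1, 0, 0, 1],
                [1, 1, 1, 1, 2],
                [2, 1, 1, 0, 2]], dtype=np.uint8)

table = label_table(img)
lookup = label_dict(img)
```

Since the stated goal is to attach more features per pixel, the array form may be the more convenient one: an extra feature computed over the image can be raveled and appended as a further column with np.column_stack.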
Upvotes: 1