Reputation: 9
I am trying to add coordinates to each pixel of an image. To do this, I am doing the following:
import cv2
import numpy as np
img = cv2.imread('images/0001.jpg')
grayscale = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
np_grayscale = np.array(grayscale)
# make the array 3d
processed_image = np_grayscale[:, :, np.newaxis]
x = 0
y = 0
for pixel_line in reversed(processed_image):
    for pixel in pixel_line:
        pixel = np.append(pixel, [x, y])
        x += 1
    y += 1
print(processed_image)
But this does not seem to work because I am still getting the original array that is in the form
[[[255]
  [255]
  [255]
  ...
  [255]
  [255]
  [255]]

 ...

 [[255]
  [255]
  [255]]]
Moreover, I don't think this is the most efficient way of doing it, because I have read that np.append creates a new copy of the array. Can someone please help?
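For reference, here is a minimal sketch (with a hypothetical 2x1 array standing in for the image) of why the loop above never changes processed_image: np.append returns a new array, and assigning it to the loop variable only rebinds that name.

```python
import numpy as np

# Hypothetical 2x1 array standing in for one line of the image.
a = np.array([[1], [2]])

for row in a:
    # np.append returns a brand-new array; assigning it to `row`
    # only rebinds the loop variable, so `a` is never modified.
    row = np.append(row, [0, 0])

print(a.shape)  # still (2, 1)
```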
Upvotes: 0
Views: 959
Reputation: 8112
Use xarray.
The easiest way, by far, to add coordinates to a NumPy array is with an xarray.DataArray. For example, here's how you can add coordinates that are just row and column indices:
import numpy as np
import xarray as xr

data = np.arange(25).reshape(5, 5)
rows, cols = data.shape
xs = np.arange(cols)
ys = np.arange(rows)

da = xr.DataArray(data, name="mydata", coords=[ys, xs], dims=['y', 'x'])
This da thing is a DataArray, which is essentially an indexed array ('indexed' in a pandas sense: the indices can be integers, floats, dates, etc.). It has some nice features (try da.plot()). Otherwise, it basically behaves like a NumPy array. For example, you can slice it like a normal array:
>>> da[:, 3:]
<xarray.DataArray 'mydata' (y: 5, x: 2)>
array([[ 3,  4],
       [ 8,  9],
       [13, 14],
       [18, 19],
       [23, 24]])
Coordinates:
  * y        (y) int64 0 1 2 3 4
  * x        (x) int64 3 4
As you can see, this subarray 'knows' its own coordinates. What's more, you can have as many dimensions as you want, and each axis can have multiple coordinates. It's very useful.
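Those coordinates also enable label-based selection. A short sketch, using the same 5x5 da built above, of picking out a column by its coordinate value rather than its position:

```python
import numpy as np
import xarray as xr

data = np.arange(25).reshape(5, 5)
da = xr.DataArray(data, name="mydata",
                  coords=[np.arange(5), np.arange(5)],
                  dims=["y", "x"])

# Select by coordinate *label* instead of positional index:
col = da.sel(x=3)
print(col.values)  # [ 3  8 13 18 23]
```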
As others pointed out, you can probably achieve what you want with careful indexing alone, but if you really want coordinates this is how I'd do it.
Upvotes: 1
Reputation: 36719
You can create a mesh of indices using meshgrid() and stack() them with the original image:
import numpy as np

x = np.asarray([
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
])
indices = np.meshgrid(
    np.arange(x.shape[0]),
    np.arange(x.shape[1]),
    sparse=False,
)
x = np.stack((x, *indices)).T
# array([[[  0,   0,   0],
#         [  0,   0,   1],
#         [  0,   0,   2],
#         [  0,   0,   3]],
#
#        [[  0,   1,   0],
#         [  0,   1,   1],
#         [255,   1,   2],
#         [  0,   1,   3]],
#
#        [[  0,   2,   0],
#         [  0,   2,   1],
#         [  0,   2,   2],
#         [  0,   2,   3]],
#
#        [[  0,   3,   0],
#         [  0,   3,   1],
#         [  0,   3,   2],
#         [  0,   3,   3]]])
x[0, 0, :] # 0, 0, 0
x[1, 2, :] # 255, 1, 2
x[-1, -1, :] # 0, 3, 3
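Applied to a grayscale image like the one in the question, the same idea can be written with meshgrid(indexing="ij") and dstack so the row/column channels line up with NumPy's (row, column) order without a transpose. A sketch, with a hypothetical 3x5 all-white array standing in for the OpenCV result:

```python
import numpy as np

# Hypothetical 3x5 stand-in for the grayscale image from the question.
grayscale = np.full((3, 5), 255, dtype=np.int64)

# indexing="ij" makes `rows`/`cols` follow array (row, column) order.
rows, cols = np.meshgrid(np.arange(grayscale.shape[0]),
                         np.arange(grayscale.shape[1]),
                         indexing="ij")

# Shape (H, W, 3): pixel value, row index, column index for each pixel.
with_coords = np.dstack((grayscale, rows, cols))
print(with_coords[2, 4])  # [255   2   4]
```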
Upvotes: 1
Reputation: 1420
As you have a grayscale image, you can convert it to a 3-channel image in which the first channel contains the pixel values and the other two channels contain the coordinates. That way, if you split your image into two or more parts, you will still have the coordinates of the original image in the other two channels, and to visualize the image you can simply use the first channel alone. Here's how you can do this:
processed_image = np.zeros((grayscale.shape[0], grayscale.shape[1], 3), dtype=np.uint64)
processed_image[:, :, 0] = np.asarray(grayscale, dtype=np.uint64)
for i in range(processed_image.shape[0]):
    for j in range(processed_image.shape[1]):
        processed_image[i][j][1] = i
        processed_image[i][j][2] = j
print(processed_image)
# Displaying the image
cv2.imshow("img", np.array(processed_image[:, :, [0]], dtype=np.uint8))
cv2.waitKey(0)
Note: Take care of the datatypes of the NumPy arrays. You cannot store the coordinates of the complete image in an np.uint8 array, as it can only hold values from 0-255. Also, while displaying, you'll have to convert the first channel back to the np.uint8 datatype, as OpenCV only understands images in this format (for integer pixel values). That is why I have used an array of np.uint64 datatype to store the pixel values.
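The two nested loops above can also be replaced by a single vectorized call. A sketch using np.indices (not from the answer, just an equivalent formulation), with a hypothetical 4x4 array standing in for the grayscale image:

```python
import numpy as np

# Hypothetical 4x4 stand-in for the grayscale image.
grayscale = np.arange(16, dtype=np.uint8).reshape(4, 4)

processed_image = np.zeros((*grayscale.shape, 3), dtype=np.uint64)
processed_image[:, :, 0] = grayscale

# np.indices produces the row index and column index of every
# pixel at once, replacing the i/j loops.
processed_image[:, :, 1], processed_image[:, :, 2] = np.indices(grayscale.shape)

print(processed_image[3, 2])  # [14  3  2]
```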
Upvotes: 0