Reputation: 9
I am trying to add coordinates to each pixel of an image. To do this, I am doing the following:
import cv2
import numpy as np
img = cv2.imread('images/0001.jpg')
grayscale = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
np_grayscale = np.array(grayscale)
# make the array 3d
processed_image = np_grayscale[:, :, np.newaxis]
x = 0
y = 0
for pixel_line in reversed(processed_image):
    for pixel in pixel_line:
        pixel = np.append(pixel, [x, y])
        x += 1
    y += 1
print(processed_image)
But this does not seem to work because I am still getting the original array that is in the form
[[[255]
  [255]
  [255]
  ...
  [255]
  [255]
  [255]]

 ...

 [[255]
  [255]
  [255]]]
Moreover, I don't think this is the most efficient way of doing it, because I have read that np.append creates a new copy of the array. Can someone please help?
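For reference, here is a minimal sketch (with a hypothetical 2x1 array standing in for the image) of why the loop above never changes processed_image: np.append returns a new array, and assigning it to the loop variable only rebinds that name.

```python
import numpy as np

# Hypothetical 2x1 array standing in for one line of the image.
a = np.array([[1], [2]])

for row in a:
    # np.append returns a brand-new array; assigning it to `row`
    # only rebinds the loop variable, so `a` is never modified.
    row = np.append(row, [0, 0])

print(a.shape)  # still (2, 1)
```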
Upvotes: 0
Views: 959
Reputation: 8112
Use xarray.
The easiest way, by far, to add coordinates to a NumPy array is with an xarray.DataArray. For example, here's how you can add coordinates that are just row and column indices:
import numpy as np
import xarray as xr

data = np.arange(25).reshape(5, 5)
rows, cols = data.shape
xs = np.arange(cols)
ys = np.arange(rows)

da = xr.DataArray(data, name="mydata", coords=[ys, xs], dims=['y', 'x'])
This da thing is a DataArray, which is essentially an indexed array ('indexed' in a pandas sense: the indices can be integers, floats, dates, etc.). It has some nice features (try da.plot()). Otherwise, it basically behaves like a NumPy array. For example, you can slice it like a normal array:
>>> da[:, 3:]
<xarray.DataArray 'mydata' (y: 5, x: 2)>
array([[ 3,  4],
       [ 8,  9],
       [13, 14],
       [18, 19],
       [23, 24]])
Coordinates:
  * y        (y) int64 0 1 2 3 4
  * x        (x) int64 3 4
As you can see, this subarray 'knows' its own coordinates. What's more, you can have as many dimensions as you want, and each axis can have multiple coordinates. It's very useful.
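Those coordinates also enable label-based selection. A short sketch, using the same 5x5 da built above, of picking out a column by its coordinate value rather than its position:

```python
import numpy as np
import xarray as xr

data = np.arange(25).reshape(5, 5)
da = xr.DataArray(data, name="mydata",
                  coords=[np.arange(5), np.arange(5)],
                  dims=["y", "x"])

# Select by coordinate *label* instead of positional index:
col = da.sel(x=3)
print(col.values)  # [ 3  8 13 18 23]
```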
As others pointed out, you can probably achieve what you want with careful indexing alone, but if you really want coordinates this is how I'd do it.
Upvotes: 1
Reputation: 36719
You can create a mesh of indices using meshgrid() and stack() them with the original image:
import numpy as np

x = np.asarray([
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
])
indices = np.meshgrid(
    np.arange(x.shape[0]),
    np.arange(x.shape[1]),
    sparse=False,
)
x = np.stack((x, *indices)).T
# array([[[  0,   0,   0],
#         [  0,   0,   1],
#         [  0,   0,   2],
#         [  0,   0,   3]],
#
#        [[  0,   1,   0],
#         [  0,   1,   1],
#         [255,   1,   2],
#         [  0,   1,   3]],
#
#        [[  0,   2,   0],
#         [  0,   2,   1],
#         [  0,   2,   2],
#         [  0,   2,   3]],
#
#        [[  0,   3,   0],
#         [  0,   3,   1],
#         [  0,   3,   2],
#         [  0,   3,   3]]])
x[0, 0, :] # 0, 0, 0
x[1, 2, :] # 255, 1, 2
x[-1, -1, :] # 0, 3, 3
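Applied to a grayscale image like the one in the question, the same idea can be written with meshgrid(indexing="ij") and dstack so the row/column channels line up with NumPy's (row, column) order without a transpose. A sketch, with a hypothetical 3x5 all-white array standing in for the OpenCV result:

```python
import numpy as np

# Hypothetical 3x5 stand-in for the grayscale image from the question.
grayscale = np.full((3, 5), 255, dtype=np.int64)

# indexing="ij" makes `rows`/`cols` follow array (row, column) order.
rows, cols = np.meshgrid(np.arange(grayscale.shape[0]),
                         np.arange(grayscale.shape[1]),
                         indexing="ij")

# Shape (H, W, 3): pixel value, row index, column index for each pixel.
with_coords = np.dstack((grayscale, rows, cols))
print(with_coords[2, 4])  # [255   2   4]
```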
Upvotes: 1
Reputation: 1420
As you have a grayscale image, you can convert it to a 3-channel image in which the first channel contains the pixel values and the other two channels contain the coordinates. That way, if you split your image into two or more parts, you will still have the coordinates of the original image in the other two channels, and to visualize the image you can simply use the first channel alone. Here's how you can do this:
processed_image = np.zeros((grayscale.shape[0], grayscale.shape[1], 3), dtype=np.uint64)
processed_image[:, :, 0] = np.asarray(grayscale, dtype=np.uint64)
for i in range(processed_image.shape[0]):
    for j in range(processed_image.shape[1]):
        processed_image[i][j][1] = i
        processed_image[i][j][2] = j
print(processed_image)
# Displaying the image
cv2.imshow("img", np.array(processed_image[:, :, [0]], dtype=np.uint8))
cv2.waitKey(0)
Note: Take care of the datatypes of the NumPy arrays. You cannot store the coordinates of the complete image in an np.uint8 array, as it can only hold values from 0-255. Also, while displaying, you'll have to convert the first channel back to the np.uint8 datatype, as OpenCV only understands images in this format (for integer pixel values). That is why I have used an array of np.uint64 datatype to store the pixel values.
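The two nested loops above can also be replaced by a single vectorized call. A sketch using np.indices (not from the answer, just an equivalent formulation), with a hypothetical 4x4 array standing in for the grayscale image:

```python
import numpy as np

# Hypothetical 4x4 stand-in for the grayscale image.
grayscale = np.arange(16, dtype=np.uint8).reshape(4, 4)

processed_image = np.zeros((*grayscale.shape, 3), dtype=np.uint64)
processed_image[:, :, 0] = grayscale

# np.indices produces the row index and column index of every
# pixel at once, replacing the i/j loops.
processed_image[:, :, 1], processed_image[:, :, 2] = np.indices(grayscale.shape)

print(processed_image[3, 2])  # [14  3  2]
```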
Upvotes: 0