Reputation: 4083
I would like to correct the values in hyperspectral readings from a camera using the formula described over here:
the dark reference is subtracted from the captured data, and the result is divided by the white reference minus the dark reference.
In the original example the task is rather simple: the white and dark references have the same shape as the main data, so the formula is executed as:
corrected_nparr = np.divide(np.subtract(data_nparr, dark_nparr),
np.subtract(white_nparr, dark_nparr))
However, in my case the main data is much larger than the references. The shapes are as follows:
$ white_nparr.shape, dark_nparr.shape, data_nparr.shape
((100, 640, 224), (100, 640, 224), (4300, 640, 224))
That's why I repeat the reference arrays:
white_nparr_rep = white_nparr.repeat(43, axis=0)
dark_nparr_rep = dark_nparr.repeat(43, axis=0)
return np.divide(np.subtract(data_nparr, dark_nparr_rep), np.subtract(white_nparr_rep, dark_nparr_rep))
And it works almost perfectly, as can be seen in the image on the left. But this approach requires an enormous amount of memory, so I decided to traverse the large array and replace the original values with corrected ones on the go instead:
ref_scale = dark_nparr.shape[0]
data_scale = data_nparr.shape[0]
for i in range(int(data_scale / ref_scale)):
    data_nparr[i*ref_scale:(i+1)*ref_scale] = np.divide(
        np.subtract(data_nparr[i*ref_scale:(i+1)*ref_scale], dark_nparr),
        np.subtract(white_nparr, dark_nparr)
    )
But that traversal approach gives me the ugliest of results, as can be seen on the right. I'd appreciate any ideas that would help me fix this.
Note: I apply 20-times co-adding (mean of 20 readings) to obtain the images below.
EDIT: The dtype of each array is as follows:
$ white_nparr.dtype, dark_nparr.dtype, data_nparr.dtype
(dtype('float32'), dtype('float32'), dtype('float32'))
Upvotes: 2
Views: 434
Reputation: 114921
Your two methods don't agree because in the first method you used
white_nparr_rep = white_nparr.repeat(43, axis=0)
but the second method corresponds to using
white_nparr_rep = np.tile(white_nparr, (43, 1, 1))
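If the first method is correct, you'll have to adjust the second method to act accordingly. With repeat, each reference row is paired with 43 consecutive data rows, so the loop should step through the data in blocks of 43 rows, one block per reference row. Perhaps
n_rep = data_scale // ref_scale  # 43 data rows per reference row
for i in range(ref_scale):
    data_nparr[i*n_rep:(i+1)*n_rep] = np.divide(
        np.subtract(data_nparr[i*n_rep:(i+1)*n_rep], dark_nparr[i]),
        np.subtract(white_nparr[i], dark_nparr[i])
    )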
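The same pairing can also be done without a Python loop. The following is only a sketch, assuming the repeat-based result is the one you want, that the data length is an exact multiple of the reference length, and that the data array is C-contiguous. The function name `correct_in_place` is made up for illustration.

```python
import numpy as np

def correct_in_place(data, white, dark):
    """Hypothetical helper: apply (data - dark) / (white - dark) in place,
    pairing rows the way np.repeat(ref, n_rep, axis=0) would."""
    n_ref = dark.shape[0]
    n_rep = data.shape[0] // n_ref  # data rows per reference row
    if n_ref * n_rep != data.shape[0]:
        raise ValueError("data length must be a multiple of the reference length")
    if not data.flags.c_contiguous:
        raise ValueError("data must be C-contiguous so that reshape returns a view")
    # Shape (n_ref, n_rep, ...): a view on the same memory, not a copy.
    view = data.reshape(n_ref, n_rep, *data.shape[1:])
    view -= dark[:, None]            # broadcast each reference row over its n_rep data rows
    view /= (white - dark)[:, None]  # only a reference-sized temporary is allocated
    return data
```

Because the arithmetic is written through a view, the only extra allocation is the reference-sized `white - dark`, instead of two repeated arrays as large as the data.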
A simple example with 2-d arrays that shows the difference between repeat and tile:
In [146]: z
Out[146]:
array([[ 1, 2, 3, 4, 5],
[11, 12, 13, 14, 15]])
In [147]: np.repeat(z, 3, axis=0)
Out[147]:
array([[ 1, 2, 3, 4, 5],
[ 1, 2, 3, 4, 5],
[ 1, 2, 3, 4, 5],
[11, 12, 13, 14, 15],
[11, 12, 13, 14, 15],
[11, 12, 13, 14, 15]])
In [148]: np.tile(z, (3, 1))
Out[148]:
array([[ 1, 2, 3, 4, 5],
[11, 12, 13, 14, 15],
[ 1, 2, 3, 4, 5],
[11, 12, 13, 14, 15],
[ 1, 2, 3, 4, 5],
[11, 12, 13, 14, 15]])
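The difference can also be stated as an index rule: output row j of repeat comes from source row j // n, while output row j of tile comes from source row j % len(z). A short check of that rule, reusing the z from the example above:

```python
import numpy as np

z = np.array([[1, 2, 3, 4, 5],
              [11, 12, 13, 14, 15]])
n = 3
rows = np.arange(n * len(z))  # output row indices 0..5

repeated = np.repeat(z, n, axis=0)
tiled = np.tile(z, (n, 1))

# repeat: each source row appears n times in a row, so output row j is z[j // n]
assert np.array_equal(repeated, z[rows // n])
# tile: the whole array is stacked n times, so output row j is z[j % len(z)]
assert np.array_equal(tiled, z[rows % len(z)])
```

The loop in the question pairs data row j with reference row j % 100, which is the tile pattern, whereas repeat(43, axis=0) pairs it with reference row j // 43.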
Off topic postscript: I don't know why the author of the page that you linked to writes NumPy expressions as (for example):
corrected_nparr = np.divide(
np.subtract(data_nparr, dark_nparr),
np.subtract(white_nparr, dark_nparr))
NumPy allows you to write that as
corrected_nparr = (data_nparr - dark_nparr) / (white_nparr - dark_nparr)
which looks much nicer to me.
Upvotes: 3