Alasdair

Reputation: 14113

PHP - Really fast way of detecting white space around an image?

I need a really fast method of detecting white space around an image. Specifically, I need the coordinates of where the first non-white pixel begins on each side (left, top, right, bottom).

ImageMagick is too slow, and so is looping through each pixel along each side with GD and checking whether it's white. I've got to do about 500,000,000 images, so every microsecond makes a difference.

BTW, the image is black & white only.

If there is an external app that can do this, which I can exec with PHP, then that would be fine.

Upvotes: 4

Views: 884

Answers (2)

KillerX

Reputation: 1466

One algorithm that could work if you have a mostly continuous border of darker pixels. For the left side:

  1. Take the middle pixel of the left edge and check the pixels to its right until you find a black one.
  2. From there, move up/down until you hit a black pixel.
  3. When you find a black pixel, move to the left until you hit a white one.
  4. Repeat steps 2 and 3 until you end up at the top/bottom.

That of course will not work if there are gaps, like in text.

Upvotes: 1

Sam Holder

Reputation: 32936

Is there any extra information you know about the images that can be used to help?

For example, do the images start white, then go black, then stay black? Or can any pixel be white or black, so that the colour of one pixel tells you nothing about the others?

If any pixel can be white or black regardless of the other pixels, then I don't see how you can do much better than checking each pixel in a loop until you find the first non-white one...
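
A minimal sketch of that straightforward loop, written in Python for brevity (the same logic should port to PHP with GD or to a small external helper). It assumes the image has already been decoded into a list of rows of grey values where 255 means white; `find_bounds` and `WHITE` are names made up for this illustration, not part of any library.

```python
WHITE = 255  # assumed encoding: 255 = white, anything else = non-white


def find_bounds(pixels):
    """Return (left, top, right, bottom) of the non-white area, or None if blank."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0

    def row_has_ink(y):
        # any() stops at the first non-white pixel in the row
        return any(p != WHITE for p in pixels[y])

    def col_has_ink(x, top, bottom):
        # only rows between top and bottom can contain non-white pixels
        return any(pixels[y][x] != WHITE for y in range(top, bottom + 1))

    # Scan rows inwards from the top edge, then from the bottom edge.
    top = next((y for y in range(height) if row_has_ink(y)), None)
    if top is None:
        return None  # the whole image is white
    bottom = next(y for y in range(height - 1, -1, -1) if row_has_ink(y))

    # Scan columns inwards from the left edge, then from the right edge.
    left = next(x for x in range(width) if col_has_ink(x, top, bottom))
    right = next(x for x in range(width - 1, -1, -1) if col_has_ink(x, top, bottom))
    return left, top, right, bottom
```

Each scan stops as soon as it meets a non-white pixel, so the cost is roughly the size of the white margin plus one pass over the rows and columns that bound the content.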

If you know that when the fifth pixel from the left is white, pixels 0-4 are definitely white as well, then you might be able to check fewer pixels by using some sort of modified binary search. In this case you could skip checking 0-4 and just check, say, 5 and then 10. If 5 is white and 10 is black, you know the point is somewhere between 5 and 10, so you can split the difference and check 7, and so on until you find the point at which they change.
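
A sketch of that binary-search idea for a single row, in Python. It only gives the right answer under the assumption described above: every pixel before the edge is white and every pixel from the edge onwards is non-white. The function name and the 255 = white encoding are assumptions made for the example.

```python
WHITE = 255  # assumed encoding: 255 = white, anything else = non-white


def first_non_white(row):
    """Index of the first non-white pixel, or len(row) if the row is all white.

    Only valid when the row is white up to the edge and non-white after it.
    """
    lo, hi = 0, len(row)
    while lo < hi:
        mid = (lo + hi) // 2
        if row[mid] == WHITE:
            lo = mid + 1  # the edge must be to the right of mid
        else:
            hi = mid      # the edge is at mid or to its left
    return lo
```

This inspects about log2(width) pixels per row instead of up to width, but on a row with gaps (text, for instance) it can return a position that is not the first non-white pixel.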

I think you might end up with a trade-off between speed and accuracy here. The most accurate way is to slice through each column and row, starting at the extremities, checking each pixel. Once you find a hit in a column you have found the edge on one side. This could be done in parallel, as each check is independent.

You might be able to speed this up by, as you said, checking only every nth pixel, but this is likely to come a cropper occasionally, especially with such a large data set. This may or may not be acceptable. You may be able to improve on it by checking around the area where you find a match, to confirm that the match is accurate. So if you check every 3rd pixel and you find a hit at pixel 15, then check 14 to see if it is a hit (and 13 if 14 is). Using this you might be able to get away with fewer checks.
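
A rough sketch of the "every nth pixel, then check backwards" idea for one line of pixels, in Python. The function name, the default `step=3` and the 255 = white encoding are assumptions for the example.

```python
WHITE = 255  # assumed encoding: 255 = white, anything else = non-white


def first_non_white_strided(line, step=3):
    """Sample every `step`-th pixel; on a hit, walk back to where the run starts.

    Returns the index found, or None if no sampled pixel was non-white.
    """
    for i in range(0, len(line), step):
        if line[i] != WHITE:
            start = i
            # Check the skipped pixels just before the hit (e.g. 14, then 13).
            while start > 0 and line[start - 1] != WHITE:
                start -= 1
            return start
    return None
```

With a hit at pixel 15 and a step of 3, this checks 14 and 13 and stops at 12, which was already sampled as white. The inaccuracy is still there: a non-white run narrower than `step` that falls between samples is missed entirely, so `first_non_white_strided([255, 0, 255, 255])` returns `None`.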

Upvotes: 3
