Compare the data in an image and a file

Question

I have an image file with a graph in it. For instance: USA map

How can I efficiently extract this graph? Could you point me to some examples or concepts?
I also have a data file containing the edge details. I could plot them on a map to create an image (like the one shown in the link), but with a different background. How can I compare the two and tell whether they match or not?

Any help, comments, or thoughts are much appreciated.

mmgp · Accepted Answer

Since this can hardly be replicated for other images, this is only an overview of something that works for this specific image.

First convert your image to CMYK and binarize the third channel ('Y') at 80% of the maximum value (your vertices are relatively yellow) to find the vertices:

[image: binarized 'Y' channel showing the vertices]

Now binarize the second channel ('M'), again at 80% of the maximum value, to find the edges:

[image: binarized 'M' channel showing the edges]
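For readers not using Mathematica, the two thresholding steps can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the naive RGB-to-CMYK formula (a real converter, including the one behind `ColorSeparate`, may use a different color model), it treats the image as nested lists of `(r, g, b)` tuples with values in [0, 1], and the helper names `rgb_to_cmyk` and `binarize_channel` are made up for this example.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion for channel values in [0, 1]."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        # Pure black: no chromatic ink at all.
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


def binarize_channel(image, channel, threshold=0.8):
    """Return a 0/1 mask: 1 where the chosen CMYK channel exceeds threshold.

    image   -- rows of (r, g, b) tuples with values in [0, 1]
    channel -- 0 for C, 1 for M, 2 for Y, 3 for K
    """
    return [
        [1 if rgb_to_cmyk(*pixel)[channel] > threshold else 0 for pixel in row]
        for row in image
    ]


# Tiny made-up image: yellow, white / magenta, black.
image = [
    [(1, 1, 0), (1, 1, 1)],
    [(1, 0, 1), (0, 0, 0)],
]
vertex_mask = binarize_channel(image, 2)  # 'Y' channel
edge_mask = binarize_channel(image, 1)    # 'M' channel
```

The threshold of 0.8 mirrors the value used in the answer; for another image it would almost certainly need tuning.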

Now, if you treat each vertex as a connected component and each edge as a connected component too, you can construct a graph simply by considering both images at the same time and checking which edges a given vertex touches.
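A minimal Python sketch of that idea follows. It assumes the two binarized images are available as 0/1 grids of the same size, labels 8-connected components with a breadth-first flood fill, and links two vertices whenever one edge component touches exactly both of them. The function names are hypothetical, and the sketch does not handle edges that cross each other (they would merge into one component touching more than two vertices and be skipped).

```python
from collections import deque

NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def label_components(mask):
    """Label 8-connected foreground regions of a 0/1 grid.

    Returns (labels, count); labels holds 0 for background, 1..count otherwise.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                count += 1
                labels[sy][sx] = count
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def build_graph(vertex_mask, edge_mask):
    """Return a set of frozenset({u, v}) pairs of vertex labels joined by an edge."""
    vlab, _ = label_components(vertex_mask)
    elab, ecount = label_components(edge_mask)
    h, w = len(vlab), len(vlab[0])
    touching = {e: set() for e in range(1, ecount + 1)}
    for y in range(h):
        for x in range(w):
            e = elab[y][x]
            if not e:
                continue
            # Record every vertex component adjacent to this edge pixel.
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and vlab[ny][nx]:
                    touching[e].add(vlab[ny][nx])
    # Keep only edge components that connect exactly two vertices.
    return {frozenset(v) for v in touching.values() if len(v) == 2}


# Made-up one-row example: three vertices joined in a chain by two edges.
vertex_mask = [[1, 0, 0, 1, 0, 0, 1]]
edge_mask = [[0, 1, 1, 0, 1, 1, 0]]
edges = build_graph(vertex_mask, edge_mask)
```

On a real image the vertex mask may need a small dilation first, so that edges ending a pixel or two short of a vertex still count as touching it.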

You can now convert your input image to grayscale in order to find the text. In this simple image, some ad-hoc connected component analysis and a simple threshold will give all the text:

[image: the text extracted from the grayscale image]

If I run some pretty basic text recognition on this last image, I get:

Seattle
Chicago
Bay Area DC Metro NYC
Denver
Los
Angeles
Phoenix

This is pretty nice, since it found all the text. All that is left is giving names to the vertices in the graph you have already built. To do that, take the position of each block of text and find the closest vertex (in the first image, using just the Euclidean distance to the component centroids).
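The naming step, and the match/mismatch check the question asks about, could look roughly like this in Python. Everything here is an assumption made for illustration: the centroid coordinates are invented, the data file is assumed to reduce to a list of `(name, name)` pairs, and `name_vertices` and `compare_edges` are hypothetical helpers rather than part of any library.

```python
import math


def name_vertices(vertex_centroids, text_blocks):
    """Assign each text block to its nearest vertex by Euclidean distance.

    vertex_centroids -- {vertex_label: (row, col)}
    text_blocks      -- {name: (row, col)} position of each recognized label
    Returns {vertex_label: name}.
    """
    names = {}
    for name, pos in text_blocks.items():
        nearest = min(
            vertex_centroids,
            key=lambda v: math.dist(vertex_centroids[v], pos),
        )
        names[nearest] = name
    return names


def compare_edges(image_edges, names, file_edges):
    """Compare edges recovered from the image with edges listed in the file.

    Edges are undirected, so both sides are reduced to sets of frozensets.
    Image edges with an unnamed endpoint are ignored.
    """
    named = {
        frozenset(names[v] for v in edge)
        for edge in image_edges
        if all(v in names for v in edge)
    }
    expected = {frozenset(edge) for edge in file_edges}
    return {
        "missing_from_image": expected - named,
        "extra_in_image": named - expected,
    }


# Invented coordinates and edges, for illustration only.
vertex_centroids = {1: (10, 10), 2: (10, 50), 3: (40, 30)}
text_blocks = {"Seattle": (8, 12), "Chicago": (12, 48), "Denver": (42, 33)}
names = name_vertices(vertex_centroids, text_blocks)

image_edges = {frozenset({1, 2}), frozenset({2, 3})}
file_edges = [("Seattle", "Chicago"), ("Seattle", "Denver")]
report = compare_edges(image_edges, names, file_edges)
```

The two graphs match when both sets in the report are empty. Note that a label split across several text blocks (such as "Los" and "Angeles" in the output above) would have to be merged before this step, otherwise the later block overwrites the earlier one.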

If it matters, here is the Mathematica code used to obtain the images and the recognized text shown above:

(* Load the image and split it into its C, M, Y, K channels *)
f = Import["https://i.sstatic.net/DP3la.png"]
cmyk = ColorSeparate[f, "CMYK"]
vertex = Binarize[cmyk[[3]], 0.8]   (* The first image: 'Y' channel, vertices *)
edge = Binarize[cmyk[[2]], 0.8]     (* The second image: 'M' channel, edges *)
(* Extra text components picked up with a very low grayscale threshold *)
nyctext = SelectComponents[
  DeleteSmallComponents[
   SelectComponents[Binarize[ColorConvert[f, "Grayscale"], 0.01], 
    Small]], "Length", #1 < 25 &]
(* Combine them with the small dark components found at a 0.5 threshold *)
alltext = ImageAdd[
  SelectComponents[
   ColorNegate[Binarize[ColorConvert[f, "Grayscale"], 0.5]], Small], 
    nyctext]                        (* The last image *)
TextRecognize[alltext]              (* The text recognized *)
