Reputation: 13
I have a computer vision project in Python that separates text from figures in an image (for example, a scanned newspaper page).
My question is: what is the best way to detect those figures in the page, in Python?
Paper image example: Paper.
I haven't tried anything yet; I have no idea where to start.
Upvotes: 1
Views: 1401
Reputation: 17
import cv2
import numpy as np
# Read the image
image = cv2.imread('paper-news.png')
# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Blur the image to suppress noise before edge detection
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# Detect edges with Canny
canny = cv2.Canny(blurred, 30, 150)
# Find contours in the image
contours, hierarchy = cv2.findContours(canny.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Iterate over the contours
for contour in contours:
    # Get the rectangle bounding the contour
    x, y, w, h = cv2.boundingRect(contour)
    # Draw the rectangle
    cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)
# Show the image
cv2.imshow('Image with Figures Detected', image)
cv2.waitKey(0)
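This draws a bounding box around every external contour, so text is boxed as well as figures; filtering the boxes by size is one way to keep only the figures.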
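As a rough sketch of that size filter: the helper below keeps only boxes that cover a minimum fraction of the page. The function name `keep_figure_sized_boxes` and the `min_area_frac` default are my own assumptions for illustration, not part of OpenCV, and the threshold would need tuning on real scans.

```python
def keep_figure_sized_boxes(boxes, image_shape, min_area_frac=0.02):
    """Keep (x, y, w, h) boxes that cover at least min_area_frac of the page.

    min_area_frac is a hypothetical starting value, not a tuned constant.
    """
    page_area = image_shape[0] * image_shape[1]
    return [(x, y, w, h) for (x, y, w, h) in boxes
            if w * h >= min_area_frac * page_area]


# Made-up boxes on a 1000 x 800 page: two letter-sized boxes and one large box
boxes = [(10, 10, 12, 18), (40, 10, 11, 17), (100, 200, 400, 300)]
figure_boxes = keep_figure_sized_boxes(boxes, (1000, 800))
print(figure_boxes)
```

With the code above, the input would be `[cv2.boundingRect(c) for c in contours]` and `image.shape[:2]`. Edges inside a photo can still produce many small contours, so this is only a first filter.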
Upvotes: 0
Reputation: 2340
I found the layout-parser Python toolkit, which should be very helpful for your project.
Layout Parser is a unified toolkit for Deep Learning Based Document Image Analysis.
With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.
Check this complete notebook example on detecting newspaper layouts (separating image and text regions in a newspaper image).
It's recommended to use a Jupyter notebook on Linux or macOS, because layout-parser isn't supported on Windows; alternatively you can use Google Colab, which is what I used to run the toolkit directly.
pip install layoutparser # Install the base layoutparser library
pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit
pip install "layoutparser[ocr]" # Install OCR toolkit
Then install the detectron2 model backend dependencies:
pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"
import layoutparser as lp
import cv2
# Convert the image from BGR (cv2 default loading style)
# to RGB
image = cv2.imread("test.jpg")
image = image[..., ::-1]
# Load the deep layout model from the layoutparser API
# For all the supported model, please check the Model
# Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
model = lp.models.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.7],
                                 label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
# Detect the layout of the input image
layout = model.detect(image)
# Show the detected layout of the input image
lp.draw_box(image, layout, box_width=3)
In the result image, the text regions are drawn in orange boxes and the image regions (figures) in white boxes. It's a great deep learning toolkit for document image analysis.
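To actually separate the figures from the text, you would filter the detected blocks by label and crop them out. Here is a minimal sketch, assuming each detected block exposes `.type` and `.coordinates` as `(x1, y1, x2, y2)` the way layoutparser's blocks do. The helper name `crop_blocks_of_type` is hypothetical, and the blocks below are stand-in objects so the snippet runs without the model installed.

```python
from types import SimpleNamespace

import numpy as np


def crop_blocks_of_type(image, layout, wanted_type="ImageRegion"):
    """Return the image crops for every block whose label equals wanted_type."""
    height, width = image.shape[:2]
    crops = []
    for block in layout:
        if block.type != wanted_type:
            continue
        # Assumed coordinate order: (x1, y1, x2, y2), possibly as floats
        x1, y1, x2, y2 = (int(round(v)) for v in block.coordinates)
        # Clamp to the image bounds before slicing
        x1, y1 = max(x1, 0), max(y1, 0)
        x2, y2 = min(x2, width), min(y2, height)
        crops.append(image[y1:y2, x1:x2])
    return crops


# Stand-in data: a blank 200 x 300 RGB page and two fake detections
page = np.zeros((200, 300, 3), dtype=np.uint8)
fake_layout = [
    SimpleNamespace(type="TextRegion", coordinates=(5.0, 5.0, 290.0, 15.0)),
    SimpleNamespace(type="ImageRegion", coordinates=(10.0, 20.0, 110.0, 90.0)),
]
figure_crops = crop_blocks_of_type(page, fake_layout)
print(len(figure_crops), figure_crops[0].shape)
```

With the real model you would pass `image` and `layout` from the code above in place of the stand-ins.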
Upvotes: 1
Reputation: 29
Detect text region in image using Opencv
Detecting and counting blobs/connected objects with opencv
Upvotes: 0
Reputation: 11
You can use an image segmentation approach. Use a connected-components labelling algorithm so that all the text and images are detected as components. Components with an area larger than a chosen threshold can then be treated as images in the paper. OpenCV's connectedComponentsWithStats function returns the components together with the area of each one.
Hope this helps.
Upvotes: 1
Reputation: 3
I would get started with the OpenCV module in Python, as it has a lot of really useful tools for image recognition. I'll link it here:
https://pypi.org/project/opencv-python/
Go to the first link to download the module package, and then check out the GitHub link if you need help or run into any issues.
Upvotes: 0