Cross Validated Asked on November 21, 2021
I want to develop a model that crops the equations out of maths questions, because people like me struggle a lot doing this manually for research purposes. Is this possible? And if so, which of the available object recognition models will produce the best results on text images?
The options I know of are TensorFlow's object recognition API, RCNN, Fast RCNN, Faster RCNN, and YOLO (v1 to v5).
If there are any others, please suggest them. What I want to do is detect the gray areas containing equations in this image.
Note: The grey region shown in the image is just for demonstration. My actual images are simple cropped questions from books with a white background and black letters (most of the books).
Note that there are two problems in this case: segmentation and classification. A neural net might be a solution for both steps, because you can easily generate very large numbers of labelled images. Nevertheless, a classic approach should yield comparable results with much less effort.
Out of curiosity, I tried out step one (segmentation) with the Python library Gamera (gamera.sf.net), using the following code:
from gamera.core import *
init_gamera()
img = load_image("MathExpressionInputExample.png")
img = img.to_onebit()
img.remove_border()
segments = img.runlength_smearing()
# now you could process each segment (e.g. save it to a file)
for seg in segments:
    pass  # do some stuff
# visualize the result
color_ccs = img.graph_color_ccs(segments)
color_ccs.save_PNG("segments.png")
The result looks reasonable to me (note that the colors only indicate the segmentation, with adjacent segments having different colors):
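For readers without Gamera installed, the idea behind run-length smearing can be illustrated in plain Python. This is only a simplified sketch, not Gamera's implementation: it smears in the horizontal direction only, and the names `smear_row`, `smear_horizontal`, `bounding_boxes` and the `max_gap` threshold are made up for this illustration. Short white gaps between black pixels are filled so that neighbouring glyphs merge into blobs, and each blob's bounding box then gives one candidate segment.

```python
from collections import deque


def smear_row(row, max_gap):
    """Fill white runs (0) of length <= max_gap lying between two black pixels (1)."""
    out = list(row)
    last_black = None
    for x, pixel in enumerate(row):
        if pixel:
            if last_black is not None and 0 < x - last_black - 1 <= max_gap:
                for k in range(last_black + 1, x):
                    out[k] = 1
            last_black = x
    return out


def smear_horizontal(image, max_gap):
    """Apply smear_row to every row of a binary image (list of lists of 0/1)."""
    return [smear_row(row, max_gap) for row in image]


def bounding_boxes(mask):
    """Label 4-connected blobs; return (top, left, bottom, right), inclusive."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy "page": two groups of strokes separated by a wide gap.
page = [
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
]
boxes = bounding_boxes(smear_horizontal(page, max_gap=2))
print(boxes)  # [(0, 0, 1, 2), (0, 7, 1, 9)]
```

With a small `max_gap` the strokes merge into two segments; raising it to 4 bridges the wide gap as well and yields a single segment, which is why the threshold has to be chosen relative to the character and word spacing of the scanned pages.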
Answered by cdalitz on November 21, 2021