Data Science Asked by DGS on October 19, 2020
I am planning to detect text in document images like the one below:
GOAL:
WORK DONE:
I have tried to solve this with scene text detection algorithms such as the EAST text detector and PixelLink. As expected, they detect each word individually, as shown below:
What method can help me detect blocks of text as described under GOAL?
EDIT:
I don't want to extract all the text via OCR. Instead, I want to detect text based on its visual, positional arrangement: in the image, words positioned close together are detected as a single block. My result should contain the bounding box coordinates of all detected text blocks.
I would approach the text block amalgamation as a clustering problem. If you define a suitable distance metric or a neighbour predicate between the individual text boxes, you could group the boxes and then determine their minimum bounding box, which is essentially what you are aiming for.
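To make the grouping idea concrete, here is a minimal sketch in plain Python. It assumes each word box from the detector is an axis-aligned tuple `(x_min, y_min, x_max, y_max)`; the names `are_neighbours` and `group_boxes` and the thresholds `max_dx` and `max_dy` are hypothetical and would need tuning to your font size and line spacing. Boxes linked by the predicate, directly or through a chain of other boxes, end up in the same block.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def gap(a_min: float, a_max: float, b_min: float, b_max: float) -> float:
    """Gap between two 1-D intervals; 0 if they overlap or touch."""
    return max(0.0, max(a_min, b_min) - min(a_max, b_max))


def are_neighbours(a: Box, b: Box, max_dx: float, max_dy: float) -> bool:
    """Hypothetical neighbour predicate: both axis gaps must be small enough."""
    return (gap(a[0], a[2], b[0], b[2]) <= max_dx
            and gap(a[1], a[3], b[1], b[3]) <= max_dy)


def group_boxes(boxes: List[Box], max_dx: float = 20.0, max_dy: float = 10.0) -> List[Box]:
    """Merge neighbouring boxes (transitively) and return one bounding box per group."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that satisfies the predicate.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if are_neighbours(boxes[i], boxes[j], max_dx, max_dy):
                parent[find(i)] = find(j)

    # Collect the members of each group.
    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # The minimum bounding box of a group is the extreme of each coordinate.
    return [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    ]
```

The pairwise loop is quadratic in the number of word boxes, which is usually fine for a single page; a spatial index would only be worth adding for very dense documents.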
I guess DBSCAN could be a suitable clustering algorithm, but the design of the neighbour predicate would need more care. One idea is to treat vertical distance differently from horizontal distance.
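One way this could look with scikit-learn's `DBSCAN` is sketched below. It assumes word boxes in `(x_min, y_min, x_max, y_max)` form and uses a precomputed distance matrix in which the vertical gap is scaled by a hypothetical factor `y_weight`, so vertical and horizontal separation are treated differently. The function names and the values of `eps` and `y_weight` are illustrative assumptions, not tuned settings. With `min_samples=1` no box is labelled as noise, so isolated words become their own block.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def box_distance_matrix(boxes, y_weight=2.0):
    """Pairwise distance between boxes based on edge-to-edge gaps.

    y_weight is an assumed scaling factor: values above 1 make vertical
    gaps count more than horizontal gaps of the same size.
    """
    b = np.asarray(boxes, dtype=float)
    # Gap along x: distance between the nearest vertical edges, 0 on overlap.
    dx = np.maximum(0.0, np.maximum(b[:, None, 0], b[None, :, 0])
                    - np.minimum(b[:, None, 2], b[None, :, 2]))
    # Gap along y: distance between the nearest horizontal edges, 0 on overlap.
    dy = np.maximum(0.0, np.maximum(b[:, None, 1], b[None, :, 1])
                    - np.minimum(b[:, None, 3], b[None, :, 3]))
    return np.hypot(dx, y_weight * dy)


def cluster_text_boxes(boxes, eps=20.0, y_weight=2.0):
    """Cluster word boxes with DBSCAN and return labels plus block bounding boxes."""
    dist = box_distance_matrix(boxes, y_weight)
    labels = DBSCAN(eps=eps, min_samples=1, metric="precomputed").fit_predict(dist)

    b = np.asarray(boxes, dtype=float)
    blocks = []
    for label in np.unique(labels):
        members = b[labels == label]
        # Minimum bounding box enclosing all boxes in this cluster.
        blocks.append((float(members[:, 0].min()), float(members[:, 1].min()),
                       float(members[:, 2].max()), float(members[:, 3].max())))
    return labels, blocks
```

Raising `min_samples` would let DBSCAN discard stray words as noise (label `-1`), which may or may not be desirable depending on whether single-word blocks such as page numbers should be kept.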
Answered by Jan Šimbera on October 19, 2020