How to detect blocks of text in document images

Question

I want to detect blocks of text in document images like the one below:

GOAL:

WORK DONE:
I have tried scene text detection algorithms such as the EAST text detector and PixelLink. As expected, they detect each word individually, as shown below:

What method can help me detect blocks of text as shown under GOAL?

EDIT:

I don't want to extract all the text via OCR. Instead, I want to detect text based on its visual arrangement on the page: as in the image, pieces of text positioned close together should be detected as one block. The result should contain the bounding-box coordinates of every detected text block.

Jan Šimbera · Answer

I would approach the text block amalgamation as a clustering problem. If you define a suitable distance metric or a neighbour predicate between the individual text boxes, you could group the boxes and then determine their minimum bounding box, which is essentially what you are aiming for.
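To make the idea concrete, here is a minimal sketch in plain Python. It assumes the word detector returns axis-aligned boxes as `(x_min, y_min, x_max, y_max)` tuples. The names `are_neighbours`, `group_boxes` and `text_blocks`, and the `max_gap` threshold of 15 pixels, are hypothetical choices for illustration, not part of EAST or PixelLink. Boxes are linked whenever the predicate holds, the linked boxes form groups, and each group is replaced by its minimum bounding box.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def gaps(a: Box, b: Box) -> Tuple[float, float]:
    """Horizontal and vertical gap between two boxes (0 if they overlap on that axis)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return dx, dy


def are_neighbours(a: Box, b: Box, max_gap: float = 15.0) -> bool:
    """Hypothetical neighbour predicate: both gaps are at most max_gap pixels."""
    dx, dy = gaps(a, b)
    return dx <= max_gap and dy <= max_gap


def group_boxes(boxes: List[Box],
                predicate: Callable[[Box, Box], bool] = are_neighbours) -> List[List[int]]:
    """Group box indices into connected components of the neighbour graph (union-find)."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair once; fine for a few thousand word boxes per page.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if predicate(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def bounding_box(boxes: List[Box]) -> Box:
    """Minimum axis-aligned box enclosing all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def text_blocks(boxes: List[Box],
                predicate: Callable[[Box, Box], bool] = are_neighbours) -> List[Box]:
    """One enclosing box per group of neighbouring word boxes."""
    return [bounding_box([boxes[i] for i in g]) for g in group_boxes(boxes, predicate)]


# Made-up word boxes: three words in one paragraph, two words far below it.
words = [(10, 10, 50, 20), (55, 10, 100, 20), (10, 25, 60, 35),
         (10, 200, 70, 210), (75, 200, 120, 210)]
blocks = text_blocks(words)
```

Any other predicate with the same signature can be passed in, so the grouping logic stays unchanged while the notion of "close together" is tuned to the documents at hand.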

DBSCAN could be a suitable clustering algorithm, but more care would have to go into designing the neighbour predicate. For example, vertical distance could be treated differently from horizontal distance.
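As a rough illustration of that last point, the sketch that follows pairs a distance that weights vertical gaps more heavily than horizontal ones with a small textbook-style DBSCAN written in plain Python. It is not a library implementation. The function names, the `(x_min, y_min, x_max, y_max)` box format, and the values `x_scale=1.0`, `y_scale=3.0`, `eps=20.0` and `min_samples=2` are all assumptions that would need tuning on real pages.

```python
import math
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def anisotropic_distance(a: Box, b: Box,
                         x_scale: float = 1.0, y_scale: float = 3.0) -> float:
    """Gap between two boxes, with the vertical gap weighted by y_scale (assumed values)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(x_scale * dx, y_scale * dy)


def dbscan(boxes: List[Box], eps: float = 20.0, min_samples: int = 2,
           distance: Callable[[Box, Box], float] = anisotropic_distance) -> List[int]:
    """Return one cluster label per box; -1 marks noise (boxes left on their own)."""
    n = len(boxes)
    # Each box counts as its own neighbour, since its distance to itself is 0.
    neighbours = [[j for j in range(n) if distance(boxes[i], boxes[j]) <= eps]
                  for i in range(n)]
    labels: List[Optional[int]] = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return [-1 if label is None else label for label in labels]


# Two made-up boxes stacked vertically with a 15-pixel gap between them.
upper, lower = (10, 10, 50, 20), (10, 35, 50, 45)
weighted = dbscan([upper, lower])  # vertical gap counts as 45, above eps
isotropic = dbscan([upper, lower],
                   distance=lambda a, b: anisotropic_distance(a, b, y_scale=1.0))
```

With equal weights the two boxes fall into one cluster, while the heavier vertical weight keeps them apart. That is the kind of control that helps separate paragraphs stacked closely on a page without splitting words on the same line.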
