Training the document page layout and classifying good/bad layouts

Question

I have a use case where I need to get the coordinates of each block element on a page (paragraph, image, or table) and train a model to understand how those elements are placed. Some of the documents have a good layout and others a bad one. After training, I want to feed in the coordinates from a new document and have the model tell me whether its layout is good or bad. How can I achieve this using deep learning techniques?

Can someone suggest an approach for solving this?

I was trying to work it out with an RNN, but I'm not sure that's the correct approach.

cnn deep learning machine learning nlp rnn

Danny · Answer

When you talk about layout, I assume you mean the way elements are arranged on the page. In that case, dividing the page into a grid and representing each element by its grid number should solve your problem. Say the page is divided into a 9 x 9 grid; then you can use a data frame like the one below as input. The grid numbers can be replaced with your coordinates, and you can add more features such as font size, font style, colour, etc.

number  header1  header2  header3  paragraph1  font-size-h1  font-size-h2  target
1       (2,3)    4        (4,2)    3           12            8             0
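
If you start from raw bounding boxes, a rough sketch of how a grid number for such a row could be computed is shown here. The `Block` type, the function names, the choice of the block's centre point, the row-major numbering starting at 0, and the 612 x 792 page size are all my own assumptions for illustration, not something fixed by the approach.

```python
from typing import NamedTuple, Tuple


class Block(NamedTuple):
    """One layout element; coordinates use the same units as the page size."""
    kind: str   # e.g. "header", "paragraph", "image", "table"
    x0: float   # left edge
    y0: float   # top edge
    x1: float   # right edge
    y1: float   # bottom edge


def grid_cell(block: Block, page_w: float, page_h: float, n: int = 9) -> Tuple[int, int]:
    """Return the (row, col) of the n x n grid cell containing the block's centre."""
    cx = (block.x0 + block.x1) / 2
    cy = (block.y0 + block.y1) / 2
    # min() keeps a centre lying exactly on the right/bottom edge inside the grid.
    col = min(int(cx / page_w * n), n - 1)
    row = min(int(cy / page_h * n), n - 1)
    return row, col


def grid_number(block: Block, page_w: float, page_h: float, n: int = 9) -> int:
    """Flatten (row, col) into one number, counting cells row by row from 0."""
    row, col = grid_cell(block, page_w, page_h, n)
    return row * n + col


# Hypothetical header block on a 612 x 792 page (assumed size, in points).
header = Block("header", x0=72, y0=36, x1=540, y1=72)
print(grid_cell(header, 612, 792))    # (0, 4)
print(grid_number(header, 612, 792))  # 4
```

Using the centre point is the simplest option; for large blocks you may prefer to record the cells of two opposite corners so that the size of the element is kept as well.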

The tuple represents the coordinates and the single number represents the grid number. A font size can be added for each element, which increases the dimensionality of the dataset. Maybe try a simpler algorithm like Random Forest first and then move on to deep learning techniques. Sometimes simpler algorithms work better than complex ones.
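
To make the Random Forest suggestion concrete, here is a minimal sketch using scikit-learn. The data is synthetic and the labelling rule is made up purely so the model has something to learn; the column names, the rule, and the convention that 1 means a good layout and 0 a bad one are my assumptions. With real data you would replace `X` and `y` with your own feature table and labels, and a tuple such as (2,3) would need to be split into two numeric columns first.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_pages = 400

# Synthetic stand-in features, one row per page (NOT real layout data).
header_cell = rng.integers(0, 81, size=n_pages)     # grid number on a 9 x 9 grid
paragraph_cell = rng.integers(0, 81, size=n_pages)  # grid number on a 9 x 9 grid
font_size_h1 = rng.integers(8, 25, size=n_pages)
font_size_p = rng.integers(8, 25, size=n_pages)

X = np.column_stack([header_cell, paragraph_cell, font_size_h1, font_size_p])

# Made-up rule for the demo: "good" (1) if the header sits before the
# paragraph in grid order and uses a larger font; otherwise "bad" (0).
y = ((header_cell < paragraph_cell) & (font_size_h1 > font_size_p)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print("held-out accuracy:", accuracy)

# Classify the layout features of a new, unseen page.
new_page = np.array([[4, 40, 18, 11]])
prediction = model.predict(new_page)
print("predicted label:", prediction[0])
```

Once this baseline works on your real labels, `model.feature_importances_` gives a quick view of which layout features matter, which helps decide whether a deep learning model is worth the extra effort.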
