Data Science Asked by Sundeep Pidugu on January 21, 2021
I have a use case where I need to get the coordinates of each block element on a page (paragraph, image or table) and train a model to understand how those elements are placed. Some of my documents have a good layout and others a bad one. After training, I want to pass in the coordinates from a new document and predict whether its layout is good or bad. How can I achieve this with deep learning techniques?
Can someone suggest an approach for solving this?
I was trying to work it out with an RNN, but I am not sure that is the correct approach.
When you talk about layout, I assume you mean the way elements are arranged on the page. In that case, dividing the page into a grid and representing each element by its grid cell should solve your problem. Suppose the page is divided into a 9 x 9 grid; then a data frame like the one below can serve as the input. The grid numbers can be replaced with your raw coordinates, and you can add more features such as font size, font style and colour.
number  header1  header2  header3  paragraph1  font-size-h1  font-size-h2  target
1       (2,3)    4        (4,2)    3           12            8             0
A tuple represents coordinates and a single number represents a grid number; the target column holds the layout label. The font size can be added for each element, which increases the dimensionality of the dataset. Try a simpler algorithm such as Random Forest first, and then move on to deep learning techniques. Simpler algorithms sometimes work better than complex ones.
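As a rough illustration of the grid encoding described above, here is a minimal Python sketch. It assumes a 9 x 9 grid, 1-based cells numbered row by row, and a fixed set of element slots taken from the example table; the function names, the slot list and the choice of 0 for a missing element are all hypothetical, not something the approach prescribes.

```python
from typing import Dict, List, Tuple

GRID_ROWS = 9  # assumed grid size, matching the 9 x 9 example
GRID_COLS = 9

# Hypothetical fixed slot order so every page yields a vector of equal length.
ELEMENT_SLOTS = ["header1", "header2", "header3", "paragraph1"]


def grid_cell(x: float, y: float, page_width: float, page_height: float) -> Tuple[int, int]:
    """Map a point on the page to a 1-based (row, column) grid cell."""
    col = min(int(x / page_width * GRID_COLS), GRID_COLS - 1) + 1
    row = min(int(y / page_height * GRID_ROWS), GRID_ROWS - 1) + 1
    return row, col


def cell_number(row: int, col: int) -> int:
    """Flatten a 1-based (row, column) cell into one grid number, row by row."""
    return (row - 1) * GRID_COLS + col


def layout_features(blocks: List[Dict], page_width: float, page_height: float) -> List[float]:
    """Build one feature row: (grid row, grid column, font size) per slot.

    Each block is a dict with keys "name", "x", "y" and "font_size".
    Slots with no matching block are filled with zeros (an assumption).
    """
    by_name = {block["name"]: block for block in blocks}
    features: List[float] = []
    for slot in ELEMENT_SLOTS:
        block = by_name.get(slot)
        if block is None:
            features.extend([0, 0, 0])
            continue
        row, col = grid_cell(block["x"], block["y"], page_width, page_height)
        features.extend([row, col, block["font_size"]])
    return features
```

Rows built this way, together with a good/bad label per page, could then be passed to a tabular classifier such as scikit-learn's RandomForestClassifier as a first baseline.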
Aesthetic Value of a Webpage, Layout Classification
Answered by Danny on January 21, 2021