Data Science Asked by n.mathfreak on June 24, 2021
I have a lot of documentation in PDF, PPT, and XLS files, which contain tables with text, pictures, headlines, etc. The goal is to extract the blocks of text where the information is continuous.
The lazy method is to copy and paste, or to convert the documents to TXT. I suppose I could also write a Python script that semi-automates some parts. I also found online tools that detect tables, but they are better suited to tables of numbers and values than to tables of text.
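To make the script idea concrete, here is a minimal sketch of what the semi-automated part might look like. It assumes the documents have already been converted to plain text, and that "continuous" blocks show up as runs of lines separated by blank lines. The function name `extract_text_blocks` and the thresholds `min_words` and `max_numeric_ratio` are made up for illustration and would need tuning on real files.

```python
import re


def extract_text_blocks(raw_text, min_words=8, max_numeric_ratio=0.3):
    """Return prose-like blocks from plain text (hypothetical helper).

    Assumption: blocks are separated by blank lines in the converted text.
    """
    blocks = []
    for chunk in re.split(r"\n\s*\n", raw_text):
        # Join wrapped lines and collapse repeated whitespace.
        text = " ".join(chunk.split())
        words = text.split()
        if len(words) < min_words:
            # Too short: probably a headline, caption, or stray table cell.
            continue
        # Count tokens made up only of digits and numeric punctuation.
        numeric = sum(1 for w in words if re.fullmatch(r"[\d.,%+-]+", w))
        if numeric / len(words) > max_numeric_ratio:
            # Mostly numbers: probably a row from a numeric table.
            continue
        blocks.append(text)
    return blocks


sample = (
    "Quarterly Report\n\n"
    "The system processes incoming requests in several stages\n"
    "and writes the results to a shared log file.\n\n"
    "12 45 3.5 78 90 11 23 44 5\n"
)
blocks = extract_text_blocks(sample)
```

With this sample, only the two-line paragraph should survive; the headline is dropped for being too short and the last line for being mostly numeric. This only covers the filtering step, not the conversion from PDF, PPT, or XLS itself.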
What would be a better, faster way to do this?