Which is the best algorithm for entity extraction from unstructured documents?

Question

I have unstructured documents from which I have to extract information such as buyer name, seller name, expiry date, buying date, etc. I had planned to use spaCy with custom entity recognition (I followed this blog: https://medium.com/@manivannan_data/how-to-train-ner-with-custom-training-data-using-spacy-188e0e508c6). But sometimes the buyer name is predicted as the seller name and vice versa, and sometimes multiple pieces of data are wrongly predicted as a single entity when I pass in the whole document content. FYI, these documents are approximately 2-20 pages long, so they have a lot of content.

Can someone share whether there are other packages that give higher accuracy? If not, how should I train the model so that accuracy is higher? Thanks in advance.

user87451 · Answer

Try cleaning your documents and using the flair library. It is a user-friendly library from Zalando Research that lets you do all sorts of NLP tasks very quickly, especially NER.
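
To make that concrete, here is a rough sketch of what "clean, then tag" could look like. The helper names (`clean_text`, `split_sentences`, `tag_entities`) are my own and purely illustrative, and the regex sentence splitter is a naive stand-in for a proper one. Note also that flair's pretrained "ner" model only knows generic labels such as person and organization, so telling buyer from seller would still need a model trained on your own annotated data; treat this as a starting point, not a finished solution.

```python
import re
from typing import List, Tuple


def clean_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks and normalize whitespace."""
    text = re.sub(r"-\n(?=\w)", "", raw)  # "agree-\nment" -> "agreement"
    text = re.sub(r"\s+", " ", text)      # collapse newlines, tabs, repeated spaces
    return text.strip()


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tag_entities(sentences: List[str]) -> List[Tuple[str, str]]:
    """Tag each sentence separately with flair's pretrained English NER model."""
    # Imported here so the cleaning helpers above work without flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load("ner")  # downloads the model on first use
    results = []
    for text in sentences:
        sentence = Sentence(text)
        tagger.predict(sentence)
        for span in sentence.get_spans("ner"):
            results.append((span.text, span.get_label("ner").value))
    return results
```

You would call `tag_entities(split_sentences(clean_text(document_text)))`, where `document_text` is the raw text of one document. Feeding the tagger one sentence at a time rather than a whole 2-20 page document may also help with several values ending up in one entity.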
