Data Science Asked by SanMelkote on August 2, 2020
What is the current state of the art for POS tagging and named entity recognition on Twitter data? Are industrial-strength programs like Spacy and SparkNLP accurate for such texts? How about FlairNLP and Stanford’s CoreNLP accuracy measures?
SOTA changes so rapidly in NLP that even data science professionals struggle to keep up with it. I have two main sources that I check regularly to get a sense of the SOTA:
NLP-progress by Sebastian Ruder. It tracks results across many NLP subfields, NER and POS tagging included.
Papers with Code has a section on NLP. It's a great website for ML in general.
I know these links do not tackle Twitter specifically; however, I don't think that domain is qualitatively different from others. IMO, of course.
About your other question:
Are industrial-strength programs like Spacy and SparkNLP accurate for such texts? How about FlairNLP and Stanford's CoreNLP accuracy measures?
In my opinion, it's mostly a matter of personal preference and/or the needs of the project at hand. There's no right or wrong tool. Personally, I found the Stanford tools to be the best, both for the quality of their predictions and for the number of models available from a single pipeline. But as I said, it's very subjective.
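If you want something less subjective, one option is to score each candidate tool on a small sample of your own tweets. The sketch below is only an illustration under a few assumptions: you have hand-labelled some tweets yourself, and you can wrap each tool in a function that maps a list of tokens to a list of tags. The names `token_accuracy`, `compare`, `gold_tweets` and `baseline_tagger` are hypothetical and not part of any of the libraries mentioned, and the toy data is made up for the example, not taken from a benchmark.

```python
from typing import Callable, Dict, List, Tuple

# A tagged sentence is a list of (token, tag) pairs.
TaggedSentence = List[Tuple[str, str]]
# Assumption: each tool is wrapped as "tokens in, one tag per token out".
Tagger = Callable[[List[str]], List[str]]


def token_accuracy(gold: List[TaggedSentence], tagger: Tagger) -> float:
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = 0
    total = 0
    for sentence in gold:
        tokens = [tok for tok, _ in sentence]
        gold_tags = [tag for _, tag in sentence]
        predicted = tagger(tokens)
        if len(predicted) != len(gold_tags):
            raise ValueError("tagger must return one tag per input token")
        correct += sum(p == g for p, g in zip(predicted, gold_tags))
        total += len(gold_tags)
    return correct / total if total else 0.0


def compare(taggers: Dict[str, Tagger],
            gold: List[TaggedSentence]) -> Dict[str, float]:
    """Score every wrapped tool on the same gold sample."""
    return {name: token_accuracy(gold, fn) for name, fn in taggers.items()}


# Hypothetical hand-labelled tweets (toy data, not a real benchmark).
gold_tweets: List[TaggedSentence] = [
    [("omg", "INTJ"), ("@anna", "PROPN"), ("luv", "VERB"), ("this", "PRON")],
    [("#nlp", "NOUN"), ("is", "AUX"), ("hard", "ADJ")],
]


def baseline_tagger(tokens: List[str]) -> List[str]:
    """Stand-in for a real tool: tags every token as NOUN."""
    return ["NOUN" for _ in tokens]


scores = compare({"all-NOUN baseline": baseline_tagger}, gold_tweets)
print(scores)
```

To compare real tools, you would replace `baseline_tagger` with thin wrappers around each library. Keep in mind that most tools apply their own tokenization, so the wrappers would need to align their output with your gold tokens before scoring. For NER, an entity-level F1 is usually more informative than token accuracy.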
Answered by Leevo on August 2, 2020