Twitter POS and NER: What is state-of-the-art?

Question

What is the current state of the art for POS tagging and named entity recognition on Twitter data? Are industrial-strength programs like spaCy and Spark NLP accurate for such texts? How about FlairNLP and Stanford's CoreNLP accuracy measures?

Leevo · Answer

SOTA is changing so rapidly in NLP that even data science professionals struggle to keep up with it. I have two main sources that I constantly check to gain some insight into SOTA:

NLP-progress by Sebastian Ruder. It tracks results across a whole lot of NLP subfields, NER and POS tagging included.

Papers with Code contains a section on NLP. That's a great website for ML in general.

I know these links do not address Twitter specifically, but I don't think that domain is qualitatively different from others. IMO, of course.

About your other question:

Are industrial-strength programs like Spacy and SparkNLP accurate for such texts? How about FlairNLP and Stanford's CoreNLP accuracy measures?

It's mostly a matter of personal preference and/or project needs. There's no right or wrong tool. Personally, I found the Stanford tools to be the best, both for the quality of their predictions and for the number of models available from a single pipeline. But, again, this is very subjective.
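
Since the choice is subjective, one way to ground it is to score each candidate tool on a small hand-labelled sample of your own tweets. Below is a minimal sketch of that idea, not any library's official evaluation procedure. The names `token_accuracy`, `GOLD`, and `baseline_tagger` are hypothetical, the gold tags are made up for illustration, and it assumes every tool is wrapped as a function that takes a list of tokens and returns one tag per token.

```python
from typing import Callable, List, Sequence, Tuple

# One tweet = list of (token, gold_tag) pairs.
Tagged = List[Tuple[str, str]]


def token_accuracy(
    gold: Sequence[Tagged],
    tagger: Callable[[List[str]], List[str]],
) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for tweet in gold:
        tokens = [tok for tok, _ in tweet]
        predicted = tagger(tokens)
        if len(predicted) != len(tokens):
            raise ValueError("tagger must return exactly one tag per token")
        for (_, gold_tag), pred_tag in zip(tweet, predicted):
            correct += int(gold_tag == pred_tag)
            total += 1
    return correct / total if total else 0.0


# Toy gold sample, invented for illustration only.
GOLD: List[Tagged] = [
    [("lol", "INTJ"), ("@bob", "PROPN"), ("is", "AUX"), ("late", "ADJ")],
    [("omg", "INTJ"), ("#mondays", "NOUN")],
]


def baseline_tagger(tokens: List[str]) -> List[str]:
    # Hypothetical stand-in for a real tool: mentions are PROPN, the rest NOUN.
    return ["PROPN" if t.startswith("@") else "NOUN" for t in tokens]


if __name__ == "__main__":
    score = token_accuracy(GOLD, baseline_tagger)
    print(f"baseline token accuracy: {score:.2f}")
```

To compare real tools, you would replace `baseline_tagger` with a thin wrapper around each pipeline and run them all over the same gold sample. The wrappers need to reuse the gold tokenization and map each tool's tags to the gold tag set, otherwise the scores are not comparable.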
