TransWikia.com

How to deal with annotation errors?

Data Science Asked by Edamame on December 11, 2020

I know my annotators are not perfect and sometimes make mistakes. What is the best way to deal with annotation errors in my training data? Thanks!

One Answer

It's very common to have some amount of errors or inconsistencies in a dataset. Sometimes these inconsistencies are not even errors: in some subjective tasks (e.g. translation), annotators may simply not agree on what the best answer is.

What to do with this kind of noise depends entirely on the case at hand. If the noise caused by these errors represents a reasonably small proportion of the data, it can safely be ignored: in this case it's up to the learning algorithm to distinguish the relevant patterns from the noise. Otherwise, ad-hoc pre-processing can be implemented to clean up the data. In cases where the subjectivity of the annotator plays an important role, it's useful to have several annotators annotate the same data and check the inter-annotator agreement. This can in turn be used to filter out the instances with the least consensus, or to aggregate the annotations in some way (e.g. majority voting).
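To make the last point concrete, here is a minimal sketch of majority voting and a pairwise agreement check, using only the Python standard library. The toy data, the function names (`majority_vote`, `cohen_kappa`) and the 0.5 agreement threshold are assumptions chosen for illustration, not part of any particular library; Cohen's kappa is just one common agreement measure for two annotators.

```python
from collections import Counter


def majority_vote(labels, min_agreement=0.5):
    """Return (label, agreement) for one item.

    The label is None when no label is chosen by more than
    `min_agreement` of the annotators (hypothetical threshold).
    """
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    agreement = count / len(labels)
    if agreement > min_agreement:
        return label, agreement
    return None, agreement


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators who labelled the same items."""
    n = len(a)
    # Fraction of items on which the two annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy data: three hypothetical annotators labelling five items.
annotations = {
    "item1": ["pos", "pos", "pos"],
    "item2": ["pos", "neg", "pos"],
    "item3": ["neg", "neg", "neg"],
    "item4": ["pos", "neg", "neu"],  # no majority: filtered out below
    "item5": ["neg", "neg", "pos"],
}

# Aggregate by majority vote, dropping items without a clear majority.
aggregated = {}
for item, labels in annotations.items():
    label, agreement = majority_vote(labels)
    if label is not None:
        aggregated[item] = label

# Agreement between the first two annotators across all items.
first = [labels[0] for labels in annotations.values()]
second = [labels[1] for labels in annotations.values()]
kappa = cohen_kappa(first, second)

print(aggregated)
print(round(kappa, 3))
```

On this toy data, "item4" is dropped because all three annotators disagree, and the low kappa between the first two annotators would suggest that the guidelines or the task itself are ambiguous. With more than two annotators, a multi-rater measure such as Fleiss' kappa or Krippendorff's alpha is probably a better fit than pairwise Cohen's kappa.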

Answered by Erwan on December 11, 2020
