TransWikia.com

How to filter data samples which do not improve classifier?

Data Science Asked on June 13, 2021

I have a text dataset with noisy labels and an imbalanced class distribution.

There are various ways to identify features that do not improve a given metric, which makes it possible to prune them from the pipeline.

However, I cannot find an equivalent way to discard noisy or confusing samples.

One naive approach would be to retrain the model with a single sample removed each time and check whether the metric improves or stays the same. If it does, that sample is thrown away.
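To make the idea concrete, here is a minimal sketch of that leave-one-out loop. Everything in it is an assumption made for illustration: the toy 2-D points stand in for vectorized text, the nearest-centroid classifier stands in for the real model, validation accuracy stands in for the metric, and the function names and the `drop_if_unchanged` flag are hypothetical.

```python
def fit_centroids(X, y):
    """Toy stand-in model: one mean vector (centroid) per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def accuracy(centroids, X, y):
    """Fraction of samples whose predicted label matches the given label."""
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(X)


def leave_one_out_filter(X_train, y_train, X_val, y_val, drop_if_unchanged=False):
    """Flag training samples whose removal does not hurt the validation metric.

    drop_if_unchanged=True follows the rule as stated in the question
    (flag when the metric improves or stays the same); False flags only
    on a strict improvement.
    """
    baseline = accuracy(fit_centroids(X_train, y_train), X_val, y_val)
    flagged = []
    for i in range(len(X_train)):
        X_rest = X_train[:i] + X_train[i + 1:]
        y_rest = y_train[:i] + y_train[i + 1:]
        if len(set(y_rest)) < len(set(y_train)):
            continue  # never remove the last remaining sample of a class
        score = accuracy(fit_centroids(X_rest, y_rest), X_val, y_val)
        if score > baseline or (drop_if_unchanged and score == baseline):
            flagged.append(i)
    return baseline, flagged


# Toy data: class 0 sits near (0, 0), class 1 near (5, 5).
# The sample at index 3 is deliberately mislabeled as class 0.
X_train = [(0, 0), (1, 0), (0, 1), (10, 10), (5, 5), (6, 5), (5, 6)]
y_train = [0, 0, 0, 0, 1, 1, 1]
X_val = [(0.5, 0.5), (1, 1), (3.5, 3.5), (6, 6)]
y_val = [0, 0, 1, 1]

baseline, flagged = leave_one_out_filter(X_train, y_train, X_val, y_val)
_, flagged_loose = leave_one_out_filter(
    X_train, y_train, X_val, y_val, drop_if_unchanged=True
)
print(baseline, flagged, flagged_loose)
```

On this toy data, the strict rule flags only the mislabeled point, while the "improves or does not change" rule flags every training sample, because removing most individual samples leaves the metric untouched. The loop also needs one retraining per training sample, which becomes expensive with a real model and a large dataset.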

Is there a name for such a technique? Or does it not exist for a good reason? 🙂
