TransWikia.com

Sklearn LocalOutlierFactor contamination parameter

Data Science Asked by sandyp on September 4, 2021

Can anyone provide an intuitive explanation of how the contamination parameter is chosen in sklearn’s LocalOutlierFactor implementation when contamination="auto"?

The sklearn guide suggests “as described in the paper” but I couldn’t find anything obvious. Thanks.

2 Answers

(this answer assumes you were asking about how the offset_ attribute was chosen when contamination="auto")

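With contamination="auto", sklearn sets offset_ to -1.5, so samples whose LOF exceeds 1.5 are flagged as outliers. The only place in the paper that I can conceive of that factor coming from is Section 7.3, where the original authors explored soccer data and say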

Below we discuss all the local outliers with LOF > 1.5 (see table 3), and explain why they are exceptional.
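
To make the link to that 1.5 threshold concrete, here is a minimal sketch. The data is synthetic and the n_neighbors value is an arbitrary choice of mine, not something from the question. It checks that fitting with contamination="auto" leaves offset_ at -1.5 and that the predicted labels are just negative_outlier_factor_ compared against that offset.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (an assumption for illustration): a Gaussian blob
# plus one point placed far away from it.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(size=(100, 2)), [[8.0, 8.0]]])

lof = LocalOutlierFactor(n_neighbors=20, contamination="auto")
labels = lof.fit_predict(X)  # -1 for outliers, 1 for inliers

# With contamination="auto" the threshold is fixed rather than data-driven.
print("offset_:", lof.offset_)

# negative_outlier_factor_ stores -LOF, so "LOF > 1.5" becomes
# "negative_outlier_factor_ < -1.5".
flagged = lof.negative_outlier_factor_ < lof.offset_
print("labels match threshold rule:", np.array_equal(labels == -1, flagged))
print("far point label:", labels[-1])
```

Because the threshold is fixed, the number of flagged points depends on the data rather than on a preset proportion.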

Answered by Tom M. on September 4, 2021

With a floating-point value (in the range (0, 0.5]), you are specifying what proportion of the data you are fitting on is expected to be outliers. Note that 'auto' is not a fixed proportion: in that case the threshold is determined as in the original paper. Older versions defaulted to 0.1, and the documentation carries a "Changed in version 0.22" note saying the default changed from 0.1 to 'auto'.
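
For a float contamination value, a rough sketch of what happens is below. The data is synthetic and the 0.1 setting is only an illustrative choice. As far as I can tell from the scikit-learn source, offset_ is then taken as the corresponding percentile of negative_outlier_factor_, so roughly that fraction of the training samples ends up flagged.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (an assumption for illustration only).
rng = np.random.RandomState(42)
X = rng.normal(size=(200, 2))

contamination = 0.1  # expected proportion of outliers in the training set
lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
labels = lof.fit_predict(X)

# The threshold is data-driven: the 10th percentile of the scores.
expected_offset = np.percentile(lof.negative_outlier_factor_,
                                100.0 * contamination)
print("offset_:", lof.offset_)
print("percentile of scores:", expected_offset)

# About contamination * n_samples points fall below the threshold.
n_flagged = int(np.sum(labels == -1))
print("flagged:", n_flagged, "of", len(X))
```

The practical difference from "auto" is that a float always flags about the requested share of points, even if the data contains no genuine outliers.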

Answered by StevenTheDataGuy on September 4, 2021
