TransWikia.com

What does MiniBatchKMeans fit's reassignment_ratio parameter do exactly?

Data Science Asked on March 2, 2021

I am using scikit-learn MiniBatchKMeans to do text clustering.

The constructor takes a parameter reassignment_ratio, which the documentation describes as follows:

reassignment_ratio : float, default=0.01

Control the fraction of the maximum number of counts for a center to be reassigned. A higher value means that low count centers are more easily reassigned, which means that the model will take longer to converge, but should converge in a better clustering.

I cannot wrap my head around that.

If I raise the reassignment ratio, I raise the "maximum number of counts for a center to be reassigned", so a center will be reassigned only if the number of samples around it (does "counts" stand for "samples" here?) is above this threshold.

Shouldn’t it be the other way around? That is, shouldn’t a center be reassigned when the number of samples around it falls below the threshold set by the reassignment ratio?
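To make the two readings concrete, here is a minimal pure-Python sketch of the "below the threshold" interpretation. It assumes the threshold is reassignment_ratio times the largest per-center count, and that a center becomes a candidate for reassignment when its own count falls below that value. To my understanding this matches the comparison scikit-learn performs internally, but treat that as an assumption to check against the source. The helper name centers_to_reassign and the example counts are hypothetical, and the sketch leaves out everything else the library does, such as limiting how many centers are reassigned per batch and choosing the new center positions.

```python
def centers_to_reassign(counts, reassignment_ratio=0.01):
    """Flag centers whose count is below a fraction of the largest count.

    Hypothetical helper for illustration only, not part of scikit-learn.
    `counts` holds the number of samples assigned to each center so far.
    """
    # Threshold is a fraction of the *maximum* count over all centers.
    threshold = reassignment_ratio * max(counts)
    # A center is a reassignment candidate when its count is below the threshold.
    return [count < threshold for count in counts]


# Made-up counts for four centers: two large, one small, one nearly empty.
counts = [500, 480, 30, 4]

for ratio in (0.01, 0.1):
    flags = centers_to_reassign(counts, ratio)
    print(f"ratio={ratio}: threshold={ratio * max(counts)}, reassign={flags}")
```

Under this reading, raising the ratio raises the threshold, so more low-count centers fall below it. That would be consistent with the documentation's statement that a higher value means low count centers are more easily reassigned: "maximum number of counts" would then mean the largest count among the centers, not an upper limit that a center has to exceed.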
