Geographic Information Systems Asked by tdgsysjrgv on January 5, 2021
For example, I have 100 data points and am classifying them using the Equal Count (Quantile) method.
I want to have 5 classes, and in the ideal situation I will have 100/5 = 20 values in each class. However, I have 35 data values which are 0.
Would QGIS classify all 35 zeros into one class, or split them into 2 classes at random?
I tried this on one dataset and noticed that all the zeros were grouped together. Is that the case every time?
I couldn't find any documentation on how QGIS splits the data for Equal Count.
I tried reading the source code at https://github.com/qgis/QGIS/blob/master/src/core/classification/qgsclassificationmethod.cpp to gain some insight. I can see how the breaks are generated, but I can't find where the values are classified into the separate groups. Can someone tell me where this happens and how the code works for this part?
Here's some insight: https://issues.qgis.org/issues/21451
But in short, items with the same value need to be assigned the same rank, meaning sometimes you'll get differing numbers in each quantile. Think about it this way. If you have 2 first-place teams there won't be any second-place team.
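To make the ranking analogy concrete, here is a minimal Python sketch of "competition" ranking, where tied values share a rank and the following rank is skipped. The function name `competition_rank` is hypothetical and this is not taken from QGIS or from the processing algorithm mentioned below; it only illustrates the idea.

```python
def competition_rank(values):
    """Rank values so that ties share the lowest rank they would occupy."""
    first_position = {}
    # Walk the sorted values and remember the first position of each distinct value.
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    # Return the ranks in the original input order.
    return [first_position[value] for value in values]


# Two values tie for first place, so no value receives rank 2.
print(competition_rank([0, 0, 3, 7]))  # [1, 1, 3, 4]
```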
For my own use, I created a processing algorithm for this that adds and populates a ranking field. I've been meaning to add this to a GitHub repo, and just have. You can find it in its ragged glory here:
https://github.com/davidlgalt/locitools/
Disclaimer: When it comes to PyQGIS & GitHub I am still on the uphill side of the learning curve.
Correct answer by David Galt on January 5, 2021
It looks like the behavior you saw should be reproduced every time. I believe the code you found (and linked) is general and is used by every classification type that QGIS makes available in the Symbology tab. It relies on the calculateBreaks() method, which is defined separately for each classification type.
Here is the link to the equal count (quantile) implementation of calculateBreaks(): https://github.com/qgis/QGIS/blob/master/src/core/classification/qgsclassificationquantile.cpp
The breaks are generated with a fixed formula in the code linked above. The code you found then assigns each data point to the category between two consecutive breaks. Tied values always fall between the same pair of breaks, so they are sorted into the same class.
I'm not totally sure what the formula is doing line by line, or I'd give a better explanation, but it seems to essentially follow the steps described on Statistics How To here: https://www.statisticshowto.com/quantile-definition-find-easy-steps/
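As a rough illustration of the break-then-assign behavior described above, here is a small Python sketch using the numbers from the question (100 values, 35 of them zero, 5 classes). It assumes one common quantile definition (linear interpolation between neighboring sorted values) and assumes each value goes into the first class whose upper break is not below it. The names `quantile_breaks` and `classify` are hypothetical, and none of this is confirmed to match the QGIS source line by line.

```python
from bisect import bisect_left
from collections import Counter


def quantile_breaks(values, n_classes):
    """Return the upper break of each class.

    Assumption: quantiles are linearly interpolated between neighboring
    sorted values; QGIS may use a different definition.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        return []
    breaks = []
    for i in range(1, n_classes):
        if n > 1:
            pos = i / n_classes * (n - 1)   # fractional index of the quantile
            lower = int(pos)
            frac = pos - lower
            upper = min(lower + 1, n - 1)
            breaks.append((1 - frac) * data[lower] + frac * data[upper])
        else:
            breaks.append(data[0])
    breaks.append(data[-1])                 # the last class ends at the maximum
    return breaks


def classify(value, breaks):
    """Index of the first class whose upper break is >= value (assumed rule)."""
    return min(bisect_left(breaks, value), len(breaks) - 1)


values = [0] * 35 + list(range(1, 66))      # 35 zeros plus 65 distinct values
breaks = quantile_breaks(values, 5)
counts = Counter(classify(v, breaks) for v in values)
print(breaks)
print([counts[k] for k in range(5)])        # [35, 5, 20, 20, 20]
```

Under these assumptions the first break is 0, so all 35 zeros land in the first class and the second class ends up with only 5 values. Identical values can never be split, because each value is compared against the same fixed breaks.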
Answered by Randcelot on January 5, 2021