
Binary classification of similar images with small region of interest

Data Science Asked by TasosGlrs on March 5, 2021

I have a dataset of microscope images and I want to train an ML/DL algorithm to perform binary classification. The positive class is when there is only one cell in the image, and the negative class is everything else (i.e. when there are either more than one cell or no cells at all).

Below is one of the original images (there is a cell in the curved site at the center of the image).

[original image]

Due to the large size of the images (2048×2048) and the excess of irrelevant information (the cells can only be in the tube system), I decided to preprocess them: I set everything outside the tube system to 0 (black) and crop all the images to the boundaries I obtained by averaging the images of the whole dataset. Below you can see the end result (there are two cells in the tube, one in the center and one in the upper left part).

[preprocessed image]
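For reference, here is a rough sketch of that preprocessing step, assuming the tube-system mask is available as a binary image; the file names and the way the crop box is derived below are simplified placeholders, not my exact pipeline:

```python
import numpy as np
import cv2

# Hypothetical file names; the tube-system mask is assumed to be a
# precomputed binary image (non-zero inside the tubes, 0 outside).
image = cv2.imread("original_2048.png", cv2.IMREAD_GRAYSCALE)
tube_mask = cv2.imread("tube_mask.png", cv2.IMREAD_GRAYSCALE) > 0

# Set everything outside the tube system to 0 (black).
masked = np.where(tube_mask, image, 0).astype(np.uint8)

# Crop to the bounding box of the mask; in my case the boundaries
# were actually obtained by averaging over the whole dataset.
ys, xs = np.nonzero(tube_mask)
y0, y1 = ys.min(), ys.max() + 1
x0, x1 = xs.min(), xs.max() + 1
cropped = masked[y0:y1, x0:x1]

cv2.imwrite("preprocessed.png", cropped)
```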

Then I tried to train a CNN (using Python and TensorFlow). I experimented a few times with different hyperparameters, but had no luck. I think the problem is that the cells (the region of interest) occupy a very small portion of the image, which makes it hard for the algorithm to focus on them. To make things worse, there are random dust particles around, which make the images noisy.
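The kind of plain Keras CNN I mean is sketched below; the input size, filter counts and learning rate are placeholders, not the exact values I tried:

```python
import tensorflow as tf

# Sketch of a small binary-classification CNN. All sizes and
# hyperparameters here are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512, 512, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```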

Do you have any ideas on how I could perform a robust binary classification on such a dataset?

3 Answers

I would use CLAHE preprocessing and SIFT image features, and mask out the wrongly detected keypoints using geometric constraints. Then I would count the SIFT keypoints in an image without any cells to get a threshold boundary, and classify by the number of robust keypoints or with logistic regression.
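A rough sketch of that pipeline with OpenCV (the mask handling and the threshold value are placeholders, and it needs OpenCV >= 4.4 for the built-in SIFT):

```python
import cv2

def robust_keypoint_count(gray, tube_mask):
    """Count SIFT keypoints inside the tube system after CLAHE.

    `tube_mask` is assumed to be a binary uint8 image (255 inside
    the tubes, 0 outside) acting as the geometric constraint.
    """
    # Contrast-limited adaptive histogram equalization to boost
    # the faint cells against the background.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)

    # Detect SIFT keypoints only inside the tube mask.
    sift = cv2.SIFT_create()
    keypoints = sift.detect(enhanced, mask=tube_mask)
    return len(keypoints)

def classify(gray, tube_mask, threshold=5):
    """Threshold on the keypoint count (placeholder value); a
    logistic regression on the count would be the alternative."""
    return 1 if robust_keypoint_count(gray, tube_mask) >= threshold else 0
```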

Answered by znarf on March 5, 2021

I would still stick with a CNN for this specific application. Think of CNNs being used to detect various types of cancer in noisy images with impressive precision (Stanford, Google). That type of input is actually very similar to yours, with cancer cells hiding in a cluster of healthy ones, and yet the models perform as well as cancer experts in some cases.

CNNs have been shown to work best when trained with a HUGE amount of data. If possible, try to provide more training data with a decent class distribution (roughly the same number of positive and negative examples). Moreover, apart from tuning hyperparameters, you could also experiment with different CNN architectures; you will find plenty of inspiration in the literature.
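One common way to stretch a small dataset, in addition to collecting more images, is on-the-fly augmentation; a minimal Keras sketch (the specific transforms and their parameters are only suggestions, not something required):

```python
import tensorflow as tf

# Random geometric and photometric transforms applied during
# training only; parameters are placeholders.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomTranslation(0.05, 0.05),
    tf.keras.layers.RandomContrast(0.1),
])

# Applied on the fly to a tf.data pipeline of (image, label) pairs:
# train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```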

Answered by tony on March 5, 2021

I have been working on a similar project recently. The object that needs to be classified is small, and I am using fine-tuning, which helps against overfitting because my dataset is small (1500+ images).
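Roughly, the fine-tuning setup I mean looks like the sketch below; the backbone choice, input size and dropout rate are placeholders, and grayscale inputs would need to be repeated to three channels for the ImageNet weights:

```python
import tensorflow as tf

# Fine-tuning sketch: reuse a pretrained backbone and train only a
# small head. Backbone and input size are placeholders, not my
# exact setup.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the backbone to limit overfitting

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```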

However, when I input the whole image into the network, it just doesn't work.

The explanation for this could be that a CNN is essentially a process of downsampling: when your region of interest (ROI) is small, there is a high chance that its information is lost by the end of the CNN layers.
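As a quick back-of-the-envelope illustration (the cell size and the number of stride-2 stages are made-up numbers):

```python
# Toy illustration: how a small ROI shrinks through stride-2 stages.
image_size = 2048       # input side length in pixels
cell_size = 30          # rough diameter of one cell in pixels (made up)
num_stride2_stages = 5  # e.g. five pooling / strided-conv layers

downsample = 2 ** num_stride2_stages    # 32x reduction
feature_map = image_size // downsample  # 64 x 64 feature map
cell_on_map = cell_size / downsample    # ~0.94 pixels

# The cell covers less than one unit of the final feature map.
print(feature_map, cell_on_map)
```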

What I would suggest is that you crop the training data to the area you are interested in; this helps the CNN know where to learn. At test time, crop the test data the same way before feeding it into the CNN. This way you will have a better chance of knowing how many cells are in the whole image.
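A minimal sketch of that cropping, with patches extracted around annotated cell locations for training and the test image tiled into overlapping patches (patch size and stride are placeholders):

```python
import numpy as np

PATCH = 256  # patch side length in pixels (placeholder)

def crop_training_patch(image, center_y, center_x, patch=PATCH):
    """Crop a fixed-size patch around an annotated cell location."""
    h, w = image.shape[:2]
    y0 = int(np.clip(center_y - patch // 2, 0, h - patch))
    x0 = int(np.clip(center_x - patch // 2, 0, w - patch))
    return image[y0:y0 + patch, x0:x0 + patch]

def tile_test_image(image, patch=PATCH, stride=PATCH // 2):
    """Tile a test image into overlapping patches for classification."""
    h, w = image.shape[:2]
    patches = []
    for y0 in range(0, h - patch + 1, stride):
        for x0 in range(0, w - patch + 1, stride):
            patches.append(image[y0:y0 + patch, x0:x0 + patch])
    return np.stack(patches)
```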

I did the same thing in my project and am able to achieve 90% accuracy on cropped data versus 80% on whole images. If you figure out a better or more efficient way, please share it with me if possible.

Answered by Wenxiao Zhan on March 5, 2021
