I started looking at Kaggle competitions to practice my machine learning skills. One of the currently running competitions is framed as an image classification problem: Intel partnered with MobileODT to launch a Kaggle competition to develop an algorithm that identifies a woman’s cervix type from images.
The training set contains 1481 images split into three types. Kagglers can use 6734 additional images, though some of them come from duplicate patients and some are of lower quality. Test sets for the two stages of the competition are available. Kagglers have to submit a set of predicted probabilities, one for each of the 3 classes, for each image in the test set. The total prize pool is $100,000.
I tried to approach the problem in a naïve way: just get a pre-trained Inception V3 image classification model and fine-tune it on this dataset.
Philipp Schmidt published a Cervix EDA notebook that explores the basic properties of the dataset.
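I loaded all labeled images and resized them to 224×224. (Inception V3’s default input size is 299×299, but the Keras implementation accepts other sizes when the top classification layer is removed.) I then shuffled the images and split them into train and dev sets in an 80/20 proportion.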
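The shuffle-and-split step can be sketched roughly as follows. This is only an illustration, not the code from the post: the function name `split_train_dev`, the fixed seed, and the example file paths are all hypothetical, and image loading and resizing are left out.

```python
import random


def split_train_dev(examples, dev_fraction=0.2, seed=0):
    """Shuffle examples and split them into (train, dev) lists."""
    shuffled = list(examples)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_dev = round(len(shuffled) * dev_fraction)
    n_train = len(shuffled) - n_dev
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical (path, label) pairs; the real file layout may differ.
examples = [("images/img_%d.jpg" % i, i % 3) for i in range(10)]
train, dev = split_train_dev(examples)
```

Fixing the seed keeps the dev set identical between runs, which makes validation numbers from different experiments comparable.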
The Inception V3 model and its weights, pre-trained on the ImageNet dataset, were loaded using Keras. I removed the top classification layer and added a new dense layer with dropout and a softmax layer on top. I froze all Inception layers and trained the new dense layers first. Then I unfroze the last two convolutional blocks of Inception and fine-tuned them as well. The model was trained on 80% of the labeled data and validated on the remaining 20%.
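A minimal sketch of this two-stage setup with the Keras API bundled in TensorFlow might look like the code below. It is not the code from the post. The pooling layer, the dense width of 256, the dropout rate of 0.5, the optimizers, and the learning rate are all assumptions, as are the function names. Locating the last two blocks by the layer name `mixed8` is also an assumption about how Keras names the Inception V3 layers, so it is worth checking against `model.summary()`.

```python
def build_model(weights="imagenet", input_shape=(224, 224, 3), num_classes=3):
    """Frozen Inception V3 base with a new dense + dropout + softmax head."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.keras.applications import InceptionV3
    from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
    from tensorflow.keras.models import Model

    base = InceptionV3(weights=weights, include_top=False, input_shape=input_shape)
    for layer in base.layers:
        layer.trainable = False  # stage 1: only the new head is trained

    x = GlobalAveragePooling2D()(base.output)  # pooling choice is an assumption
    x = Dense(256, activation="relu")(x)       # width 256 is an assumption
    x = Dropout(0.5)(x)                        # rate 0.5 is an assumption
    outputs = Dense(num_classes, activation="softmax")(x)

    model = Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def unfreeze_last_two_blocks(model):
    """Stage 2: make everything after the 'mixed8' layer trainable."""
    from tensorflow.keras.optimizers import SGD

    names = [layer.name for layer in model.layers]
    cut = names.index("mixed8") + 1  # assumed boundary before the last two blocks
    for layer in model.layers[:cut]:
        layer.trainable = False
    for layer in model.layers[cut:]:
        layer.trainable = True

    # Recompile so the change takes effect; a low learning rate is assumed
    # here to avoid destroying the pre-trained weights.
    model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

In use, one would call `model.fit` on the training split after `build_model`, then call `unfreeze_last_two_blocks` and fit again, passing the dev split as `validation_data` each time. The labels need to be one-hot encoded for this loss, and the pixels are usually scaled with `preprocess_input` from `tensorflow.keras.applications.inception_v3`.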
The results were not great. Validation loss doesn’t go lower than 0.95, and the model overfits quickly. I got 54.5% accuracy on the validation set.
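To put the loss in context, assume it is the usual categorical cross-entropy with natural logarithms (the post does not say so explicitly). Under that assumption, a model that always predicts 1/3 for each of the three classes scores ln 3, which the small sketch below computes. The helper name `mean_log_loss` is made up for illustration.

```python
import math


def mean_log_loss(probability_rows, true_labels):
    """Average negative log-probability assigned to the true class."""
    losses = [-math.log(row[label])
              for row, label in zip(probability_rows, true_labels)]
    return sum(losses) / len(losses)


# A "know-nothing" prediction: equal probability for each of the 3 classes.
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 4
baseline = mean_log_loss(uniform, [0, 1, 2, 0])  # equals ln(3), about 1.0986
```

Seen this way, a validation loss of 0.95 is only a modest improvement over guessing uniformly.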
My code is available here.
As you can see in the discussions on Kaggle (1, 2, 3), it’s hard for an untrained human to classify these images. See a short tutorial by visoft on how to (humanly) recognize cervix types.
Low image quality makes it harder. Another challenge is the small size of the dataset.
It looks like the best way forward is to split the problem into two steps: image segmentation to find the cervix in the image, and then image classification. The segmentation step requires manual review of the training examples to mark bounding boxes. Illumination correction is another thing to try. Data augmentation (rotation, flipping) can help to increase the number of training examples.
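As a rough illustration of the augmentation idea, the sketch below generates the eight rotation and flip variants of a square image stored as a NumPy array. I did not run this in the experiment above. The function name `rotations_and_flips` is hypothetical, and the sketch only covers 90-degree rotations and mirror flips, not arbitrary angles.

```python
import numpy as np


def rotations_and_flips(image):
    """Return the 8 variants of an H x W x C image: 4 rotations, each mirrored."""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k=k, axes=(0, 1))  # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))          # left-right mirror image
    return variants


# Tiny stand-in for a real 224x224 RGB image.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
augmented = rotations_and_flips(image)
```

Each variant keeps the label of the original image, so a labeled set can grow up to eightfold this way, although the variants are strongly correlated and do not replace genuinely new data.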
At this point, I think I have a good feel for what it is like to work on image classification problems. This one is too hard for me to compete in right now, and computer vision is not my area of focus. I’ll go ahead and check out other competitions. This one looks interesting: Two Sigma Connect: Rental Listing Inquiries. It is a classification problem, and the dataset includes structured data, text, and images.