🏆 How did Thomas Friedel win the Mosquito Alert Challenge 2023

Sneha Nanavati
Published in AIcrowd
4 min read · Jan 8, 2024


Thomas Friedel’s journey into AI and machine learning began in academia in the mid-2000s, with a focus on computer vision. His work at Iris GmbH sharpened his skills in neural networks. Leading Plantix’s machine learning team since 2018, he found a new intersection in AI and agronomy. The Mosquito Identification Challenge, combining data science and public health impact, caught his attention.

The Mosquito Identification Challenge
Mosquitoes, carriers of diseases like Zika and Dengue, are a global health concern. The challenge aimed to improve the process of identifying these insects, making it faster, more efficient, and accurate. Controlling mosquito populations is vital to prevent disease outbreaks and protect communities worldwide.

The competition centered around developing cutting-edge AI solutions that precisely identify mosquitoes within images captured by citizen contributors using their mobile devices. The dataset provided was diverse, featuring real images contributed by citizens showcasing mosquitoes in various contexts, encompassing different body positions, sizes, and lighting conditions. Participants needed to construct robust models that handle the dataset’s unbalanced distribution of mosquito classes, ensuring accurate detection and classification across all categories.

The Task at Hand
Thomas and his peers were tasked with developing AI solutions to identify mosquitoes from diverse images. These images, from citizen scientists, showed mosquitoes in various poses and settings. Thomas’s expertise in machine learning and computer vision came into play, in a challenge that blended citizen science with the power of AI.

Thomas Friedel

Overcoming the Odds
In the Mosquito Identification Challenge, Thomas Friedel ran into several issues with the dataset:

  • Inconsistent Labeling: Not every mosquito had its own bounding box, and some boxes enclosed multiple mosquitoes.
  • Variable Box Tightness: The varying tightness of bounding boxes around mosquitoes made it challenging to meet the 0.75 IoU threshold for all labeled objects.
  • EXIF Orientation Tag: Some images were incorrectly marked as rotated, causing misalignment with corresponding bounding boxes.
  • Class Imbalance: Minority classes like “aegypti” had roughly 100 times fewer images than the majority classes.
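To see why box tightness matters at a 0.75 IoU threshold, here is a minimal sketch of the standard intersection-over-union calculation. The box coordinates are hypothetical and the (x_min, y_min, x_max, y_max) format is an assumption, not the challenge's actual annotation format.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: a tight 100x100 box versus a label drawn 10 px looser on each side.
tight = (100, 100, 200, 200)
loose = (90, 90, 210, 210)
print(round(iou(tight, loose), 3))  # 10000 / 14400, below the 0.75 threshold
```

In this example a perfectly centered prediction still misses the threshold because the label is only 10 pixels looser per side, which illustrates how inconsistent annotation tightness can cap the achievable score.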

Crafting the Winning Strategy
Thomas’s primary goal in the image classification phase was to find a model that balanced accuracy with CPU-based inference speed. After evaluating various models from the TIMM training pipeline and monitoring results with Weights & Biases, he chose ConvNeXt V2-Base for its balance of high accuracy and efficient CPU performance.

Pretraining
Instead of using pretrained model weights from ImageNet, Thomas pretrained the model on related iNaturalist data. He started with an image size of 224x224 on approximately 8 million images from iNaturalist and progressively increased the image size to 400x400. He used the Prodigy optimizer, which achieves accuracy similar to AdamW without needing hyperparameter optimization. Thomas then fine-tuned the model on a subset of the iNaturalist dataset containing images of flies.

Mosquito Image Classification Model
To construct a robust mosquito image dataset, Thomas processed images from Mosquito Alert, cropping them to the given bounding boxes with a 20-pixel padding. He addressed class imbalance by oversampling minority classes. For mosquito images from iNaturalist, which came without bounding boxes, he used the trained object detection model to generate them.
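The cropping and oversampling steps could be sketched roughly as follows. This is an illustration under assumptions, not Thomas's actual code: the function names are hypothetical, clamping the padded box to the image bounds is an assumption, and inverse-frequency sampling weights are just one common way to oversample minority classes.

```python
from collections import Counter


def padded_crop_box(box, image_size, padding=20):
    """Expand an (x_min, y_min, x_max, y_max) box by `padding` pixels,
    clamped so the crop stays inside the image (clamping is an assumption)."""
    x1, y1, x2, y2 = box
    width, height = image_size
    return (
        max(0, x1 - padding),
        max(0, y1 - padding),
        min(width, x2 + padding),
        min(height, y2 + padding),
    )


def oversampling_weights(labels):
    """Per-sample weights inversely proportional to class frequency,
    so every class carries the same total sampling weight."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Hypothetical data: one box near the image border and a tiny imbalanced label list.
print(padded_crop_box((10, 30, 100, 120), image_size=(110, 200)))
print(oversampling_weights(["common", "common", "common", "rare"]))
```

The resulting weights can be passed to a weighted sampler, for example `random.choices(samples, weights=weights, k=n)`, so that rare classes are drawn about as often as common ones.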

In the model training phase, Thomas explored various models with different image augmentation settings. He combined the top-performing models through a technique called “model soup,” which involves averaging the weights of the models to enhance performance without additional computational costs. The image augmentation strategy included default TIMM training pipeline augmentations and additional techniques like cutmix, mixup, random erasing, label smoothing, and scaling.
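The weight averaging behind a model soup can be sketched as below. This is a simplified illustration: the weights are plain Python lists rather than framework tensors, the parameter names are hypothetical, and uniform averaging is an assumption, since the article does not say which soup variant was used.

```python
def uniform_soup(state_dicts):
    """Average parameters with matching names across models that share
    one architecture. Each state dict maps a name to a flat list of floats."""
    if not state_dicts:
        raise ValueError("need at least one model")
    n = len(state_dicts)
    soup = {}
    for key in state_dicts[0]:
        # Group the i-th value of this parameter from every model, then average.
        columns = zip(*(sd[key] for sd in state_dicts))
        soup[key] = [sum(col) / n for col in columns]
    return soup


# Hypothetical weights from two fine-tuned runs of the same architecture.
model_a = {"head.weight": [1.0, 2.0], "head.bias": [0.0]}
model_b = {"head.weight": [3.0, 4.0], "head.bias": [1.0]}
print(uniform_soup([model_a, model_b]))
```

Because the result is a single set of weights, inference costs the same as running one model, unlike a conventional ensemble that has to run every member.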

Challenges and Learnings
Thomas’s experience in the Mosquito Challenge was marked by experimentation and learning. Attempts to improve results through dataset cleaning with Cleanlab, higher resolution, YOLOv8, test-time augmentation, and increased padding did not yield significant gains. He realized the importance of a good test set and of feedback from the public leaderboard, despite the risk of overfitting to it. Thomas also learned that noisy feedback signals could be unreliable guides for further improvements.

Reflecting on the Journey
Thomas Friedel’s success in the Mosquito Identification Challenge highlights AI’s role in public health. He navigated technical challenges and learned about data quality and method limitations. This experience underlines a key lesson in AI research: balancing innovation and practicality.

As we continue to unveil new challenges with real-world impact, like the Commonsense Persona-Grounded Dialogue Challenge 2023, we invite you to join us. This challenge is a playground for testing and expanding your skills in natural conversation understanding using AI. Join us in this exciting exploration of AI possibilities, where your contributions can help shape the future of technology.

