Posted by Yang Song, Staff Software Engineer and Serge Belongie, Visiting Faculty, Google Research
Thanks to recent advances in deep learning, the visual recognition abilities of machines have improved dramatically, permitting the practical application of computer vision to tasks ranging from pedestrian detection for self-driving cars to expression recognition in virtual reality. One area that remains challenging for computers, however, is fine-grained and instance-level recognition. Earlier this month, we posted an instance-level landmark recognition challenge for identifying individual landmarks. Here we focus on fine-grained visual recognition, which is to distinguish species of animals and plants, car and motorcycle models, architectural styles, etc. For computers, discriminating fine-grained categories is challenging because many categories have relatively few training examples (i.e., the long tail problem), the examples that do exist often lack authoritative training labels, and there is variability in illumination, viewing angle and object occlusion.
To help confront these hurdles, we are excited to announce the 2018 iNaturalist Challenge (iNat-2018), a species classification competition offered in partnership with iNaturalist and Visipedia (short for Visual Encyclopedia), a project for which Caltech and Cornell Tech received a Google Focused Research Award. This is a flagship challenge for the 5th International Workshop on Fine Grained Visual Categorization (FGVC5) at CVPR in 2018. Building upon the first iNaturalist challenge, iNat-2017, iNat-2018 spans over 8000 categories of plants, animals, and fungi, with a total of more than 450,000 training images. We invite participants to enter the competition on Kaggle, with final submissions due in early June. Training data, annotations, and links to pretrained models can be found on our GitHub repo.
iNaturalist has emerged as a world leader for citizen scientists to share observations of species and connect with nature since its founding in 2008. It hosts research-grade photos and annotations submitted by a thriving, engaged community of users. Consider the following photo from iNaturalist:
|The map on the right shows where the photo was taken. Image credit: Serge Belongie.|
You may notice that the photo on the left contains a turtle. But did you also know this is a Trachemys scripta, common name “Pond Slider?” If you knew the latter, you possess knowledge of fine-grained or subordinate categories.
In contrast to other image classification datasets such as ImageNet, the dataset in the iNaturalist challenge exhibits a long-tailed distribution, with many species having relatively few images. It is important to enable machine learning models to handle categories in the long-tail, as the natural world is heavily imbalanced – some species are more abundant and easier to photograph than others. The iNaturalist challenge will encourage progress because the training distribution of iNat-2018 has an even longer tail than iNat-2017.
|Distribution of training images per species for iNat-2017 and iNat-2018, plotted on a log-log scale, illustrating the long-tail behavior typical of fine-grained classification problems. Image Credit: Grant Van Horn and Oisin Mac Aodha.|
Along with iNat-2018, FGVC5 will also host the iMaterialist 2018 challenge (including a furniture categorization challenge and a fashion attributes challenge for product images) and a set of “FGVCx” challenges representing smaller scale – but still significant – challenges, featuring content such as food and modern art.
FGVC5 will be showcased on the main stage at CVPR 2018, thereby ensuring broad exposure for the top performing teams. This project will advance the state-of-the-art in automatic image classification for real world, fine-grained categories, with heavy class imbalances, and large numbers of classes. We cordially invite you to participate in these competitions and help move the field forward!
We’d like to thank our colleagues and friends at iNaturalist, Visipedia, and FGVC5 for working together to advance this important area. At Google we would like to thank Hartwig Adam, Weijun Wang, Nathan Frey, Andrew Howard, Alessandro Fin, Yuning Chai, Xiao Zhang, Jack Sim, Yuan Li, Grant Van Horn, Yin Cui, Chen Sun, Yanan Qian, Grace Vesom, Tanya Birch, Wendy Kan, and Maggie Demkin.