Cornell University
Library
eCommons

Improving Flexibility and Performance in Metric-Based Few-Shot Classification

File(s)
Wertheimer_cornellgrad_0058F_12935.pdf (24.54 MB)
Permanent Link(s)
https://doi.org/10.7298/vykb-a332
https://hdl.handle.net/1813/111812
Collections
Cornell Theses and Dissertations
Author
Wertheimer, Davis
Abstract

Neural image classifiers have surpassed human performance and attained widespread usage, but rely crucially on access to hundreds if not thousands of labeled images for each category of interest. This assumed high level of image availability is not always realistic. Classes of interest might be rare, or require expensive expert annotation. These concerns have fueled interest in few-shot classification, or the ability to classify novel object types using very few (only one to five) reference images per class. Performance on few-shot classification benchmarks has since seen steady improvement and the field remains a healthy area of research. Unfortunately, the standard few-shot classification benchmarks also exhibit unrealistic assumptions. It is frequently assumed that salient objects are nicely centered and cropped, and that classes are visually distinct. At deployment, it is assumed that only five relevant classes will be present at any given time, that exactly one or five reference images will be available per class, and that the practitioner will always know which one of these it will be in advance. In real-world conditions, any or all of these assumptions may break; practical few-shot classification will require sufficient power and flexibility to handle these scenarios. In this work, we show that existing few-shot classifiers underperform when these assumptions are broken. However, through three novel techniques, we can restore and further improve model performance. We provide proof of concept using a novel few-shot classification benchmark more closely reflecting real-world conditions, then investigate each technique in further detail. First, we incorporate inexpensive location annotations during training, to better isolate regions of interest when relevant objects are not scaled, centered or viewer-oriented. Second, we leverage relationships among and between image components during classification to produce more powerful classifiers when classes are visually similar. Third, we reformulate the few-shot training process to handle greater levels of reference image availability; this facilitates a large-scale study on the effect of availability upon classifier performance. We find that a simple alteration to built-in distance metrics restores consistent performance when reference image availability does not match training conditions: we no longer need to know the degree of availability in advance. Together, these findings improve the flexibility and power of few-shot classifiers, and establish a valuable starting point for deployment in messy, real-world conditions.
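
For readers unfamiliar with the setting, the sketch below illustrates what metric-based few-shot classification generally looks like: each class is summarized by the mean of its few reference feature vectors, and a query is assigned to the nearest class under a distance metric. This is a generic nearest-centroid illustration, not the dissertation's own method. The function names, the toy 2-D features, the class labels, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import defaultdict


def class_centroids(support):
    """Average the reference (support) feature vectors of each class.

    `support` is a list of (label, feature_vector) pairs. Classes may have
    different numbers of reference examples.
    """
    grouped = defaultdict(list)
    for label, features in support:
        grouped[label].append(features)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(query, centroids):
    """Return the label whose centroid is nearest to the query (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(query, centroids[label]))


# Toy episode with hand-made 2-D "features" (hypothetical values); a real
# system would obtain features from a trained neural network.
support = [
    ("cat", [0.9, 0.1]),
    ("cat", [1.1, -0.1]),
    ("dog", [-1.0, 0.2]),
]
centroids = class_centroids(support)
prediction = classify([0.8, 0.0], centroids)
```

Here one class has two reference examples and the other has one, a small instance of the uneven reference image availability that the abstract describes.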

Description
204 pages
Date Issued
2022-05
Keywords
Deep Learning • Few-Shot • Image Classification • Image Recognition • Machine Learning
Committee Chair
Hariharan, Bharath
Committee Member
Joachims, Thorsten
Bala, Kavita
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution-ShareAlike 4.0 International
Rights URI
https://creativecommons.org/licenses/by-sa/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/15530001
