In a paper scheduled to be presented at the upcoming International Conference on Learning Representations, Amazon researchers propose an AI approach that greatly improves performance on certain meta-learning tasks (i.e., tasks that involve both accomplishing related goals and learning how to learn to perform them). They say it can be adapted to new tasks with only a handful of labeled training examples, meaning a large corporation could use it to, for example, extract charts and captions from scanned paperwork.
In conventional machine learning, a model trains on a set of labeled data (a support set) and learns to correlate features with the labels. It’s then fed a separate set of test data (a query set) and evaluated on how well it predicts that set’s labels. In meta-learning, by contrast, the model is trained on a collection of tasks, each with its own support set and query set, and during this first stage, called meta-training, it sees the labels of both. In this way, the model learns how particular ways of responding to the support data affect performance on the query data.
During a second stage, called meta-testing, the model is adapted to tasks that are related but not identical to those it saw during meta-training. For each task, it once again sees both a support set and a query set, but now it can access only the support set labels; the query set labels are withheld and must be predicted.
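To make the episode structure concrete, here is a minimal Python sketch of the two stages. The `Episode` container and the `model.adapt` interface are illustrative assumptions, not the paper’s actual API:

```python
from dataclasses import dataclass
from typing import List

import torch


@dataclass
class Episode:
    """One task: a labeled support set plus a query set.

    At meta-training time the learner may also use query_y;
    at meta-testing time query_y is withheld and must be predicted.
    """
    support_x: torch.Tensor  # support inputs, e.g. images
    support_y: torch.Tensor  # support labels (always visible)
    query_x: torch.Tensor    # query inputs
    query_y: torch.Tensor    # query labels (hidden at meta-test time)


def meta_train(model, optimizer, episodes: List[Episode]) -> None:
    """First stage: adapt on each task's support set, then score the
    adapted model on the query set, whose labels are visible here, so
    the model learns how its response to support data affects query
    performance."""
    for ep in episodes:
        adapted = model.adapt(ep.support_x, ep.support_y)  # hypothetical inner step
        loss = adapted.loss(ep.query_x, ep.query_y)        # outer objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


def meta_test(model, episode: Episode) -> torch.Tensor:
    """Second stage: adapt on the support set of an unseen task and
    predict the query labels, which are never observed."""
    adapted = model.adapt(episode.support_x, episode.support_y)
    return adapted.predict(episode.query_x)
```

The point the sketch captures is that query labels enter the meta-training loss but are never visible at meta-test time.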
The researchers’ technique doesn’t learn a single global model during meta-training. Instead, it trains an auxiliary network to generate a local model for each task, drawing on the corresponding support set. That auxiliary network is also trained, during meta-training, to leverage the unlabeled data of the query sets; during meta-testing, the query sets are used to fine-tune the local models.
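A rough sketch of that two-part idea, again in Python: here the auxiliary network `aux_net` is assumed to emit the weights of a per-task linear classifier head over a shared `encoder`, and entropy minimization stands in for the unsupervised fine-tuning objective on the unlabeled queries. Both choices are assumptions for illustration, not details confirmed by the paper:

```python
import torch
import torch.nn.functional as F


def build_local_model(aux_net, encoder, support_x, support_y):
    """Generate a task-specific (local) linear head from the labeled
    support set. aux_net's interface is a hypothetical stand-in for
    the paper's local-model generator."""
    feats = encoder(support_x)                 # shared feature extractor
    weights, bias = aux_net(feats, support_y)  # per-task head parameters
    return weights, bias


def finetune_on_queries(encoder, weights, bias, query_x, steps=10, lr=0.01):
    """Meta-test-time fine-tuning of the local head on the *unlabeled*
    query set. Entropy minimization is used as an illustrative
    unsupervised loss; the paper's actual objective may differ."""
    weights = weights.detach().clone().requires_grad_(True)
    bias = bias.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([weights, bias], lr=lr)
    feats = encoder(query_x).detach()  # features of the unlabeled queries
    for _ in range(steps):
        logits = F.linear(feats, weights, bias)
        log_probs = F.log_softmax(logits, dim=-1)
        # Low prediction entropy ~ confident, well-separated classes.
        entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
        opt.zero_grad()
        entropy.backward()
        opt.step()
    return weights, bias
```

Using the query inputs themselves at adaptation time is what lets the method squeeze extra signal out of data whose labels it never sees.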
In experiments, the team reports that its system beat 16 baselines at one-shot learning, in which a model learns a new object classification task from only a single labeled example. It improved performance by 11% to 16%, depending on the architecture of the underlying model.
That said, several baselines outperformed the model at five-shot learning, or learning with five labeled examples per new task. But the researchers say those baselines are complementary to their approach, and they believe combining them could yield lower error rates.
“In the past decade, deep-learning systems have proven remarkably successful at many artificial intelligence tasks, but their applications tend to be narrow,” wrote Alexa Shopping applied scientist Pablo Garcia in a blog post explaining the work. “Meta learning [can] turn machine learning systems into generalists … The idea is that it could then be adapted to new tasks with only a handful of labeled training examples, drastically reducing the need for labor-intensive data annotation.”
The paper’s publication follows a study by Google AI; the University of California, Berkeley; and the University of Toronto proposing a benchmark for training and evaluating large-scale, diverse, and “more realistic” meta-learning models. The resulting Meta-Dataset leverages data from 10 corpora that span a variety of visual concepts (natural and human-made) and vary in the specificity of the class definition.
Author: Kyle Wiggers.
Source: VentureBeat