'Small Data' Are Also Crucial for Machine Learning

Tue, 19 Oct 2021 05:00:00 GMT
Scientific American - Technology

The most promising AI approach you’ve never heard of doesn’t need to go big

AI is not only about large data sets, and research in "small data" approaches has grown extensively over the past decade, with so-called transfer learning as an especially promising example.

Also known as "fine-tuning," transfer learning is helpful in settings where you have little data on the task of interest but abundant data on a related problem.

A research team working on German-language speech recognition showed that they could improve their results by starting with an English-language speech model trained on a larger data set before using transfer learning to adjust that model for a smaller data set of German-language audio.
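
The same idea can be illustrated with a toy example. The sketch below is not the speech team's method: it swaps in a one-variable linear model, and every name and number in it (`train`, `make_data`, the slopes, the sample sizes) is an assumption made for illustration. It pretrains on abundant data from a related task, then fine-tunes briefly on five examples from the task of interest, and compares that with training on the five examples from scratch.

```python
import random

def train(weights, data, steps, lr=0.1):
    """Gradient descent on mean squared error for the model y = w*x + b."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(weights, data):
    """Mean squared error of the model on a data set."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def make_data(slope, intercept, n, rng):
    """Hypothetical task: noisy points on a line, x drawn from [-1, 1]."""
    return [(x, slope * x + intercept + rng.gauss(0, 0.05))
            for x in (rng.uniform(-1, 1) for _ in range(n))]

rng = random.Random(0)
source = make_data(2.0, 1.0, 1000, rng)       # abundant data, related task
target_train = make_data(2.2, 1.3, 5, rng)    # scarce data, task of interest
target_test = make_data(2.2, 1.3, 200, rng)   # held-out data for evaluation

# Step 1: pretrain on the large, related data set.
pretrained = train((0.0, 0.0), source, steps=500)

# Step 2: fine-tune the pretrained weights on the small target set.
fine_tuned = train(pretrained, target_train, steps=10)

# Baseline: same small data and same training budget, but no pretraining.
from_scratch = train((0.0, 0.0), target_train, steps=10)

print("fine-tuned test error:  ", mse(fine_tuned, target_test))
print("from-scratch test error:", mse(from_scratch, target_test))
```

In this setup the fine-tuned model starts close to a good solution, so a few updates on five examples are enough, while the from-scratch model has to learn everything from those same five examples. The gap depends on how closely the two tasks are related, which is an assumption built into the toy data here.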

In a new report for Georgetown University's Center for Security and Emerging Technology, we examined current and projected progress in scientific research across "small data" approaches, grouped into five rough categories: transfer learning, data labeling, artificial data generation, Bayesian methods and reinforcement learning.

Using a three-year growth forecast model, our analysis estimates that research on transfer learning methods will grow the fastest through 2023 among the small data categories we considered.

Small data approaches such as transfer learning offer numerous advantages over more data-intensive methods.

Because transfer learning models work by transferring knowledge from one task to another, they can improve generalization on the new task even when only limited data are available.

AI experts such as Andrew Ng have emphasized the significance of transfer learning and have even stated that the approach will be the next driver of machine learning success in industry.

Although many machine learning experts and data scientists are likely familiar with transfer learning by now, awareness of such techniques does not seem to have reached the broader community of policy makers and business leaders who make important decisions about AI funding and adoption.

By acknowledging the success of small data techniques like transfer learning, and by allocating resources to support their widespread use, we can help overcome some of the pervasive misconceptions regarding the role of data in AI and foster innovation in new directions.
