Deep learning is notoriously energy-intensive, and that limits where it can be applied. But what if these models could be run with higher energy efficiency? That is a question many researchers have asked, and a team from IBM may have found an answer.
New research being presented this week at NeurIPS (Neural Information Processing Systems, the biggest annual AI research conference) showcases a process that could soon reduce the number of bits needed to represent data in deep learning from 16 down to four with little loss of accuracy.
“In combination with previously proposed solutions for 4-bit quantization of weight and activation tensors, 4-bit training shows a non-significant loss in accuracy across application domains while enabling significant hardware acceleration (>7× over state of the art FP16 systems),” write the researchers in their abstract.
The IBM researchers ran experiments using their novel 4-bit training on a variety of deep-learning models in areas such as computer vision, speech, and natural language processing. They found only a small loss of accuracy in the models’ performance, while the process was more than seven times faster and seven times more energy efficient.
This innovation could therefore cut the energy cost of training deep learning models by a factor of more than seven and allow AI models to be trained even on devices as small as smartphones. That could significantly improve privacy, since all data could stay on local devices.
As exciting as this is, we are still a long way from 4-bit learning as the paper only simulates this type of approach. Bringing 4-bit learning to reality would require 4-bit hardware, hardware that does not yet exist.
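To give a rough feel for what simulating 4-bit precision on ordinary hardware can look like, here is a minimal Python sketch. It is not the method from the IBM paper: it assumes plain uniform rounding to 16 evenly spaced levels, and the function name `fake_quantize` and the sample weights are made up for illustration.

```python
def fake_quantize(values, bits=4):
    """Round each value to the nearest of 2**bits evenly spaced levels.

    Assumption: simple uniform quantization, chosen for illustration only.
    The result is returned as ordinary Python floats, so the 4-bit precision
    is simulated: nothing is actually stored or multiplied in 4 bits.
    """
    levels = 2 ** bits                  # 16 representable values for 4 bits
    lo, hi = min(values), max(values)
    if hi == lo:                        # constant input: nothing to round
        return list(values)
    step = (hi - lo) / (levels - 1)     # spacing between neighbouring levels
    return [lo + round((v - lo) / step) * step for v in values]


# Hypothetical weights, used only to show the rounding effect.
weights = [-1.0, -0.42, 0.0, 0.13, 0.77, 1.0]
quantized = fake_quantize(weights, bits=4)
max_error = max(abs(w - q) for w, q in zip(weights, quantized))

print("original: ", weights)
print("quantized:", quantized)
print("largest rounding error:", max_error)
```

Because the rounded values are still held in regular floating-point numbers, a simulation like this can show how much accuracy is lost, but it cannot deliver the speed or energy savings that real 4-bit hardware would.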
It may, however, soon be here. Kailash Gopalakrishnan, an IBM fellow and senior manager who led the new research, told MIT Technology Review that he predicts he will have engineered 4-bit hardware in three to four years. Now that’s something to get excited about!