In a study published this week in the journal Nature Communications, researchers at IBM’s lab in Zurich, Switzerland, claim to have developed a technique that achieves both energy efficiency and high accuracy on machine learning workloads using phase-change memory. By exploiting in-memory computing methods using resistance-based storage devices, their approach merges the units used to store and process data, in the process significantly cutting down on active power consumption.
Many existing AI inferencing setups physically split the memory and processing units, causing AI models to be stored in off-chip memory. This adds computational overhead because data must be shuffled between the units, a process that slows down processing and contributes to electrical usage. IBM’s technique ostensibly solves those problems with phase-change memory, a form of non-volatile memory that’s faster than the commonly used flash memory technology. The work, if proven scalable, could pave the way for powerful hardware that runs AI in drones, robots, mobile devices, and other compute-constrained devices.
As the IBM team explains, the challenge with phase-change memory devices is that they tend to introduce computational inaccuracy. That’s because they’re analog in nature; their precision is limited due to variability as well as read and write conductance noise.
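To make the accuracy problem concrete, here is a minimal Python sketch of a matrix-vector product computed with weights that carry random error, as they would when stored in imprecise analog devices. The additive Gaussian noise model, the function name `noisy_matvec`, and the `noise_std` parameter (expressed relative to the largest weight) are assumptions made for illustration; the article does not describe the device noise model the researchers used.

```python
import random

def noisy_matvec(weights, x, noise_std, rng):
    """Multiply vector x by a weight matrix whose entries are perturbed.

    Each weight receives independent Gaussian noise, standing in for the
    variability and read/write noise of an analog memory device.
    noise_std is relative to the largest absolute weight (an assumption
    of this sketch, not a figure from the study).
    """
    max_abs = max(abs(w) for row in weights for w in row)
    scale = noise_std * max_abs
    return [
        sum((w + rng.gauss(0.0, scale)) * xi for w, xi in zip(row, x))
        for row in weights
    ]

rng = random.Random(0)
weights = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]

# Reference result with perfectly precise weights
exact = [sum(w * xi for w, xi in zip(row, x)) for row in weights]

for noise_std in (0.0, 0.02, 0.1):
    approx = noisy_matvec(weights, x, noise_std, rng)
    max_error = max(abs(a - e) for a, e in zip(approx, exact))
    print(f"noise_std={noise_std:.2f}  max abs error={max_error:.4f}")
```

With `noise_std` set to zero the result matches the exact product; any nonzero value perturbs every output, and those errors compound as they pass through the layers of a deep network.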
The solution the study proposes entails injecting additional noise during the training of AI models in software to improve the models’ resilience. The results suggest it’s successful: training a ResNet model with noise achieved accuracy of 93.7% on the popular CIFAR-10 data set and top-1 accuracy on ImageNet of 71.6% after mapping the trained weights (i.e., parameters that transform input data) to phase-change memory components. Moreover, after mapping the weights of a particular model onto 723,444 phase-change memory devices in a prototype chip, the accuracy stayed above 92.6% over the course of a single day. The researchers claim it’s a record.
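The general idea of noise injection during training can be sketched in a few lines of Python. In this toy example, a one-parameter linear model is fit by stochastic gradient descent; each forward pass uses a randomly perturbed copy of the weight, while updates are applied to the clean weight kept in software. The model, the Gaussian noise, and every name and hyperparameter here (`train_with_weight_noise`, `noise_std`, `lr`, `epochs`) are illustrative assumptions, not the researchers’ actual training procedure, which the article does not detail.

```python
import random

def train_with_weight_noise(data, noise_std, epochs=200, lr=0.05, seed=0):
    """Fit y = w * x + b by SGD, perturbing w on every forward pass."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # clean parameters kept in software
    for _ in range(epochs):
        for x, y in data:
            # Forward pass sees a noisy copy of the weight, mimicking
            # what an imprecise analog device would later provide.
            w_noisy = w + rng.gauss(0.0, noise_std)
            err = (w_noisy * x + b) - y
            # Gradient of the squared error, applied to the clean parameters.
            w -= lr * err * x
            b -= lr * err
    return w, b

# Toy data drawn from y = 2x + 1 on the interval [-1, 1]
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]

w, b = train_with_weight_noise(data, noise_std=0.05)
print(f"learned w={w:.3f}, b={b:.3f}")
```

A deep network trained this way is pushed toward weights whose predictions hold up under small perturbations, which is the property the trained model needs once its weights are mapped onto noisy hardware.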
In an attempt to further improve accuracy retention over time, the coauthors of the study also developed a compensation technique that periodically corrects the activation functions (equations that determine the model’s output) during inference. This led to an improvement in accuracy to 93.5% on hardware, they say.
In parallel, the team experimented with training machine learning models using analog phase-change memory components. With a mixed-precision architecture, they report that they managed to attain “software-equivalent” accuracies on several types of small-scale models, including multilayer perceptrons, convolutional neural networks, long short-term memory networks, and generative adversarial networks. The training experiments are detailed in full in a study recently published in the journal Frontiers in Neuroscience.
IBM’s latest work in the domain follows the introduction of the company’s phase-change memory chip for AI training. While that chip is still in the research stage, company researchers demonstrated that the system could store weight data as electrical charges, performing 100 times more calculations per square millimeter than a graphics card while using 280 times less power.
“In an era transitioning more and more towards AI-based technologies, including internet-of-things battery-powered devices and autonomous vehicles, such technologies would highly benefit from fast, low-powered, and reliably accurate DNN inference engines,” IBM said in a statement. “The strategies developed in our studies show great potential towards realizing accurate AI hardware-accelerator architectures to support DNN training and inferencing in an energy-efficient manner.”