
AI boost for image compression

Researchers in Switzerland have used a machine learning digital twin to compress image data with greater accuracy than learning-free computational methods.
The AI image compression technique developed at EPFL has applications for retinal implants and other medical electronics.
A major challenge in developing better neural prostheses is sensory encoding: transforming information captured from the environment by sensors into neural signals that the nervous system can interpret. Because the number of electrodes in a prosthesis is limited, this environmental input must be reduced in some way, while still preserving the quality of the data transmitted to the brain.
Demetri Psaltis in the Optics Lab at EPFL and Christophe Moser in the Laboratory of Applied Photonics Devices collaborated with Diego Ghezzi of the Hôpital ophtalmique Jules-Gonin – Fondation Asile des Aveugles (previously Medtronic Chair in Neuroengineering at EPFL) to apply machine learning to the problem of compressing image data with multiple dimensions, such as colour and contrast.
In this case, the compression goal was downsampling: reducing the number of pixels in an image so that it can be transmitted via a retinal prosthesis.
“Downsampling for retinal implants is currently done by pixel averaging, which is essentially what graphics software does when you want to reduce a file size. But at the end of the day, this is a mathematical process; there is no learning involved,” Ghezzi explains.
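For a concrete picture of that learning-free baseline, pixel averaging can be sketched in a few lines of Python. The image size, block size and electrode-grid resolution below are illustrative choices, not figures from the study.

```python
import numpy as np

def downsample_by_averaging(image: np.ndarray, block: int) -> np.ndarray:
    """Reduce resolution by averaging non-overlapping block x block patches.

    Each output pixel is the mean of the pixels it replaces - essentially
    what graphics software does when shrinking an image. Image dimensions
    are assumed to be divisible by `block`.
    """
    h, w = image.shape
    return image.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

# Illustrative sizes only: a 256 x 256 image reduced to a 32 x 32 grid,
# e.g. to match a limited number of stimulation electrodes.
high_res = np.random.rand(256, 256)
low_res = downsample_by_averaging(high_res, block=8)
print(low_res.shape)  # (32, 32)
```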
“We found that if we applied a learning-based approach, we got improved results in terms of optimized sensory encoding. But more surprising was that when we used an unconstrained neural network, it learned to mimic aspects of retinal processing on its own.”
The researchers' AI image compression technique, known as an actor-model framework, is well suited to finding a sweet spot for image contrast.
In the actor-model AI framework, two neural networks work in a complementary fashion. The model portion, or forward model, acts as a digital twin of the retina: it is first trained to receive a high-resolution image and output a binary neural code that is as similar as possible to the neural code generated by a biological retina.
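A minimal sketch of how such a forward model could be trained is given below, in PyTorch. The network architecture, the size of the binary code, and the assumption that paired images and recorded retinal responses are available as a `dataset` are all illustrative; none of these details are taken from the EPFL paper.

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Digital twin of the retina: maps a high-resolution image to logits
    for each unit of a binary neural code."""
    def __init__(self, code_size: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(code_size),  # input size inferred on first forward pass
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.net(image)

forward_model = ForwardModel()
optimizer = torch.optim.Adam(forward_model.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()  # predicted vs. recorded binary code

# `dataset` is assumed to yield (image, recorded_code) batches, where `image`
# is shaped (batch, 1, H, W) and `recorded_code` is a float {0, 1} tensor
# giving the biological retina's response to that image.
for image, recorded_code in dataset:
    optimizer.zero_grad()
    loss = criterion(forward_model(image), recorded_code)
    loss.backward()
    optimizer.step()
```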
The actor network is then trained to downsample a high-resolution image so that, when the downsampled image is passed through the forward model, it elicits a neural code as close as possible to the one produced by the biological retina in response to the original image.
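Continuing the sketch above (and reusing its `forward_model`, `criterion` and `dataset`), the actor's training phase might look roughly like this. The actor architecture, the 32 x 32 output resolution and the way the downsampled image is resized before being fed through the frozen twin are assumptions made for illustration, not the paper's implementation.

```python
class Actor(nn.Module):
    """Learns the downsampling: maps a high-resolution image to a small,
    electrode-resolution image (32 x 32 here, purely illustrative)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, stride=2, padding=1),
            nn.AdaptiveAvgPool2d((32, 32)),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.net(image)

actor = Actor()
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=1e-3)

# Freeze the digital twin: only the actor is updated in this phase.
for p in forward_model.parameters():
    p.requires_grad_(False)

for image, recorded_code in dataset:
    actor_optimizer.zero_grad()
    small = actor(image)
    # Resize the downsampled image back to the twin's expected input size
    # (an assumption about how the two networks are coupled).
    viewed = nn.functional.interpolate(small, size=image.shape[-2:], mode="nearest")
    # Push the code elicited by the downsampled image toward the code the
    # biological retina produced for the original, high-resolution image.
    loss = criterion(forward_model(viewed), recorded_code)
    loss.backward()
    actor_optimizer.step()
```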
“The obvious next step is to see how we can compress an image more broadly, beyond pixel reduction, so that the framework can play with multiple visual dimensions at the same time. Another possibility is to transpose this retinal model to outputs from other regions of the brain. It could even potentially be linked to other devices, like auditory or limb prostheses,” said Ghezzi.
doi.org/10.1038/s41467-024-45105-5
