
For millions of people, chatbots powered by large language models (LLMs) are now a key feature of everyday life. These AI systems are growing at a rapid pace, but scaling them up is becoming increasingly costly and resource-intensive.
In a new preprint on the arXiv server, a team led by Borja Aizpurua at Multiverse Computing in San Sebastián, Spain, reports a way to improve the performance of LLMs using quantum computing. Their approach could offer a smarter alternative to simply throwing more hardware at the problem.
The parameter problem
LLMs like those powering ChatGPT and Claude work by learning enormous numbers of adjustable parameters that together determine how the model processes and generates text. The more parameters a model has, the better it tends to perform.
But each parameter requires physical memory to store, and as these models grow larger, the memory demands grow with them in a way that becomes increasingly difficult and expensive to manage. GPT-5.5, for instance, is estimated to contain somewhere between two and five trillion parameters.
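To give a rough sense of scale, the arithmetic behind that memory demand can be sketched in a few lines of Python. The function name and the choice of 2 bytes per parameter (16-bit weights) are assumptions made for illustration; they are not figures from the preprint, and real deployments may use other precisions.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to store the weights, in gigabytes (10**9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an illustrative
    assumption, not a figure reported in the article.
    """
    return num_params * bytes_per_param / 1e9


# The two ends of the parameter estimate quoted above.
low_estimate_gb = weight_memory_gb(2e12)   # two trillion parameters
high_estimate_gb = weight_memory_gb(5e12)  # five trillion parameters

print(f"2 trillion parameters: about {low_estimate_gb:,.0f} GB of weights")
print(f"5 trillion parameters: about {high_estimate_gb:,.0f} GB of weights")
```

Under that assumption, the weights alone would occupy several terabytes, before counting the memory needed to actually run the model.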
Turning to quantum circuits
To tackle these constraints, the Multiverse Computing team turned to quantum computing. Rather than adding vast numbers of new classical parameters, they inserted small quantum circuit blocks into the inner workings of a pre-trained LLM.
Because these quantum blocks can encode complex mathematical relationships in a highly compact form, they can achieve what would otherwise require many more conventional parameters. The resulting system is a hybrid: the original LLM runs on a standard computer, while the quantum components are executed on IBM’s 156-qubit superconducting quantum processor.
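The preprint's title refers to "Cayley unitary adapters". The article does not describe how these are built, so the following is only a generic illustration of the Cayley transform, a standard way of turning a small set of free numbers into a unitary matrix (the kind of operation a quantum circuit applies). It runs entirely on a classical computer with NumPy, and the function name and matrix size are hypothetical; it should not be read as the team's actual construction.

```python
import numpy as np


def cayley_unitary(m: np.ndarray) -> np.ndarray:
    """Build a unitary matrix from an arbitrary square matrix m.

    Generic Cayley transform, shown for illustration only:
    A = m - m^H is skew-Hermitian, and U = (I + A)^{-1} (I - A) is unitary.
    """
    a = m - m.conj().T                 # skew-Hermitian: a^H == -a
    identity = np.eye(a.shape[0])
    # I + A is always invertible because A has purely imaginary eigenvalues.
    return np.linalg.solve(identity + a, identity - a)


rng = np.random.default_rng(0)
n = 4  # hypothetical block size, chosen only to keep the example small
m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
u = cayley_unitary(m)

# A unitary matrix satisfies U^H U = I.
print(np.allclose(u.conj().T @ u, np.eye(n)))
```

The appeal of this kind of construction is that any choice of the free numbers yields a valid unitary, so they can be adjusted during training without ever leaving the set of operations a quantum device can carry out.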
Boosting performance
When Aizpurua’s team applied this approach to Llama 3.1 8B, an eight-billion-parameter model developed by Meta, they achieved a 1.4% reduction in perplexity (a key measure of how reliably a model predicts the next word in a sequence, where lower is better), while adding just 6,000 extra parameters. For context, that represents an increase in the model's parameter count of less than one ten-thousandth of a percent.
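Both figures can be made concrete with a short Python sketch. Perplexity is computed here using its standard definition, the exponential of the average negative log-probability the model assigns to each correct token. The token probabilities are made-up toy values, not data from the preprint; only the 6,000 and eight-billion figures come from the article.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(average negative log-probability per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy probabilities a model might assign to the correct next tokens
# (invented for illustration only).
toy_probs = (0.5, 0.25, 0.125, 0.5)
toy_perplexity = perplexity([math.log(p) for p in toy_probs])
print(f"Toy perplexity: {toy_perplexity:.3f}")

# A 1.4% reduction means the new value is 98.6% of the old one.
reduced_perplexity = toy_perplexity * (1 - 0.014)
print(f"After a 1.4% reduction: {reduced_perplexity:.3f}")

# Relative growth in parameter count: 6,000 added to eight billion.
extra_params = 6_000
base_params = 8_000_000_000
increase_percent = extra_params / base_params * 100
print(f"Parameter increase: {increase_percent:.6f}%")
```

The last line works out to 0.000075%, which is indeed below one ten-thousandth of a percent (0.0001%).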
The team also tested their platform on SmolLM2, a smaller 135-million-parameter model chosen because it was more tractable to study systematically. Here, they found that performance improved consistently as the size of the quantum components increased, and that the quantum-enhanced model was able to correctly answer questions that two purely classical versions of the same model got wrong.
Preparing for future processors
For now, the researchers acknowledge that the performance gains are modest, and are limited by the capabilities of current quantum hardware. But in demonstrating that quantum enhancement can work at all on a real, widely used model, their results are already promising.
As quantum processors become more powerful and reliable, the team believes the improvements will scale accordingly—possibly opening a fundamentally new path for developing more capable AI without the runaway infrastructure costs that threaten to define the field’s future.
Written by Sam Jarman, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan.
Publication details
Borja Aizpurua et al, Quantum-enhanced Large Language Models on Quantum Hardware via Cayley Unitary Adapters, arXiv (2026). DOI: 10.48550/arxiv.2605.05914
Journal information:
arXiv
© 2026 Science X Network
Citation:
Quantum circuits help AI overcome memory limitations with minimal new parameters (2026, June 7)
retrieved 7 June 2026
from https://phys.org/news/2026-06-quantum-circuits-ai-memory-limitations.html




