Large machine learning models are revolutionary artificial-intelligence technologies whose bottlenecks include the enormous computational cost, power, and time consumed in both pre-training and fine-tuning. Based on quantum Carleman linearization and shadow tomography (no QRAM is required), we design the first quantum algorithm for training classical sparse neural networks in an end-to-end setting [1]. Our quantum algorithm provides provably efficient solutions for generic (stochastic) gradient descent with running time T^2 polylog(n), where n is the size of the model and T is the number of training iterations, as long as the model is both sufficiently dissipative and sparse. We benchmark instances of training ResNet models with 7 to 103 million parameters, with sparse pruning, on the CIFAR-100 dataset, and we find that a quantum enhancement is possible in the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows that fault-tolerant quantum computing could contribute to the scalability and sustainability of most state-of-the-art, large-scale machine learning models.
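As an illustrative, non-authoritative sketch of the claimed scaling, the toy Python snippet below compares a classical per-run gradient-descent cost of roughly T * n with a quantum cost of roughly T^2 * polylog(n), and applies simple magnitude-based pruning to mimic the sparse parameter download and re-upload idea. The cost models, constant prefactor, polylog power, and pruning threshold are assumptions chosen for illustration only and are not taken from the paper.

```python
import numpy as np

def classical_cost(T, n):
    # Toy cost model: classical (stochastic) gradient descent touches all
    # n parameters once per iteration -> roughly T * n operations.
    return T * n

def quantum_cost(T, n, c=1.0):
    # Toy cost model for the quoted T^2 * polylog(n) scaling; the constant
    # prefactor c and the squared-log choice are illustrative assumptions.
    return c * T**2 * np.log2(n) ** 2

def magnitude_prune(params, keep_fraction=0.1):
    # Keep only the largest-magnitude weights (simple magnitude pruning),
    # mimicking a sparse parameter "download and re-upload" step.
    k = max(1, int(keep_fraction * params.size))
    threshold = np.partition(np.abs(params), -k)[-k]
    mask = np.abs(params) >= threshold
    return params * mask, mask

if __name__ == "__main__":
    n = 100_000_000                      # ~10^8 parameters, ResNet scale
    rng = np.random.default_rng(0)
    weights = rng.normal(size=1000)      # small stand-in parameter vector
    pruned, mask = magnitude_prune(weights, keep_fraction=0.1)
    print(f"kept {mask.sum()} of {mask.size} weights after pruning")

    # In this toy model, the quadratic-in-T quantum cost undercuts the
    # linear-in-n classical cost only while T is small, loosely mirroring
    # the observation that the enhancement appears early in training.
    for T in (10, 100, 1000):
        print(T, classical_cost(T, n), quantum_cost(T, n))
```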
Reference:
[1] Towards provably efficient quantum algorithms for large-scale machine learning models, arXiv:2303.03428.