On Fast Computing of Neural Networks Using Central Processing Units

A. V. Trusov^(a,b,*), E. E. Limonova^(a,b,**), D. P. Nikolaev^(b,c,***), and V. V. Arlazarov^(a,b,****)

^a Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, 119333 Russian Federation

^b Smart Engines Service LLC, Moscow, 121205 Russian Federation

^c Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051 Russian Federation

*e-mail: trusov.av@smartengines.com
**e-mail: limonova@smartengines.com
***e-mail: dimonstr@iitp.ru
****e-mail: vva@smartengines.com

Received October 20, 2022

Abstract—This work is devoted to methods for creating fast and accurate neural network algorithms for central processing units, proposed by researchers of V.L. Arlazarov’s scientific school. It outlines general principles and approaches to improving computational efficiency and discusses specific examples: tensor convolution decompositions that simplify convolutional neural networks; a bounded rational nonlinear activation function, which is computed faster than exponential activation functions; and the p-im2col convolution algorithm, which makes it possible to balance computational efficiency against RAM consumption. Particular attention is paid to quantized (8- and 4-bit integer) neural networks: their training, their implementation, and the limitations imposed by some CPU architectures, such as Elbrus.

Keywords: artificial neural networks, high-performance computing, scientific school

DOI: 10.1134/S105466182304048X
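
The p-im2col algorithm is only named in the abstract and is detailed later in the paper. As a rough sketch of the memory/speed trade-off it targets, the code below assumes that the im2col buffer is built and consumed in p horizontal strips of the output, so that only about 1/p of the buffer is resident in memory at any time; the function name, the strip-wise partitioning scheme, and the stride-1 “valid” convolution are illustrative assumptions, not the authors’ exact algorithm.

```python
import numpy as np

def conv2d_partial_im2col(x, w, p):
    """Memory-bounded im2col convolution (illustrative sketch, not the
    paper's p-im2col). x: input (C, H, W); w: kernels (K, C, kh, kw).
    The output rows are processed in p horizontal strips, so only
    roughly 1/p of the im2col matrix is materialized at a time."""
    C, H, W = x.shape
    K, _, kh, kw = w.shape
    Ho, Wo = H - kh + 1, W - kw + 1           # 'valid' convolution, stride 1
    wm = w.reshape(K, C * kh * kw)            # kernels as a K x (C*kh*kw) matrix
    out = np.empty((K, Ho, Wo), dtype=x.dtype)
    strip = (Ho + p - 1) // p                 # output rows per strip
    for r0 in range(0, Ho, strip):
        r1 = min(r0 + strip, Ho)
        # im2col block covering output rows r0..r1 only
        cols = np.empty((C * kh * kw, (r1 - r0) * Wo), dtype=x.dtype)
        idx = 0
        for c in range(C):                    # row order matches wm's flattening
            for i in range(kh):
                for j in range(kw):
                    cols[idx] = x[c, r0 + i:r1 + i, j:j + Wo].reshape(-1)
                    idx += 1
        # one matrix product per strip, written straight into the output
        out[:, r0:r1, :] = (wm @ cols).reshape(K, r1 - r0, Wo)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 16, 16)).astype(np.float32)
    w = rng.standard_normal((8, 3, 3, 3)).astype(np.float32)
    full = conv2d_partial_im2col(x, w, p=1)   # whole buffer at once
    part = conv2d_partial_im2col(x, w, p=4)   # four strips, ~1/4 the buffer
    assert np.allclose(full, part, atol=1e-5)
```

With p = 1 this degenerates into the classical im2col-plus-GEMM scheme (fastest, largest buffer); larger p shrinks the resident buffer at the cost of more, smaller matrix products, which is the trade-off the abstract refers to.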