On Fast Computing of Neural Networks Using Central Processing Units

A. V. Trusov^(a,b,*), E. E. Limonova^(a,b,**), D. P. Nikolaev^(b,c,***), and V. V. Arlazarov^(a,b,****)

^a Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, 119333 Russian Federation

^b Smart Engines Service LLC, Moscow, 121205 Russian Federation

^c Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051 Russian Federation

*e-mail: trusov.av@smartengines.com
**e-mail: limonova@smartengines.com
***e-mail: dimonstr@iitp.ru
****e-mail: vva@smartengines.com

Received October 20, 2022

Abstract—This work is devoted to methods for creating fast and accurate neural network algorithms for central processing units, proposed by researchers of V.L. Arlazarov’s scientific school. It outlines general principles and approaches to improving computational efficiency and discusses specific examples: tensor convolution decompositions that simplify convolutional neural networks; a bounded rational nonlinear activation function, which is computed faster than exponential activation functions; and the p-im2col convolution algorithm, which makes it possible to balance computational efficiency against RAM consumption. Particular attention is paid to quantized (8- and 4-bit integer) neural networks: their training, their implementation, and the limitations imposed by some CPU architectures, such as Elbrus.

Keywords: artificial neural networks, high-performance computing, scientific school

DOI: 10.1134/S105466182304048X
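
The p-im2col algorithm is only named in the abstract and is detailed later in the paper. As a rough sketch of the memory/speed trade-off it targets, the code below assumes that the im2col buffer is built and consumed in p horizontal strips of the output, so that only about 1/p of the buffer is resident in memory at any time; the function name, the strip-wise partitioning scheme, and the stride-1 “valid” convolution are illustrative assumptions, not the authors’ exact algorithm.

```python
import numpy as np

def conv2d_partial_im2col(x, w, p):
    """Memory-bounded im2col convolution (illustrative sketch, not the
    paper's p-im2col). x: input (C, H, W); w: kernels (K, C, kh, kw).
    The output rows are processed in p horizontal strips, so only
    roughly 1/p of the im2col matrix is materialized at a time."""
    C, H, W = x.shape
    K, _, kh, kw = w.shape
    Ho, Wo = H - kh + 1, W - kw + 1           # 'valid' convolution, stride 1
    wm = w.reshape(K, C * kh * kw)            # kernels as a K x (C*kh*kw) matrix
    out = np.empty((K, Ho, Wo), dtype=x.dtype)
    strip = (Ho + p - 1) // p                 # output rows per strip
    for r0 in range(0, Ho, strip):
        r1 = min(r0 + strip, Ho)
        # im2col block covering output rows r0..r1 only
        cols = np.empty((C * kh * kw, (r1 - r0) * Wo), dtype=x.dtype)
        idx = 0
        for c in range(C):                    # row order matches wm's flattening
            for i in range(kh):
                for j in range(kw):
                    cols[idx] = x[c, r0 + i:r1 + i, j:j + Wo].reshape(-1)
                    idx += 1
        # one matrix product per strip, written straight into the output
        out[:, r0:r1, :] = (wm @ cols).reshape(K, r1 - r0, Wo)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 16, 16)).astype(np.float32)
    w = rng.standard_normal((8, 3, 3, 3)).astype(np.float32)
    full = conv2d_partial_im2col(x, w, p=1)   # whole buffer at once
    part = conv2d_partial_im2col(x, w, p=4)   # four strips, ~1/4 the buffer
    assert np.allclose(full, part, atol=1e-5)
```

With p = 1 this degenerates into the classical im2col-plus-GEMM scheme (fastest, largest buffer); larger p shrinks the resident buffer at the cost of more, smaller matrix products, which is the trade-off the abstract refers to.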