Parallel evaluation of arithmetic circuits

Nathalie Revol 1 Jean-Louis Roch 2
1 ARENAIRE - Computer arithmetic
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
2 APACHE - Parallel algorithms and load sharing
ID-IMAG - Informatique et Distribution, Inria Grenoble - Rhône-Alpes, UJF - Université Joseph Fourier - Grenoble 1
Abstract : In this paper, a generic algorithm designed for the parallel evaluation of arithmetic circuits is given. This algorithm can be used in the domain of VLSI design, in order to get tight upper bounds on the computing time of a circuit. It can also be used in automatic parallelization of numerical programs, as a guide for the detection of some predefinite schemes such as dot-products or reductions. More generally, the (theoretical) algorithm presented in Section 2 evaluates very quickly arithmetic straight-line programs, and its evaluation time serves as a good upper bound. This algorithm generalizes Miller, Ramachandran and Kaltofen's algorithm (1988) in the sense it deals with a great variety of algebraic structures: semi-rings, rings or lattices. Our contribution resides on the one hand in a new bound for the evaluation of circuits over lattices, which improves previous results (Miller and Teng, 1987), and on the other hand in the unified formulation for the evaluation algorithm. This algorithm runs in O(min(log n + log d)log n,(ha + log n)log n)) parallel time, d being the “algebraic degree” (in an extended sense) of the circuit and ha the maximal number of alternances of + and * on a path of the circuit if the + and * operations define a lattice, with M(n) processors, where M(n) is the number of processors necessary for the multiplication of two n × n matrices in the structure in O(log n) parallel time. After presenting this algorithm, its efficiency is shown on particular cases: taking as input a simple and sequential algorithm, it can be used as a “compiler” to produce a sorting circuit as fast as Cole's circuit, with logarithmic depth, or an adder equivalent to Brent and Kung's adder in terms of size and depth. These academic examples confirm the relevance of the algorithm presented here in the area of conception of fast VLSI arithmetic operators.
Type de document :
Article dans une revue
Journal of Theoretical Computer Science (TCS), Elsevier, 1996, A, 162 (1), pp.133-150. 〈10.1016/0304-3975(95)00252-9〉
Liste complète des métadonnées
Contributeur : Nathalie Revol <>
Soumis le : jeudi 9 décembre 2010 - 13:45:14
Dernière modification le : mardi 16 janvier 2018 - 15:50:54




Nathalie Revol, Jean-Louis Roch. Parallel evaluation of arithmetic circuits. Journal of Theoretical Computer Science (TCS), Elsevier, 1996, A, 162 (1), pp.133-150. 〈10.1016/0304-3975(95)00252-9〉. 〈inria-00545017〉



Consultations de la notice