Conference paper, 2022

Mixing Low-Precision Formats in Multiply-Accumulate Units for DNN Training

Abstract

The most compute-intensive stage of deep neural network (DNN) training is matrix multiplication, where the multiply-accumulate (MAC) operator is key. To reduce training costs, we consider using low-precision arithmetic for MAC operations. While low-precision training has been investigated in prior work, the focus has been on reducing the number of bits in weights or activations without compromising accuracy. In contrast, the focus in this paper is on implementation details beyond weight or activation width that affect area and accuracy. In particular, we investigate the impact of fixed- versus floating-point representations, multiplier rounding, and floating-point exceptional value support. Results suggest that (1) low-precision floating-point is more area-effective than fixed-point for multiplication, (2) standard IEEE-754 rules for subnormals, NaNs, and intermediate rounding provide little to no accuracy benefit but contribute significantly to area, (3) low-precision MACs require an adaptive loss-scaling step during training to compensate for limited representation range, and (4) fixed-point is more area-effective for accumulation, but the cost of format conversion and downstream logic can swamp the savings. Finally, we note that future work should investigate accumulation structures beyond the MAC level to achieve further gains.
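
Point (3) of the abstract mentions an adaptive loss-scaling step to compensate for the limited range of low-precision formats. The following Python sketch is an illustration only, not the authors' hardware or training code: it emulates a hypothetical small floating-point operand format (the EXP_BITS/MAN_BITS parameters and helper names are assumptions), performs the multiplies in that format while accumulating in wider precision, and applies a simple adaptive loss-scaling rule that halves the scale on overflow.

import math

EXP_BITS, MAN_BITS = 4, 3                       # hypothetical small operand format
BIAS = 2 ** (EXP_BITS - 1) - 1
MAX_VAL = (2.0 - 2.0 ** -MAN_BITS) * 2.0 ** BIAS

def quantize(x):
    """Round x to the small float format; flush subnormals to zero, saturate on overflow."""
    if x == 0.0 or not math.isfinite(x):
        return x
    e = math.floor(math.log2(abs(x)))
    if e < 1 - BIAS:                            # below the normal range: flush to zero
        return 0.0
    step = 2.0 ** (e - MAN_BITS)
    q = round(x / step) * step                  # round-to-nearest on the significand
    return math.copysign(min(abs(q), MAX_VAL), x)

def mac(a_vec, b_vec):
    """Dot product: low-precision multiplies, wide (float64) accumulation."""
    return sum(quantize(a) * quantize(b) for a, b in zip(a_vec, b_vec))

def scale_gradients(grads, loss_scale):
    """One adaptive loss-scaling step: skip the update and halve the scale on overflow."""
    scaled = [g * loss_scale for g in grads]
    if any(abs(s) > MAX_VAL for s in scaled):
        return None, loss_scale / 2
    return [quantize(s) / loss_scale for s in scaled], loss_scale

print(mac([0.11, -0.42, 0.9], [1.5, 0.33, -0.07]))

The sketch only models numerical behavior; area and conversion-cost trade-offs such as those reported in points (1), (2), and (4) require the hardware-level evaluation described in the paper.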
Main file: fpt_2022.pdf (569.78 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03885471 , version 1 (05-12-2022)

License

Attribution (CC BY)

Identifiers

  • HAL Id: hal-03885471, version 1

Cite

Mariko Tatsumi, Silviu-Ioan Filip, Caroline White, Olivier Sentieys, Guy Lemieux. Mixing Low-Precision Formats in Multiply-Accumulate Units for DNN Training. FPT 2022 - IEEE International Conference on Field Programmable Technology, Dec 2022, Hong Kong, Hong Kong SAR China. pp.1-9. ⟨hal-03885471⟩