Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis

Abstract : The four-index integral transform is a fundamental and computationally demanding calculation used in many computational chemistry suites such as NWChem. It transforms a four-dimensional tensor from one basis to another. This transformation is most efficiently implemented as a sequence of four tensor contractions that each contract a four-dimensional tensor with a two-dimensional transformation matrix. Differing degrees of permutation symmetry in the intermediate and final tensors in the sequence of contractions cause intermediate tensors to be much larger than the final tensor and limit the number of electronic states in the modeled systems. Loop fusion, in conjunction with tiling, can be very effective in reducing the total space requirement, as well as data movement. However, the large number of possible choices for loop fusion and tiling, and data/computation distribution across a parallel system, make it challenging to develop an optimized parallel implementation for the four-index integral transform. We develop a novel approach to address this problem, using lower bounds modeling of data movement complexity. We establish relationships between available aggregate physical memory in a parallel computer system and ineffective fusion configurations, enabling their pruning and consequent identification of effective choices and a characterization of optimality criteria. This work has resulted in the development of a significantly improved implementation of the four-index transform that enables higher performance and the ability to model larger electronic systems than the current implementation in the NWChem quantum chemistry software suite.
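
The "sequence of four tensor contractions" mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the NWChem implementation or the authors' optimized code: the names `A`, `B`, `n`, `contract_one_index`, `transform_stepwise`, and `transform_direct` are hypothetical, the same n x n matrix is assumed for all four indices, and permutation symmetry, loop fusion, tiling, and parallel distribution are all left out. It only checks that contracting one index at a time (roughly 4·n^5 multiply-adds) gives the same result as the direct eight-index sum (roughly n^8).

```python
import random
from itertools import product


def contract_one_index(T, B, n, axis):
    """Contract one index of the 4-D tensor T with the n x n matrix B.

    T is a dict keyed by 4-tuples. The index at position `axis` is summed
    against the row index of B: out[.., a, ..] = sum_p T[.., p, ..] * B[p][a].
    """
    out = {}
    for idx in product(range(n), repeat=4):
        a = idx[axis]
        total = 0.0
        for p in range(n):
            src = idx[:axis] + (p,) + idx[axis + 1:]
            total += T[src] * B[p][a]
        out[idx] = total
    return out


def transform_stepwise(A, B, n):
    """Four successive contractions, one per tensor index."""
    T = A
    for axis in range(4):
        T = contract_one_index(T, B, n, axis)
    return T


def transform_direct(A, B, n):
    """Reference: sum over all four old indices at once (O(n^8))."""
    out = {}
    for a, b, c, d in product(range(n), repeat=4):
        total = 0.0
        for p, q, r, s in product(range(n), repeat=4):
            total += A[(p, q, r, s)] * B[p][a] * B[q][b] * B[r][c] * B[s][d]
        out[(a, b, c, d)] = total
    return out


# Small random example (hypothetical data, fixed seed for repeatability).
n = 3
rng = random.Random(0)
A = {idx: rng.uniform(-1.0, 1.0) for idx in product(range(n), repeat=4)}
B = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

stepwise = transform_stepwise(A, B, n)
direct = transform_direct(A, B, n)
max_diff = max(abs(stepwise[i] - direct[i]) for i in stepwise)
print("entries:", len(stepwise), "agree:", max_diff < 1e-9)
```

In this unfused form each step materializes a full n^4 intermediate, which is the storage cost the abstract says fusion and tiling are meant to reduce.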

Contributor : Fabrice Rastello
Submitted on : Tuesday, December 5, 2017 - 7:35:05 PM
Last modification on : Thursday, October 11, 2018 - 8:48:05 AM



Samyam Rajbhandari, Fabrice Rastello, Karol Kowalski, Sriram Krishnamoorthy, P. Sadayappan. Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis. PPoPP 2017 - 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Feb 2017, Austin, United States. pp.327 - 340, ⟨10.1145/3018743.3018771⟩. ⟨hal-01653823⟩


