Communication-Avoiding QR Decomposition for GPUs, 2011 IEEE International Parallel & Distributed Processing Symposium, pp.48-58, 2011. ,
DOI : 10.1109/IPDPS.2011.15
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures. Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par, pp.187-198, 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00384363
Communicationoptimal Parallel Algorithm for Strassen's Matrix Multiplication, Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures, pp.193-204, 2012. ,
Minimizing Communication in Numerical Linear Algebra, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.3, pp.866-901, 2011. ,
DOI : 10.1137/090769156
Matrix multiplication on heterogeneous platforms, IEEE Transactions on Parallel and Distributed Systems, vol.12, issue.10, pp.1033-1051, 2001. ,
DOI : 10.1109/71.963416
URL : https://hal.archives-ouvertes.fr/hal-00808288
Partitioning a Square into Rectangles: NP-Completeness and Approximation Algorithms, Algorithmica, vol.34, issue.3, pp.217-239, 2002. ,
DOI : 10.1007/s00453-002-0962-9
URL : https://hal.archives-ouvertes.fr/hal-00807407
Comparison of Static and Dynamic Resource Allocation Strategies for Matrix Multiplication, Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp.1-10, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01163936
A New Approximation Algorithm for Matrix Partitioning in Presence of Strongly Heterogeneous Processors, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016. ,
DOI : 10.1109/IPDPS.2016.32
URL : https://hal.archives-ouvertes.fr/hal-01216245
Towards Data Partitioning for Parallel Computing on Three Interconnected Clusters, Sixth International Symposium on Parallel and Distributed Computing (ISPDC'07), pp.39-39, 2007. ,
DOI : 10.1109/ISPDC.2007.56
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.569.4555
PaRSEC: Exploiting Heterogeneity to Enhance Scalability, Computing in Science & Engineering, vol.15, issue.6, pp.36-45, 2013. ,
DOI : 10.1109/MCSE.2013.98
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers: Design Issues and Performance, In: APCC in Physics Chemistry and Engineering Science, pp.95-106, 1995. ,
DOI : 10.1007/3-540-60902-4_12
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1318
Hierarchical Partitioning Algorithm for Ccientific Computing on Highly Heterogeneous CPU + GPU Clusters, Euro- Par 2012 Parallel Processing, pp.489-501, 2012. ,
DOI : 10.1007/978-3-642-32820-6_49
Optimal Data Partitioning Shape for Matrix Multiplication on Three Fully Connected Heterogeneous Processors, Euro-Par 2014: Parallel Processing Workshops, pp.201-214, 2014. ,
DOI : 10.1007/978-3-319-14325-5_18
Exact and approximation algorithms for a soft rectangle packing problem, Optimization, vol.63, issue.11, pp.1637-1663, 2014. ,
DOI : 10.1109/43.920707
Communication-avoiding Krylov Subspace Methods, 2010. ,
Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers, Journal of Parallel and Distributed Computing, vol.61, issue.4, pp.520-535, 2001. ,
DOI : 10.1006/jpdc.2000.1686
DDOps: dual-direction operations for load balancing on non-dedicated heterogeneous distributed systems, Cluster Computing, vol.16, issue.1, pp.503-528, 2014. ,
DOI : 10.1007/s10586-013-0294-3
An approximation algorithm for dissecting a rectangle into rectangles with specified areas, Discrete Applied Mathematics, vol.155, issue.4, pp.523-537, 2007. ,
DOI : 10.1016/j.dam.2006.08.005
URL : http://doi.org/10.1016/j.dam.2006.08.005
Hierarchical Task-Based Programming With StarSs, International Journal of High Performance Computing Applications, vol.23, issue.3, pp.284-299, 2009. ,
DOI : 10.1177/1094342009106195
URL : http://hdl.handle.net/2117/28379
On optimization of finite-difference time-domain (FDTD) computation on heterogeneous and GPU clusters, Journal of Parallel and Distributed Computing, vol.71, issue.4, pp.584-593, 2011. ,
DOI : 10.1016/j.jpdc.2010.10.011
Communication-optimal Parallel 2.5 D Matrix Multiplication and LU factorization Algorithms, Euro-Par 2011 Parallel Processing, pp.90-109, 2011. ,
Rectangles as sums of squares, Discrete Mathematics, vol.309, issue.9, pp.2913-2921, 2009. ,
DOI : 10.1016/j.disc.2008.07.028