K. Agrawal, J. T. Fineman, J. Krage, C. E. Leiserson, and S. Toledo, Cache-conscious scheduling of streaming applications, Proc. Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '12, p.236, 2012.

T. N. Bui and C. Jones, A heuristic for reducing fill-in in sparse matrix factorization, Proc. 6th SIAM Conf. Parallel Processing for Scientific Computing, SIAM, pp.445-452, 1993.

Ü. V. Çatalyürek and C. Aykanat, PaToH: A Multilevel Hypergraph Partitioning Tool.

T. F. Coleman and W. Xu, Parallelism in structured Newton computations, in Parallel Computing: Architectures, Algorithms and Applications, pp.295-302, 2007.

T. F. Coleman and W. Xu, Fast (structured) Newton computations, SIAM Journal on Scientific Computing, vol.31, pp.1175-1191, 2009.

T. F. Coleman and W. Xu, Automatic Differentiation in MATLAB using ADMAT with Applications, 2016.

J. Cong, Z. Li, and R. Bagrodia, Acyclic multi-way partitioning of Boolean networks, Proceedings of the 31st Annual Design Automation Conference, DAC '94, pp.670-675, 1994.

T. A. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Trans. Math. Softw., vol.38, p.25, 2011.

E. D. Dolan and J. J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming, vol.91, pp.201-213, 2002.

V. Elango, F. Rastello, L. Pouchet, J. Ramanujam, and P. Sadayappan, On characterizing the data access complexity of programs, SIGPLAN Not., vol.50, pp.567-580, 2015.

N. Fauzia, V. Elango, M. Ravishankar, J. Ramanujam, F. Rastello et al., Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential, ACM Trans. Archit. Code Optim., vol.10, p.29, 2013.

C. M. Fiduccia and R. M. Mattheyses, A linear-time heuristic for improving network partitions, pp.175-181, 1982.

M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, 1979.

B. Hendrickson and R. Leland, The Chaco user's guide, version 1.0, 1993.

J. Herrmann, J. Kho, B. Uçar, K. Kaya, and Ü. V. Çatalyürek, Acyclic partitioning of large directed acyclic graphs, Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.371-380, 2017.

G. Karypis and V. Kumar, MeTiS: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-Reducing Orderings of Sparse Matrices, Version 4.0, 1998.

B. W. Kernighan, Optimal sequential partitions of graphs, J. ACM, vol.18, pp.34-40, 1971.

B. W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs, The Bell System Technical Journal, vol.49, pp.291-307, 1970.

M. R. Kristensen, S. A. Lund, T. Blum, and J. Avery, Fusion of parallel array operations, Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, pp.71-85, 2016.

M. R. Kristensen, S. A. Lund, T. Blum, K. Skovhede, and B. Vinter, Bohrium: A virtual machine approach to portable parallelism, Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, IPDPSW '14, Washington, pp.312-321, 2014.

O. Moreira, M. Popp, and C. Schulz, Graph partitioning with acyclicity constraints, 16th International Symposium on Experimental Algorithms, SEA, 2017.

O. Moreira, M. Popp, and C. Schulz, Evolutionary multi-level acyclic graph partitioning, Proceedings of the Genetic and Evolutionary Computation Conference, Japan, pp.332-339, 2018.

J. Nossack and E. Pesch, A branch-and-bound algorithm for the acyclic partitioning problem, Computers & Operations Research, vol.41, pp.174-184, 2014.

C. H. Papadimitriou and K. Steiglitz, Corrected, unabridged reprint of Combinatorial Optimization: Algorithms and Complexity, 1982.

F. Pellegrini, SCOTCH 5.1 User's Guide, Laboratoire Bordelais de Recherche en Informatique (LaBRI), 2008.
URL : https://hal.archives-ouvertes.fr/hal-00410327

L. Pouchet, Polybench: The polyhedral benchmark suite, 2012.

P. Sanders and C. Schulz, Engineering multilevel graph partitioning algorithms, Algorithms - ESA 2011: 19th Annual European Symposium, pp.469-480, 2011.

C. Walshaw, Multilevel refinement for combinatorial optimisation problems, Annals of Operations Research, vol.131, pp.325-372, 2004.

E. S. Wong, E. F. Young, and W. K. Mak, Clustering based acyclic multi-way partitioning, Proceedings of the 13th ACM Great Lakes Symposium on VLSI, GLSVLSI '03, pp.203-206, 2003.
