Projects


A4: Design-Time Characterisation and Analysis of Invasive Algorithmic Patterns

Principal Investigators:

Prof. M. Glaß, Prof. M. Bader

Scientific Researchers:

A. Pöppl, T. Schwarzer, Dr. S. Wildermann B. Pourmohseni

Abstract

The project investigates existing, develops novel, and analyses invasive algorithmic patterns wrt. qualities of execution to exploit the resource awareness of invasive computing. The research focuses on (a) stencil computations and non-regular tree traversals as invasive algorithmic patterns and (b) design-time characterisation techniques for the derivation of sets of optimised and diverse operating points by considering symmetries and system services of a given heterogeneous invasive manycore architecture.

Synopsis

The management and exploitation of dynamically-adapting and heterogeneous resources substantially increases the complexity of invasive algorithm design and invasive scheduling decisions: not just a certain amount of resources, but the best suitable combination of computation and communication resources needs to be selected and invaded at run time. To enforce a predictable quality of execution, applications need to carefully define their quality requirements and provide a precise characterisation of achievable qualities, which depend on the available resources (on a given manycore platform). Moreover, exposing requirement definitions and scheduling strategies to application developers opens new opportunities for the design of invasive algorithms in the sense of a co-design of algorithms, hardware, and resource management.
This project therefore investigates existing, develops novel, and characterises invasive algorithmic patterns and addresses the following main research questions: (a) How to derive and pass knowledge about the algorithms' and respective applications' performance characteristics on heterogeneous resources to a run-time system? (b) How to support online scheduling decisions based on this knowledge to improve (and/or enforce) quality requirements and system constraints? (c) How to develop invasion-aware algorithms such that they benefit from invasion on heterogeneous platforms?

Approach


Project A4 aims at contributing to the entire invasion chain by developing invasive algorithmic patterns, by a design-time characterisation of these patterns on available platforms, by derivation of sets of optimised operating points, and finally by evaluating the performance gain on concrete invasive platforms. As depicted in the above figure, the core of this project's research focuses on the interplay of algorithm development and invasive scheduling and mapping decisions that shall be guided by application characteristics automatically determined at design time. This characterisation that delivers which quality numbers may be achieved by a certain claim assembly and mapping (specified by claim constraints) is also indispensable for program execution at predictable quality. The two main research areas of A4 are thus: (a) Novel invasion-driven algorithmic patterns tailored to maximise performance and/or significantly enhance predictability of corresponding applications. (b) Automatic multi-objective characterisation of applications on concrete (heterogeneous tiled) architectures to guide the claim assembling performed by the invasive run-time system, particularly to enforce the predictability of application execution quality.

Publications

[1] Jan Spieck, Stefan Wildermann, and Jürgen Teich. A learning-based methodology for scenario-aware mapping of soft real-time applications onto heterogeneous mpsocs. ACM Trans. Design Autom. Electr. Syst., 28(1):4:1–4:40, 2023. [ DOI | http ]
[2] Yakup Budanaz, Mario Wille, and Michael Bader. Asynchronous workload balancing through persistent work-stealing and offloading for a distributed actor model library. In 2022 IEEE/ACM Parallel Applications Workshop: Alternatives To MPI+X (PAW-ATM), pages 39–51, November 14, 2022. [ DOI ]
[3] Jan Spieck, Stefan Wildermann, and Jürgen Teich. On Transferring Application Mapping Knowledge Between Differing MPSoC Architectures. In CODES+ISSS 2022, October 2022.
[4] Nidhi Anantharajaiah, Tamim Asfour, Michael Bader, Lars Bauer, Jürgen Becker, Simon Bischof, Marcel Brand, Hans-Joachim Bungartz, Christian Eichler, Khalil Esper, Joachim Falk, Nael Fasfous, Felix Freiling, Andreas Fried, Michael Gerndt, Michael Glaß, Jeferson Gonzalez, Frank Hannig, Christian Heidorn, Jörg Henkel, Andreas Herkersdorf, Benedict Herzog, Jophin John, Timo Hönig, Felix Hundhausen, Heba Khdr, Tobias Langer, Oliver Lenke, Fabian Lesniak, Alexander Lindermayr, Alexandra Listl, Sebastian Maier, Nicole Megow, Marcel Mettler, Daniel Müller-Gritschneder, Hassan Nassar, Fabian Paus, Alexander Pöppl, Behnaz Pourmohseni, Jonas Rabenstein, Phillip Raffeck, Martin Rapp, Santiago Narváez Rivas, Mark Sagi, Franziska Schirrmacher, Ulf Schlichtmann, Florian Schmaus, Wolfgang Schröder-Preikschat, Tobias Schwarzer, Mohammed Bakr Sikal, Bertrand Simon, Gregor Snelting, Jan Spieck, Akshay Srivatsa, Walter Stechele, Jürgen Teich, Furkan Turan, Isaías A. Comprés Ureña, Ingrid Verbauwhede, Dominik Walter, Thomas Wild, Stefan Wildermann, Mario Wille, Michael Witterauf, and Li Zhang. Invasive Computing. FAU University Press, August 16, 2022. [ DOI ]
[5] Jan Spieck, Stefan Wildermann, and Jürgen Teich. A Learning-Based Methodology for Scenario-Aware Mapping of Soft Real-Time Applications onto Heterogeneous MPSoCs. ACM Transactions on Design Automation of Electronic Systems, 2022. [ DOI ]
[6] Behnaz Pourmohseni, Stefan Wildermann, Fedor Smirnov, Paul Meyer, and Jürgen Teich. Task Migration Policy for Thermal-Aware Dynamic Performance Optimization in Many-Core Systems. IEEE Access, 2022. [ DOI ]
[7] Jan Spieck, Stefan Wildermann, and Jürgen Teich. Domain-Adaptive Soft Real-Time Hybrid Application Mapping for MPSoCs. In 2021 ACM/IEEE 3rd Workshop on Machine Learning for CAD (MLCAD), September 2021.
[8] Behnaz Pourmohseni. System-Level Mapping, Analysis, and Management of Real-Time Applications in Many-Core Systems. Dissertation, Hardware/Software Co-Design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany, 2021.
[9] Martin Bogusz, Philipp Samfass, Alexander Pöppl, Jannis Klinkenberg, and Michael Bader. Evaluation of multiple hpc parallelization frameworks in a shallow water proxy application with multi-rate local time stepping. In The 3rd Annual Parallel Applications Workshop, Alternatives To MPI+X, Atlanta, Georgia, USA (virtual), November 2020. IEEE/ACM sigARCH.
With a widening gap between processor speed and communication latencies, overlap of computation and communication becomes increasingly important. At the same time, variable CPU clock frequencies (e.g., with DVFS and Turbo Boost) and novel numerical techniques such as local time stepping make it challenging to balance parallel execution times, even in the case of balanced computational load. This limits parallel efficiency. In order to tackle these challenges, emerging runtime systems may be used. In this paper, we present a thorough study of four selected parallelization frameworks – Chameleon, HPX, Charm++ and UPC++ – in a proxy application for solving the shallow water equations. In addition, we augment the traditional MPI baseline variant with support for these frameworks and evaluate them in detail with respect to strong scaling efficiency and load balancing for global and local time stepping.
[10] Jan Spieck, Stefan Wildermann, and Jürgen Teich. Run-Time Scenario-Based MPSoC Mapping Reconfiguration Using Machine Learning Models. In 2019 ACM/IEEE 1st Workshop on Machine Learning for CAD (MLCAD), July 2020. [ DOI ]
[11] Tobias Schwarzer. System-level Mapping of Dataflow Applications onto MPSoCs. Dissertation, Hardware/Software Co-Design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany, 2020.
[12] Jürgen Teich, Behnaz Pourmohseni, Oliver Keszöcze, Jan Spieck, and Stefan Wildermann. Run-Time Enforcement of Non-Functional Application Requirements in Heterogeneous Many-Core Systems. In Proceedings of the 25th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 629–636, January 2020. [ DOI ]
[13] Behnaz Pourmohseni, Fedor Smirnov, Stefan Wildermann, and Jürgen Teich. Real-Time Task Migration for Dynamic Resource Management in Many-Core Systems. In Proceedings of the Workshop on Next Generation Real-Time Embedded Systems (NG-RES), pages 5:1–5:14, 2020. [ DOI ]
[14] Behnaz Pourmohseni. System-level mapping, analysis, and management of real-time applications in many-core systems. Ph.D. Forum at the Design, Automation and Test in Europe Conference (DATE). Ph.D. Forum Best Poster Award, 2020.
[15] Behnaz Pourmohseni, Michael Glaß, Jörg Henkel, Heba Khdr, Martin Rapp, Valentina Richthammer, Tobias Schwarzer, Fedor Smirnov, Jan Spieck, Jürgen Teich, Andreas Weichslgartner, and Stefan Wildermann. Hybrid Application Mapping for Composable Many-Core Systems: Overview and Future Perspective. Journal of Low Power Electronics and Applications, 10:1–37, 2020. [ DOI ]
[16] Jürgen Teich, Pouya Mahmoody, Behnaz Pourmohseni, Sascha Roloff, Wolfgang Schröder-Preikschat, and Stefan Wildermann. Run-Time Enforcement of Non-functional Program Properties on MPSoCs. In Jian-Jia Chen, editor, A Journey of Embedded and Cyber-Physical Systems, pages 125–149. Springer, 2020. [ DOI ]
[17] J. Spieck, S. Wildermann, and J. Teich. Scenario-Based Soft Real-Time Hybrid Application Mapping for MPSoCs. In 2020 57th ACM/IEEE Design Automation Conference (DAC), pages 1–6, 2020. [ DOI ]
[18] Alexander Pöppl, Scott Baden, and Michael Bader. A UPC++ actor library and its evaluation on a shallow water proxy application. In Parallel Applications Workshop, Alternatives To MPI+X, Denver, Colorado, United States of America, November 2019. IEEE, IEEE.
[19] Jan Spieck, Stefan Wildermann, Tobias Schwarzer, Jürgen Teich, and Michael Glaß. Data-Driven Scenario-based Application Mapping for Heterogeneous Many-Core Systems. In Multicore/Many-core Systems-on-Chip (MCSoC 2019), pages 334–341, October 2019. [ DOI ]
[20] Tobias Schwarzer, Joachim Falk, Simone Müller, Martin Letras, Christian Heidorn, Stefan Wildermann, and Jürgen Teich. Compilation of dataflow applications for multi-cores using adaptive multi-objective optimization. ACM Transactions on Design Automation of Electronic Systems, 24(3):29:1–29:23, March 2019. [ DOI ]
[21] Behnaz Pourmohseni, Fedor Smirnov, Heba Khdr, Stefan Wildermann, Jürgen Teich, and Jörg Henkel. Thermally composable hybrid application mapping for real-time applications in heterogeneous many-core systems. In Proceedings of the 40th IEEE Real-Time Systems Symposium (RTSS), 2019. [ DOI ]
[22] Behnaz Pourmohseni, Fedor Smirnov, Stefan Wildermann, and Jürgen Teich. Isolation-aware timing analysis and design space exploration for predictable and composable many-core systems. In Proceedings of the 31th Euromicro Conference on Real-Time Systems (ECRTS 2019), pages 12:1–12:24, 2019. [ DOI ]
[23] Behnaz Pourmohseni, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Hard Real-Time Application Mapping Reconfiguration for NoC-Based Many-Core Systems. Real-Time Systems, 55(2):433–469, 2019. [ DOI ]
[24] Dirk Gabriel, Walter Stechele, and Stefan Wildermann. Resource-aware parameter tuning for real-time applications. In Martin Schoeberl, Christian Hochberger, Sascha Uhrig, Jürgen Brehm, and Thilo Pionteck, editors, Architecture of Computing Systems – ARCS 2019, pages 45–55. Springer International Publishing, 2019. [ DOI ]
[25] Jörg Henkel, Jürgen Teich, Stefan Wildermann, and Hussam Amrouch. Dynamic resource management for heterogeneous many-cores. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 60:1–60:6. ACM, November 2018. [ DOI ]
[26] Andreas Weichslgartner, Stefan Wildermann, Deepak Gangadharan, Michael Glaß, and Jürgen Teich. A design-time/run-time application mapping methodology for predictable execution time in mpsocs. ACM Transactions on Embedded Computing Systems (TECS), 17(5):89:1–89:25, November 2018. [ DOI ]
[27] Tobias Schwarzer, Sascha Roloff, Valentina Richthammer, Rami Khaldi, Stefan Wildermann, Michael Glaß, and Jürgen Teich. On the Complexity of Mapping Feasibility in Many-Core Architectures. In Proceedings of Multicore/Many-core Systems-on-Chip (MCSoC-2018), September 2018.
[28] Valentina Richthammer, Tobias Schwarzer, Stefan Wildermann, Jürgen Teich, and Michael Glaß. Architecture Decomposition in System Synthesis of Heterogeneous Many-Core Systems. In 55th ACM/EDAC/IEEE Design Automation Conference (DAC 2018), June 2018.
[29] Tobias Schwarzer, Andreas Weichslgartner, Michael Glaß, Stefan Wildermann, Peter Brand, and Jürgen Teich. Symmetry-eliminating Design Space Exploration for Hybrid Application Mapping on Many-Core Architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 37(2):297–310, February 2018. [ DOI ]
[30] Andreas Weichslgartner, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Invasive Computing for Mapping Parallel Programs to Many-Core Architectures. Springer, January 15, 2018. [ DOI ]
[31] Alexander Pöppl, Marvin Damschen, Florian Schmaus, Andreas Fried, Manuel Mohr, Matthias Blankertz, Lars Bauer, Jörg Henkel, Wolfgang Schröder-Preikschat, and Michael Bader. Shallow water waves on a deep technology stack: Accelerating a finite volume tsunami model using reconfigurable hardware in invasive computing. In Dora B. Heras, Luc Bougé, Gabriele Mencagli, Emmanuel Jeannot, Rizos Sakellariou, Rosa M. Badia, Jorge G. Barbosa, Laura Ricci, Stephen L. Scott, Stefan Lankes, and Josef Weidendorfer, editors, Euro-Par 2017: Proceedings of the 10th Workshop on UnConventional High Performance Computing (UCHPC 2017), Lecture Notes in Computer Science (LNCS), pages 676–687, Cham, 2018. Springer International Publishing.
[32] Michael Glaß, Jürgen Teich, Martin Lukasiewycz, and Felix Reimann. Hybrid optimization techniques for system-level design space exploration. In Soonhoi Ha and Jürgen Teich, editors, Handbook of Hardware/Software Codesign, volume 1, pages 217–246. Springer, Dordrecht, The Netherlands, September 2017. [ DOI ]
[33] Hananeh Aliee. Reliability Analysis and Optimization of Embedded Systems using Stochastic Logic and Importance Measures. Dissertation, Hardware/Software Co-Design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany, March 15, 2017.
[34] Behnaz Pourmohseni, Michael Glaß, and Jürgen Teich. Automatic Operating Point Distillation for Hybrid Mapping Methodologies. In Proceedings of Design, Automation and Test in Europe Conference Exhibition (DATE), pages 1135–1140. IEEE, 2017. [ DOI ]
[35] Behnaz Pourmohseni, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Predictable run-time mapping reconfiguration for real-time applications on many-core systems. In Proceedings of the International Conference on Real-Time Networks and Systems (RTNS). IEEE, 2017. Outstanding paper award. [ DOI ]
[36] Jürgen Teich. Invasive computing – editorial. it – Information Technology, 58(6):263–265, November 24, 2016. [ DOI ]
[37] Alexander Pöppl, Michael Bader, Tobias Schwarzer, and Michael Glaß. SWE-X10: Simulating shallow water waves with lazy activation of patches using ActorX10. In Proceedings of the 2nd International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), pages 32–39. IEEE, November 2016. [ .pdf ]
[38] Stefan Wildermann, Michael Bader, Lars Bauer, Marvin Damschen, Dirk Gabriel, Michael Gerndt, Michael Glaß, Jörg Henkel, Johny Paul, Alexander Pöppl, Sascha Roloff, Tobias Schwarzer, Gregor Snelting, Walter Stechele, Jürgen Teich, Andreas Weichslgartner, and Andreas Zwinkau. Invasive computing for timing-predictable stream processing on MPSoCs. it – Information Technology, 58(6):267–280, September 30, 2016. [ DOI ]
[39] Jürgen Teich, Michael Glaß, Sascha Roloff, Wolfgang Schröder-Preikschat, Gregor Snelting, Andreas Weichslgartner, and Stefan Wildermann. Language and compilation of parallel programs for *-predictable MPSoC execution using invasive computing. In Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), pages 313–320, Lyon, France, September 2016. [ DOI ]
[40] Sascha Roloff, Alexander Pöppl, Tobias Schwarzer, Stefan Wildermann, Michael Bader, Michael Glaß, Frank Hannig, and Jürgen Teich. ActorX10: An actor library for X10. In Proceedings of the 6th ACM SIGPLAN X10 Workshop (X10), pages 24–29. ACM, June 14, 2016. [ DOI ]
[41] Andreas Weichslgartner, Stefan Wildermann, Johannes Götzfried, Felix Freiling, Michael Glaß, and Jürgen Teich. Design-time/run-time mapping of security-critical applications in heterogeneous mpsocs. In Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems (SCOPES), pages 153–162. ACM, May 23, 2016. [ DOI ]
[42] Alexander Pöppl and Michael Bader. Swe-x10: An actor-based and locally coordinated solver for the shallow water equations. In Proceedings of the 6th ACM SIGPLAN Workshop on X10, X10 2016, pages 30–31. ACM, 2016. [ DOI ]
Keywords: APGAS, Actor Model, Parallel Algorithms, Shallow Water Equations
[43] Alexander Pöppl and Alexander Herz. A cache-aware performance prediction framework for gpgpu computations. In Sascha Hunold, Jesper Larsson Träff, and Francesco Versaci, editors, Euro-Par 2015: Parallel Processing Workshops, Lecture Notes in Computer Science, Berlin, Heidelberg, 2016. Springer-Verlag.
[44] Hans Michael Gerndt, Michael Glaß, Sri Parameswaran, and Barry L. Rountree. Dark Silicon: From Embedded to HPC Systems (Dagstuhl Seminar 16052). Dagstuhl Reports, 6(1):224–244, 2016. [ DOI ]
[45] Michael Bader. Performance optimzation vs. power - experiences with petascale earthquake simulations on supermuc. Talk at the Workshop on Power-Bounded HPC Performance Optimisation at Schloss Dagstuhl, August 2015.
[46] Alexander Breuer, Alexander Heinecke, Leonhard Rannabauer, and Michael Bader. High-order ader-dg minimizes energy- and time-to-solution of seissol. In High Performance Computing, 30th International Conference, ISC High Performance 2015, Frankfurt, Germany, July 12-16, 2015, Proceedings, volume 9137 of Lecture Notes in Computer Science, pages 340–357. Springer, July 2015.
[47] Michael Bader. Vectorization of riemann solvers for the shallow water equations. Minisymposium Flooding the Cores - Computing Flooding Events with Many-Core and Accelerator Technologies at SIAM Conference on Computational Science and Engineering, March 2015.
[48] Sebastian Graf, Felix Reimann, Michael Glaß, and Jürgen Teich. Towards scalable symbolic routing for multi-objective networked embedded system design and optimization. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS 2014), pages 2:1–2:10, October 12, 2014. [ DOI ]
[49] Andreas Weichslgartner, Deepak Gangadharan, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Daarm: Design-time application analysis and run-time mapping for predictable execution in many-core systems. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS 2014), pages 1–10, October 2014. [ DOI ]
[50] Tobias Weinzierl, Michael Bader, Kristof Unterweger, and Roland Wittmann. Block fusion on dynamically adaptive spacetree grids for shallow water waves. Parallel Processing Letters, 24(3):1441006, September 2014.
[51] Michael Glaß, Michael Bader, Jürgen Teich, and Stefan Wildermann. Assisting run-time optimization of many-core systems by design-time characterization. HiPEAC Spring Computing Systems Week, Barcelona, Invited Talk, May 13, 2014.
[52] Stefan Wildermann, Michael Glaß, and Jürgen Teich. Multi-objective distributed run-time resource management for many-cores. In Proceedings of Design, Automation and Test in Europe (DATE), pages 1–6, March 2014. [ DOI ]
[53] Michael Bader. On the performance of adaptive mesh-based simulations on modern hpc architectures. Invited presentation at SIAM Conference on Parallel Processing in Scientific Computing - SIAM PP 2014, February 2014. [ .pdf ]
[54] Michael Bader. Exploiting locality properties of space-filling curves in scientific computing. Colloquium talk at TU Eindhoven, November 2013.