B5: Invasive NoCs – Autonomous, Self-Optimising Communication Infrastructures for MPSoCs
Principal Investigators:
Prof. J. Becker, Prof. A. Herkersdorf, Prof. J. Teich
Scientific Researchers:
N. Anantharajaiah, L. Masing, B. Pourmohseni, S. Rheindt, A. Srivatsa
Abstract
Project B5 investigates and designs invasive Networks-on-Chip, referred to below as iNoC.
In the first funding phase, a NoC architecture consisting of hardware modules and protocols was developed and implemented. This iNoC makes it possible to invade communication resources. By invading communication links, certain Quality-of-Service (QoS) guarantees can be given in the form of bounds on bandwidth and latency. Besides this elementary invasion capability, various decentralised and self-optimising mechanisms were investigated. Self-embedding, for instance, enables the decentralised hardware-assisted mapping of communication topologies. Rerouting helps to reduce congestion by remapping connections, and by monitoring the traffic inside the network adapter, links can be invaded autonomously (Auto-GS).
In the second funding phase, the research will focus on increasing the predictability of the execution properties of concurrent applications. This requires hardware support inside the iNoC for mapping actor-oriented task models to a NoC. Tighter bounds on communication latency shall become feasible through novel circuit-switching concepts. Additionally, other non-functional properties such as security, fault tolerance, and energy consumption will be investigated. Finally, novel NoC topologies (e.g. 3D NoCs) and cache coherence are topics of investigation.
Synopsis
In the first funding phase, general principles of the invasion of NoC resources were investigated and implemented. The iNoC realises the communication infrastructure for invasive architectures and can already give guarantees for throughput and worst-case latency. In Phase II, predictability will move even further into the focus of our collaborative research. Here, Project B5 will investigate circuit-switched networks, which can offer precise bounds on communication latency. In addition, the invasive actor-oriented programming model, which will be introduced by Project A1, will be supported by the iNoC. Moreover, other non-functional properties, including fault tolerance and security, are addressed by Project B5 in the second funding phase. The main goal is to ensure such properties across different layers and levels. Security issues, for example, do not concern only the hardware or only the software part of a system, so this goal can be reached only in close cooperation with Project C5. The same holds for fault tolerance. To ensure a certain robustness against faults, the communication infrastructure as well as computing resources such as TCPA and RISC tiles (Project B2, Project B3, Project B4) must be considered together to find a holistic solution.
Predictability is the topic of one of the four working groups in our CRC/Transregio. Project B5 will also contribute research topics to the other three working groups. In the research area of power efficiency and dark silicon, for example, the iNoC will be enhanced by shutdown mechanisms and low-power techniques. This makes it possible to power down a whole chip region, including network resources and processing elements, without affecting communication in the other regions and without preventing global communication. The investigation of region-based cache coherence complements the working group on memory hierarchy. In the first funding phase, coherency was supported only within a single tile. However, applications tend to require more processing and/or memory resources than are available within a tile. Commonly, explicit software messages (MPI), which require additional programming effort, are used for inter-tile communication. We believe that such program alterations can be avoided by enabling hardware-supported inter-tile cache coherence. Global coherence does not scale in a cost-efficient manner and is not even required for applications with limited degrees of parallelism. Therefore, the goal here is to create flexible cache-coherent regions beyond tile borders whilst limiting hardware overheads.
Approach
The iNoC connects the tiles in the heterogeneous tile-based MPSoC architecture. Through the invasive Network Adapter (iNA) the different tiles (TCPA-tiles, iCore-tiles, RISC-tiles, memory-tiles and I/O-tiles) are connected to the iNoC. The iNoC provides efficient support for reservation and release of communication resources. The reservation of communication resources by applications in the iNoC is realised through dynamically configurable guaranteed service connections.
We investigated protocols and routers that support dynamic link invasion. This also provides the basis for autonomous mapping of communication graphs. The developed router, network adapter, and protocols support different types of communication that cooperatively share the router's resources. A mesh was chosen as the topology for the invasive architecture, so most routers have five input and five output ports (routers at the borders and corners have four and three ports, respectively), which are connected through a crossbar. Virtual Channels (VCs) are the state-of-the-art approach to multiplexing the physical link and thereby increasing the NoC's throughput. To this end, every VC has a buffer at every input port, and all parts of one flow are directed to the same input buffer. In addition to the data links, dedicated wires between the routers signal in which VC buffer the data should be stored. To prevent the buffers from overflowing, credit-based flow control can be used. Hard QoS guarantees can be given by reserving Guaranteed Service (GS) communication channels, while Best Effort (BE) communication channels can be used without any channel reservation where no guarantees are required. BE channels may utilise resources that are not occupied by GS traffic at that moment. To realise this, we proposed a weighted round-robin assignment of VCs to local time slots (TS). The required bandwidth guarantee is specified by a Service Level (SL). The following figure shows an example with concurrent data transmission: C1 is an established GS connection using SL 3, so in each router the VC used by the connection is assigned to 3 time slots. Different time slots can be used in each router. C2 is a GS connection using SL 1.
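The slot-based reservation can be illustrated with a small sketch. It models one router output port with a local table of time slots and gives each GS connection as many slots as its Service Level; slots that stay unassigned are available to BE traffic. The table size of 8 slots, the even spreading of slots, and all function names are assumptions made for illustration and are not taken from the iNoC design.

```python
from typing import Dict, List, Optional

NUM_SLOTS = 8  # assumed number of time slots per output port (not stated in the text)


def build_slot_table(service_levels: Dict[str, int],
                     num_slots: int = NUM_SLOTS) -> List[Optional[str]]:
    """Give each GS connection as many time slots as its service level (SL).

    Slots of one connection are spread over the table so that the gap
    between two of its slots stays small. Entries left as None are not
    reserved and can be used by best-effort (BE) traffic.
    """
    if sum(service_levels.values()) > num_slots:
        raise ValueError("requested service levels exceed the available time slots")
    table: List[Optional[str]] = [None] * num_slots
    # Place the connections with the highest SL first.
    for conn, sl in sorted(service_levels.items(), key=lambda kv: -kv[1]):
        for k in range(sl):
            slot = (k * num_slots) // sl      # evenly spaced target position
            while table[slot] is not None:    # probe forward to the next free slot
                slot = (slot + 1) % num_slots
            table[slot] = conn
    return table


def guaranteed_share(table: List[Optional[str]], conn: str) -> float:
    """Fraction of the link bandwidth reserved for one GS connection."""
    return table.count(conn) / len(table)


if __name__ == "__main__":
    table = build_slot_table({"C1": 3, "C2": 1})
    print(table)
    print(guaranteed_share(table, "C1"))
```

With the connections from the example (C1 with SL 3, C2 with SL 1) and the assumed 8 slots, C1 holds three slots, C2 holds one, and four remain for BE traffic. Because the table is local to a router, each router along a connection may pick different slot positions.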
As part of the overall decentralised resource management, we investigated the feasibility of decentralised embedding of communication topologies, called self-embedding, within the iNoC. The main idea of this approach is that each process embeds its own succeeding tasks with data dependencies, considering only the local view of neighbouring nodes in the NoC. This can be done in parallel, in contrast to a centralised approach, which has to map the tasks sequentially. Especially for streaming applications and applications with large amounts of periodic communication, not only the mapping of the functionality but also the mapping of the communication onto the NoC channels is important. We presented the formal model for the algorithms and showed that the decentralised mapping of tree-shaped applications and reservation of bandwidth can compete with central approaches in terms of average network utilisation while offering better scalability. The self-embedding functionality is implemented as a hardware component inside the iNoC router and communicates with the self-embedding components of other routers over a control network.
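The idea of each task embedding its own successors from a local view can be sketched in software. The sketch below is a strongly simplified, sequential simulation: it assumes one task per node, places every child on a free direct neighbour of its parent, and ignores bandwidth reservation. These rules and all names are assumptions for illustration; the actual hardware algorithm and its formal model are not reproduced here.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]


def neighbours(node: Node, width: int, height: int) -> List[Node]:
    """Direct mesh neighbours of a node (the 'local view')."""
    x, y = node
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


def self_embed(tree: Dict[str, List[str]], root: str, start: Node,
               width: int, height: int) -> Dict[str, Node]:
    """Embed a tree-shaped task graph level by level.

    Every already placed task places its own children on free
    neighbouring nodes. Tasks of one level could act in parallel in
    hardware; here they are processed one after another.
    """
    placement: Dict[str, Node] = {root: start}
    occupied = {start}
    frontier = [root]
    while frontier:
        next_frontier: List[str] = []
        for task in frontier:
            free = [n for n in neighbours(placement[task], width, height)
                    if n not in occupied]
            for child in tree.get(task, []):
                if not free:
                    raise RuntimeError(f"no free neighbour for task {child}")
                node = free.pop(0)
                placement[child] = node
                occupied.add(node)
                next_frontier.append(child)
        frontier = next_frontier
    return placement


if __name__ == "__main__":
    app = {"t0": ["t1", "t2"], "t1": ["t3"]}
    print(self_embed(app, "t0", (1, 1), 3, 3))
```

The point of the sketch is only that no step needs global knowledge of the mesh: each placement decision looks at the parent's neighbours and nothing else.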
Efficient collection of monitor data
Monitor data, which are provided by Project B4, play a crucial role in resource-aware computing. In order to efficiently access the hardware status inside regions, which are only defined during run time, we investigated several mechanisms. Three mechanisms are shown schematically in the following figure. Each example shows a region of a tile-based architecture.
Mechanism (a) represents a method to collect data within the given region using only point-to-point communication. A separate request is transmitted to each node. The nodes respond by sending the requested information. This mechanism is named Request-Response. Mechanism (b) shows a technique we call Round-Trip. A Hamiltonian cycle through all nodes of the region is used to transfer the request as well as the requested data of the addressed nodes. Mechanism (c) represents a combination of the previous two strategies and is therefore called Mixed. The request is transmitted on a Hamiltonian path, and the response is transferred directly from each node of the region to the requesting node using unicast communication, e.g. with xy-routing. Compared to Request-Response, the Mixed mechanism reduces the NoC utilisation by using one Hamiltonian path for the requests instead of a separate request packet to each node. The results show that the choice of strategy depends on parameters such as region size and data set size. Round-Trip and Mixed outperform the straightforward Request-Response implementation in most cases.
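A rough feeling for the three mechanisms can be obtained by counting link traversals (hops) in a rectangular region. The sketch below assumes a requester in one corner of the region, xy-routing for unicast packets, and one packet per request or response; it deliberately ignores packet size, so the growing payload of a Round-Trip packet is not modelled. The function names and the cost model are illustrative assumptions, not the evaluation used in the project.

```python
from typing import List, Tuple

Node = Tuple[int, int]


def manhattan(a: Node, b: Node) -> int:
    """Hop count between two nodes under xy-routing in a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def region_nodes(width: int, height: int) -> List[Node]:
    return [(x, y) for y in range(height) for x in range(width)]


def request_response_hops(width: int, height: int,
                          requester: Node = (0, 0)) -> int:
    """One unicast request and one unicast response per node."""
    return sum(2 * manhattan(requester, n) for n in region_nodes(width, height))


def round_trip_hops(width: int, height: int) -> int:
    """A single packet follows a Hamiltonian cycle: one hop per node.

    A rectangular grid with both sides >= 2 has a Hamiltonian cycle
    only if its number of nodes is even.
    """
    if width < 2 or height < 2 or (width * height) % 2 != 0:
        raise ValueError("no Hamiltonian cycle exists for this region")
    return width * height


def mixed_hops(width: int, height: int, requester: Node = (0, 0)) -> int:
    """Request on a Hamiltonian path (n - 1 hops), unicast responses."""
    n = width * height
    responses = sum(manhattan(requester, m) for m in region_nodes(width, height))
    return (n - 1) + responses


if __name__ == "__main__":
    for name, hops in [("Request-Response", request_response_hops(4, 4)),
                       ("Round-Trip", round_trip_hops(4, 4)),
                       ("Mixed", mixed_hops(4, 4))]:
        print(name, hops)
```

Under this model a 4x4 region costs 96 hops for Request-Response, 16 for Round-Trip, and 63 for Mixed. Hop count alone favours Round-Trip; once the amount of collected data per node is taken into account the ranking can change, which is consistent with the observation that the best strategy depends on region size and data set size.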
Invasive network adapter
The network adapter follows a modular design. It consists of a tile interface layer, a FIFO layer, and an iNoC interface layer. The modular approach simplifies the connection of the different tile types of our heterogeneous architectures: only the tile interface layer needs to be replaced depending on the tile type. Inside the network adapter, a mechanism called Auto-GS was researched that automatically detects communication partners which are addressed frequently. After detection, the network adapter can automatically set up a GS connection to such nodes to improve communication performance and reduce the energy consumed by communication.
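The detection step of Auto-GS can be pictured as a per-destination counter with a threshold. The following sketch is a software toy model under assumed parameters: the threshold, the observation window, and the class and method names are hypothetical, and the real mechanism is implemented in hardware inside the network adapter.

```python
from collections import Counter
from typing import Hashable, Set


class AutoGsMonitor:
    """Toy model: count packets per destination and flag frequent ones."""

    def __init__(self, threshold: int, window: int) -> None:
        self.threshold = threshold  # assumed: packets per window that trigger a GS setup
        self.window = window        # assumed: number of packets per observation window
        self.counts: Counter = Counter()
        self.seen = 0
        self.gs_connections: Set[Hashable] = set()

    def observe(self, destination: Hashable) -> bool:
        """Record one outgoing packet; return True if a GS setup is triggered."""
        self.counts[destination] += 1
        self.seen += 1
        triggered = False
        if (destination not in self.gs_connections
                and self.counts[destination] >= self.threshold):
            self.gs_connections.add(destination)  # stands in for the GS setup
            triggered = True
        if self.seen >= self.window:              # start a new observation window
            self.counts.clear()
            self.seen = 0
        return triggered


if __name__ == "__main__":
    monitor = AutoGsMonitor(threshold=3, window=10)
    for dest in ["A", "B", "A", "A", "A"]:
        print(dest, monitor.observe(dest))
```

A real design would also have to release GS connections that are no longer used; that part is omitted here.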
Region-based cache coherence (RBCC) is a concept where a subset of tiles in a large MPSoC system can be grouped into a coherency region. This approach limits the size of data structures in the directories significantly compared to traditional schemes by only tracking tiles within a region, making it scalable for large MPSoCs. The figure below illustrates the normalised savings that can be achieved with RBCC for varying tile counts (N) and region sizes (Mmax).
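Why tracking only the tiles of a region shrinks the directory can be shown with a back-of-the-envelope calculation. The sketch assumes a full-map directory that stores one sharer bit per tracked tile in every entry, so an entry needs N bits for global coherence and Mmax bits with RBCC. This cost model is an assumption for illustration and is not necessarily the one behind the figure.

```python
def sharer_bits_saving(n_tiles: int, max_region_size: int) -> float:
    """Relative saving in sharer bits per directory entry.

    Assumes one presence bit per tracked tile: n_tiles bits for global
    coherence versus max_region_size bits for region-based coherence.
    """
    if not 1 <= max_region_size <= n_tiles:
        raise ValueError("region size must be between 1 and the tile count")
    return 1.0 - max_region_size / n_tiles


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, sharer_bits_saving(n, 16))
```

Under this assumption the saving grows with the tile count for a fixed region size: with N = 64 and Mmax = 16 three quarters of the sharer bits are saved, while N = Mmax yields no saving.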
We introduced a programmable coherency region manager (CRM) in every tile. The CRM allows variably sized coherency regions to be set up and removed dynamically and administers a coherent view for all tiles within a region. Based on the required regions, every CRM is fed with information that makes a tile aware of its valid sharers. The CRM is coupled with a reduced directory structure for book-keeping. To verify our RBCC concept, we developed a highly configurable SystemC-based simulation model. Our experiments demonstrated that operating multiple compute tiles in a coherency region can increase performance by a factor of up to 2.5 compared to a single-tile structure with nominally identical computing and memory resources. For hardware evaluation, we have developed a prototype of the CRM module and integrated it into the FPGA-based invasive computing target platform. We plan to run traditional shared-memory programming benchmarks spanning multiple tiles with region-based coherence support enabled. We plan to analyse the performance of region-based coherence by varying the size and spatial location of the regions, running multiple applications on overlapping regions, etc. Finally, we plan to compare the performance of applications running with region-based coherence with that of existing software-based (MPI) schemes.
Second-Layer Network
Manufacturing defects and ageing effects are expected to cause permanent faults in future highly integrated technology nodes, which are targeted by NoC-based architectures. To deal with permanent faults, the errors need to be localised first. Afterwards, redundant hardware units or routing paths must exist to circumvent the broken nodes and links. To tackle this challenge while minimising hardware overhead, we employ software techniques for fault detection. The concept is based on systematic flooding of the iNoC, or a region of it, with special test packets. These packets are analysed after reception to determine the location of the fault or error. Because test pattern generation is synchronised, the pair of communicating nodes used for error localisation is known to the test software. Since the expected test data are known, erroneous sections of the iNoC can be identified. Once a fault is detected and located, fault treatment techniques need to be employed. We introduced a lightweight second-layer network (SLN) which takes over the duties of defective routers. This second layer acts as a transparent bypass to circumvent faulty routers. The bypass is activated on demand to form a ring around a defective router. The full implementation of the SLN is shown in the figure below.
Note that the routers in the corners of the mesh can also be bridged with a full ring around them. A configuration for bypassing a single defective router is shown in the figure, illustrating the circular flow of packets on the ring. The infrastructure for the second layer consists of switches that enable the injection and ejection of flits from and to the primary network, as well as multiplexers that connect the switches to form the bypass on demand. The evaluation of the SLN shows that 51% of all faults in the complex iNoC design can be detected and localised. Additional analysis showed that 100% of link errors can be detected and localised. Depending on the design-time configuration of the second-layer network, it can be used to bypass a faulty router with a negligible impact on performance at the cost of an area overhead of at most 18%.
A comprehensive summary of the major achievements of the first funding phase can be found on the Project B5 first-phase website.
Publications
[1] | Nidhi Anantharajaiah, Tamim Asfour, Michael Bader, Lars Bauer, Jürgen Becker, Simon Bischof, Marcel Brand, Hans-Joachim Bungartz, Christian Eichler, Khalil Esper, Joachim Falk, Nael Fasfous, Felix Freiling, Andreas Fried, Michael Gerndt, Michael Glaß, Jeferson Gonzalez, Frank Hannig, Christian Heidorn, Jörg Henkel, Andreas Herkersdorf, Benedict Herzog, Jophin John, Timo Hönig, Felix Hundhausen, Heba Khdr, Tobias Langer, Oliver Lenke, Fabian Lesniak, Alexander Lindermayr, Alexandra Listl, Sebastian Maier, Nicole Megow, Marcel Mettler, Daniel Müller-Gritschneder, Hassan Nassar, Fabian Paus, Alexander Pöppl, Behnaz Pourmohseni, Jonas Rabenstein, Phillip Raffeck, Martin Rapp, Santiago Narváez Rivas, Mark Sagi, Franziska Schirrmacher, Ulf Schlichtmann, Florian Schmaus, Wolfgang Schröder-Preikschat, Tobias Schwarzer, Mohammed Bakr Sikal, Bertrand Simon, Gregor Snelting, Jan Spieck, Akshay Srivatsa, Walter Stechele, Jürgen Teich, Furkan Turan, Isaías A. Comprés Ureña, Ingrid Verbauwhede, Dominik Walter, Thomas Wild, Stefan Wildermann, Mario Wille, Michael Witterauf, and Li Zhang. Invasive Computing. FAU University Press, August 16, 2022. [ DOI ] |
[2] | Nidhi Anantharajaiah, Felix Knopf, and Juergen Becker. Ant colony optimization based nocs for flexible spatial isolation in mixed criticality systems. In 2021 IEEE 34th International System-on-Chip Conference (SOCC), pages 248–253, 2021. [ DOI ] |
[3] | Nidhi Anantharajaiah, Zhe Zhang, and Juergen Becker. Multi-layered nocs with adaptive routing for mixed criticality systems. In Applied Reconfigurable Computing. Architectures, Tools, and Applications. Springer International Publishing, 2021. [ DOI ] |
[4] | Akshay Srivatsa, Nael Fasfous, Nguyen Anh Vu Doan, Sebastian Nagel, Thomas Wild, and Andreas Herkersdorf. Exploring a hybrid voting-based eviction policy for caches and sparse directories on manycore architectures. Microprocessors and Microsystems, page 104384, 2021. [ DOI | http ] Keywords: Eviction policy, Last-level cache, Sparse directory, Voting theory, DSM system, Manycore architecture |
[5] | Akshay Srivatsa, Mostafa Mansour, Sven Rheindt, Dirk Gabriel, Thomas Wild, and Andreas Herkersdorf. Dynaco: Dynamic coherence management for tiled manycore architectures. International Journal of Parallel Programming, January 2021. [ DOI | http ] |
[6] | Nael Fasfous, Manoj-Rohit Vemparala, Alexander Frickenstein, Mohamed Badawy, Felix Hundhausen, Julian Höfer, Naveen-Shankar Nagaraja, Christian Unger, Hans-Jörg Vögel, Jürgen Becker, Tamim Asfour, and Walter Stechele. Binary-lorax: Low-power and runtime adaptable xnor classifier for semi-autonomous grasping with prosthetic hands. In International Conference on Robotics and Automation (ICRA), 2021. [ http ] |
[7] | Oliver Lenke, Richard Petri, Thomas Wild, and Andreas Herkersdorf. Peperoni: Pre-estimating the performance of near-memory integration. In MEMSYS'21: The International Symposium on Memory Systems, Virtual Conference, 2021. |
[8] | N. Doan, A. Srivatsa, N. Fasfous, S. Nagel, T. Wild, and A. Herkersdorf. On-chip democracy: A study on the use of voting systems for computer cache memory management. In 2020 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), December 2020. |
[9] | A. Srivatsa, S. Nagel, N. Fasfous, N. Doan, T. Wild, and A. Herkersdorf. Hyve: A hybrid voting-based eviction policy for caches. In 2020 IEEE Nordic Circuits and Systems Conference (NorCAS), October 2020. |
[10] | Sven Rheindt, Sebastian Maier, Nora Pohle, Lars Nolte, Oliver Lenke, Florian Schmaus, Thomas Wild, Wolfgang Schröder-Preikschat, and Andreas Herkersdorf. DySHARQ: Dynamic software-defined hardware-managed queues for tile-based architectures. International Journal of Parallel Programming, 2020. [ DOI ] |
[11] | Andreas Herkersdorf. Tackling the mpsoc data locality challenge with regional coherence and near memory acceleration. Keynote talk, 2019 IEEE Nordic Circuits and Systems Conference (NorCAS), October 29, 2019. |
[12] | Andreas Herkersdorf. As embedded systems became serious grown-ups, they decide on their own. Invited Talk at the Workshop on Embedded Systems, Dedicated to Peter Marwedel on the Occasion of his 70th Birthday, Dortmund, July 4, 2019. |
[13] | Nidhi Anantharajaiah, Fabian Kempf, Leonard Masing, Fabian Marc Lesniak, and Juergen Becker. Dynamic and scalable runtime block-based multicast routing for networks on chips. In Proceedings of the 12th International Workshop on Network on Chip Architectures, NoCArc, pages 10:1–10:6, New York, NY, USA, 2019. ACM. [ DOI ] |
[14] | Akshay Srivatsa, Sven Rheindt, Dirk Gabriel, Thomas Wild, and Andreas Herkersdorf. Cod: Coherence-on-demand – runtime adaptable working set coherence for dsm-based manycore architectures. In Dionisios N. Pnevmatikatos, Maxime Pelcat, and Matthias Jung, editors, Embedded Computer Systems: Architectures, Modeling, and Simulation, pages 18–33, Cham, 2019. Springer International Publishing. |
[15] | Sven Rheindt, Sebastian Maier, Florian Schmaus, Thomas Wild, Wolfgang Schröder-Preikschat, and Andreas Herkersdorf. SHARQ: Software-defined hardware-managed queues for tile-based manycore architectures. In Proceedings of the 19th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), pages 212–225. Springer, 2019. [ DOI ] |
[16] | Sven Rheindt, Andreas Fried, Oliver Lenke, Lars Nolte, Thomas Wild, and Andreas Herkersdorf. NEMESYS: Near-memory graph copy enhanced system-software. In MEMSYS 19: The International Symposium on Memory Systems, Washington DC, 2019. |
[17] | Leonard Masing, Akshay Srivatsa, Fabian Kreß, Nidhi Anantharajaiah, Andreas Herkersdorf, and Jürgen Becker. In-NoC circuits for low-latency cache coherence in distributed shared-memory architectures. In 2018 IEEE 12th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC). IEEE, September 2018. [ DOI ] |
[18] | Jürgen Teich. Methodologies for application mapping for noc-based mpsocs. Keynote, Adaptive Many-Core Architectures and Systems workshop, York, UK, June 14, 2018. |
[19] | Andreas Weichslgartner, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Invasive Computing for Mapping Parallel Programs to Many-Core Architectures. Springer, January 15, 2018. [ DOI ] |
[20] | Sven Rheindt, Andreas Schenk, Akshay Srivatsa, Thomas Wild, and Andreas Herkersdorf. CaCAO: Complex and Compositional Atomic Operations for NoC-based Manycore Platforms. In ARCS 2018 - 31st International Conference on Architecture of Computing Systems, Braunschweig, Germany, 2018. |
[21] | Tulika Mitra, Jürgen Teich, and Lothar Thiele. Guest Editors’ Introduction: Special Issue on Time-Critical Systems Design. IEEE Design and Test of Computers, 35:5–7, 2018. [ DOI ] |
[22] | A. Srivatsa, S. Rheindt, T. Wild, and A. Herkersdorf. Region based cache coherence for tiled mpsocs. In 2017 30th IEEE International System-on-Chip Conference (SOCC), September 2017. |
[23] | Stephanie Friederich. Automated Hardware Prototyping for 3D Networks on Chips. Dissertation, Institut für Technik der Informationsverarbeitung, Karlsruhe Institute of Technology (KIT), May 23, 2017. |
[24] | Lukas Meder. Timing Synchronization and Fast-Control for FPGA-based large-scale Readout and Processing Systems. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), April 1, 2017. |
[25] | Andreas Weichslgartner. Application Mapping Methodologies for Invasive NoC-Based Architectures. Dissertation, Hardware/Software Co-Design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany, January 24, 2017. |
[26] | Aurang Zaib, Jan Heisswolf, Andreas Weichslgartner, Thomas Wild, Jürgen Teich, Jürgen Becker, and Andreas Herkersdorf. Efficient task spawning for shared memory and message passing in many-core architectures. Journal of Systems Architecture (JSA), 2017. [ DOI ] |
[27] | Soonhoi Ha and Jürgen Teich, editors. The Handbook of Hardware/Software Codesign. Springer, 2017. [ DOI ] |
[28] | Jürgen Teich. Invasive computing – editorial. it – Information Technology, 58(6):263–265, November 24, 2016. [ DOI ] |
[29] | Vivek Singh Bhadouria, Alexandru Tanase, Moritz Schmid, Frank Hannig, Jürgen Teich, and Dibyendu Ghoshal. A novel image impulse noise removal algorithm optimized for hardware accelerators. Journal of Signal Processing Systems, 89(2):225–242, November 1, 2016. [ DOI ] |
[30] | Vahid Lari, Andreas Weichslgartner, Alex Tanase, Michael Witterauf, Faramarz Khosravi, Jürgen Teich, Jürgen Becker, Jan Heißwolf, and Stephanie Friederich. Providing fault tolerance through invasive computing. it – Information Technology, 58(6):309–328, October 19, 2016. [ DOI ] |
[31] | Gabor Drescher, Christoph Erhardt, Felix Freiling, Johannes Götzfried, Daniel Lohmann, Pieter Maene, Tilo Müller, Ingrid Verbauwhede, Andreas Weichslgartner, and Stefan Wildermann. Providing security on demand using invasive computing. it – Information Technology, 58(6):281–295, September 30, 2016. [ DOI ] |
[32] | Stefan Wildermann, Michael Bader, Lars Bauer, Marvin Damschen, Dirk Gabriel, Michael Gerndt, Michael Glaß, Jörg Henkel, Johny Paul, Alexander Pöppl, Sascha Roloff, Tobias Schwarzer, Gregor Snelting, Walter Stechele, Jürgen Teich, Andreas Weichslgartner, and Andreas Zwinkau. Invasive computing for timing-predictable stream processing on MPSoCs. it – Information Technology, 58(6):267–280, September 30, 2016. [ DOI ] |
[33] | Jürgen Teich, Michael Glaß, Sascha Roloff, Wolfgang Schröder-Preikschat, Gregor Snelting, Andreas Weichslgartner, and Stefan Wildermann. Language and compilation of parallel programs for *-predictable MPSoC execution using invasive computing. In Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), pages 313–320, Lyon, France, September 2016. [ DOI ] |
[34] | Stephanie Friederich, Marco Neber, and Jürgen Becker. Power management controller for online power saving in network-on-chips. In International Symposium on Embedded Multicore/Manycore SoCs (MCSoC), volume 10, September 2016. |
[35] | Jürgen Teich. Predictability, fault tolerance, and security on demand using invasive computing. Invited Talk, University of Lübeck, Germany, July 29, 2016. |
[36] | Jürgen Teich. Invasive Computing - The DFG Transregional Research Center 89. DTC 2016, The Munich Workshop on Design Technology Coupling, Munich, Germany, June 30, 2016. |
[37] | Jürgen Teich. Predictable MPSoC stream processing using invasive computing. Seminar Talk, Electrical and Computer Engineering, The University of Texas at Austin, USA, June 6, 2016. |
[38] | Andreas Weichslgartner, Stefan Wildermann, Johannes Götzfried, Felix Freiling, Michael Glaß, and Jürgen Teich. Design-time/run-time mapping of security-critical applications in heterogeneous mpsocs. In Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems (SCOPES), pages 153–162. ACM, May 23, 2016. [ DOI ] |
[39] | Andreas Weichslgartner and Jürgen Teich. Position paper: Towards redundant communication through hybrid application mapping. In Proceedings of the third International Workshop on Multi-Objective Many-Core Design (MOMAC) in conjunction with International Conference on Architecture of Computing Systems (ARCS). IEEE, April 4, 2016. |
[40] | Jan Heisswolf, Stephanie Friederich, Leonard Masing, Andreas Weichslgartner, Aurang M. Zaib, Carsten Stein, Marco Duden, Jürgen Teich, Thomas Wild, Andreas Herkersdorf, and Jürgen Becker. A novel noc-architecture for fault tolerance and power saving. In Proceedings of the third International Workshop on Multi-Objective Many-Core Design (MOMAC) in conjunction with International Conference on Architecture of Computing Systems (ARCS). IEEE, April 4, 2016. |
[41] | Jürgen Teich. Adaptive restriction and isolation for predictable MPSoC stream processing. Invited Talk, DATE 2016 Friday Workshop on Resource Awareness and Application Autotuning in Adaptive and Heterogeneous Computing, Dresden, Germany, March 18, 2016. |
[42] | Jürgen Teich. Symbolic loop parallelization for adaptive multi-core systems - recent advances and benefits. Keynote, IMPACT 2016, the 6th International Workshop on Polyhedral Compilation Techniques, Prague, Czech Republic, January 19, 2016. |
[43] | Jürgen Teich. The role of restriction and isolation for increasing the predictability of MPSoC stream processing. Keynote, 8th Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO 2016), Prague, Czech Republic, January 18, 2016. |
[44] | Stephanie Friederich, Niclas Lehmann, and Jürgen Becker. Adaptive bandwidth router for 3d network-on-chips. In Applied Reconfigurable Computing, pages 352–360, 2016. [ DOI ] |
[45] | Michael Dreschmann, Jan Heisswolf, Michael Geiger, Manuel Haußecker, and Jürgen Becker. A framework for multi-FPGA interconnection using multi gigabit transceivers. In Proceedings of the 28th Symposium on Integrated Circuits and Systems Design (SBCCI), pages 5:1–5:6. ACM, August 2015. [ DOI ] |
[46] | Moritz Schmid. Rapid Prototyping for Hardware Accelerators in the Medical Imaging Domain. Dissertation, Hardware/Software Co-Design, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany, July 24, 2015. |
[47] | Jürgen Teich. Adaptive isolation for predictable mpsoc stream processing. Keynote, SCOPES 2015, 18th International Workshop on Software and Compilers for Embedded Systems, Schloss Rheinfels, St. Goar, Germany, June 2, 2015. |
[48] | Jan Heisswolf, Andreas Weichslgartner, Aurang Zaib, Stephanie Friederich, Leonard Masing, Carsten Stein, Marco Duden, Roman Klöpfer, Thomas Wild, Andreas Herkersdorf, Jürgen Teich, and Jürgen Becker. Fault-tolerant communication in invasive networks on chip. In Proceedings of the 2015 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), pages 1–8. IEEE, June 2015. |
[49] | Stefan Wildermann, Andreas Weichslgartner, and Jürgen Teich. Design methodology and run-time management for predictable many-core systems. In Proceedings of the 6th IEEE Workshop on Self-Organizing Real-Time Systems (SORT), pages 1–8, April 13, 2015. |
[50] | Preethi Parayil, Aurang Zaib, Thomas Wild, Stefan Wallentowitz, and Andreas Herkersdorf. Sharer status-based caching in tiled multiprocessor systems-on-chip. In HPC 2015 – 23rd High Performance Computing Symposia, pages 67–74. SCS, The Society for Modeling & Simulation, April 2015. |
[51] | Jürgen Teich. Invasive computing. Invited Talk, SE 2015, Software Engineering and Management, Special Session Software Engineering in der DFG, Dresden, Germany, March 19, 2015. |
[52] | Andreas Weichslgartner, Jan Heisswolf, Aurang Zaib, Thomas Wild, Andreas Herkersdorf, Jürgen Becker, and Jürgen Teich. Position paper: Towards hardware-assisted decentralized mapping of applications for heterogeneous noc architectures. In Proceedings of the second International Workshop on Multi-Objective Many-Core Design (MOMAC) in conjunction with International Conference on Architecture of Computing Systems (ARCS). IEEE, March 2015. |
[53] | Aurang Zaib, Jan Heisswolf, Andreas Weichslgartner, Thomas Wild, Jürgen Teich, Jürgen Becker, and Andreas Herkersdorf. Network interface with task spawning support for noc-based dsm architectures. In 28th GI/ITG International Conference on Architecture of Computing Systems (ARCS), volume 9017 of Lecture Notes in Computer Science (LNCS), pages 186–198. Springer, 2015. [ DOI ] |
[54] | Christoph Roth. Parallele und kooperative Simulation für eingebettete Multiprozessorsysteme. Dissertation, Institut für Technik der Informationsverarbeitung, Karlsruhe Institute of Technology (KIT), December 2014. [ http ] |
[55] | Jürgen Teich. Reconfigurable computing for mpsoc. Invited Lecture, Winter School Design and Applications of Multi Processor System on Chip, Tunis, Tunisia, November 26, 2014. |
[56] | Jan Heisswolf. A Scalable and Adaptive Network on Chip for Many-Core Architectures. Dissertation, Institut für Technik der Informationsverarbeitung, Karlsruhe Institute of Technology (KIT), November 11, 2014. [ http ] |
[57] | Andreas Weichslgartner, Deepak Gangadharan, Stefan Wildermann, Michael Glaß, and Jürgen Teich. Daarm: Design-time application analysis and run-time mapping for predictable execution in many-core systems. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS 2014), pages 1–10, October 2014. [ DOI ] |
[58] | Jürgen Teich. Invasive computing – concepts and benefits. Keynote, DASIP 2014, Conference on Design and Architectures for Signal and Image Processing, Madrid, Spain, October 8, 2014. |
[59] | Stephanie Friederich, Jan Heisswolf, and Jürgen Becker. Hardware/software debugging of large scale many-core architectures. In Proceedings of the 27th Symposium on Integrated Circuits and Systems Design (SBCCI), pages 1–7. IEEE, September 2014. [ DOI ] |
[60] | Jürgen Teich. Foundations and benefits of invasive computing. Seminar, McGill University, Montreal, July 29, 2014. |
[61] | Jürgen Teich. Introduction to invasive computing. Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014), Paderborn, Germany, Tutorial Talk, May 29, 2014. |
[62] | Jürgen Teich. Foundations and benefits of invasive computing. University of Bologna, Italy, Invited Talk in the Seminar Series Trends in Electronics, May 23, 2014. |
[63] | Jan Heisswolf, Aurang Zaib, Andreas Weichslgartner, Martin Karle, Maximilian Singh, Thomas Wild, Jürgen Teich, Andreas Herkersdorf, and Jürgen Becker. The invasive network on chip - a multi-objective many-core communication infrastructure. In Proceedings of the first International Workshop on Multi-Objective Many-Core Design (MOMAC) in conjunction with International Conference on Architecture of Computing Systems (ARCS). IEEE, February 25, 2014. |
[64] | Jan Heisswolf, Aurang Zaib, Andreas Zwinkau, Sebastian Kobbe, Andreas Weichslgartner, Jürgen Teich, Jörg Henkel, Gregor Snelting, Andreas Herkersdorf, and Jürgen Becker. CAP: Communication aware programming. In 51st ACM/EDAC/IEEE Design Automation Conference (DAC), pages 105:1–105:6, 2014. |
[65] | Stephanie Friederich, Jan Heisswolf, David May, and Jürgen Becker. Hardware prototyping and software debugging of multi-core architectures. In Proceedings of the Synopsys Users Group Conference (SNUG), 2014. |
[66] | Aurang Zaib, Jan Heisswolf, Andreas Weichslgartner, Thomas Wild, Jürgen Teich, Jürgen Becker, and Andreas Herkersdorf. Auto-GS: Self-optimization of NoC traffic through hardware managed virtual connections. In Proceedings of the 16th Euromicro Conference on Digital System Design (DSD), pages 761–768. IEEE, September 2013. [ DOI ] |
[67] | J. Heisswolf, S. Bischof, M. Rueckauer, and Jürgen Becker. Efficient memory access in 2D mesh NoC architectures using high bandwidth routers. In Proceedings of the 26th Symposium on Integrated Circuits and Systems Design (SBCCI), pages 1–6, September 2013. [ DOI ] |
[68] | Jan Heisswolf, Aurang Zaib, Andreas Weichslgartner, Ralf König, Thomas Wild, Jürgen Teich, Andreas Herkersdorf, and Jürgen Becker. Virtual networks – distributed communication resource management. ACM Trans. Reconfigurable Technol. Syst., 6(2):8:1–8:14, August 2013. [ DOI ] |
[69] | Sascha Roloff, Andreas Weichslgartner, Jan Heisswolf, Frank Hannig, and Jürgen Teich. NoC simulation in heterogeneous architectures for PGAS programming model. In Proceedings of the 16th International Workshop on Software and Compilers for Embedded Systems (M-SCOPES), pages 77–85. ACM, June 2013. [ DOI ] |
[70] | Jan Heisswolf, Andreas Weichslgartner, Aurang Zaib, Ralf König, T. Wild, A. Herkersdorf, Jürgen Teich, and Jürgen Becker. Hardware supported adaptive data collection for networks on chip. In Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), pages 153–162, May 2013. [ DOI ] |
[71] | C. Pham, J. Heisswolf, S. Wenner, Z. Al-Ars, J.A. Becker, and K.L.M. Bertels. Hybrid interconnect design for heterogeneous hardware accelerators. In Proceedings of Design, Automation and Test in Europe Conference (DATE), pages 843–846, March 2013. [ DOI ] |
[72] | Jürgen Teich. Safe(r) loop computations on multi-cores. Invited Talk, 2nd Workshop on Design Tools and Architectures for Multi-Core Embedded Computing Platforms (DITAM 2013), Berlin, Germany, January 22, 2013. |
[73] | Jan Heisswolf, Maximilian Singh, Martin Kupper, Ralf König, and Jürgen Becker. Rerouting: Scalable NoC self-optimization by distributed hardware-based connection reallocation. In Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig), 2013. |
[74] | Jan Heisswolf, Ralf König, M. Kupper, and Jürgen Becker. Multiple hard latency and throughput guarantees for packet switching networks on chip. Computers & Electrical Engineering, 2013. [ DOI ] |
[75] | Christian Schuck. Design and Synthesis of Organic Computing Hardware Architectures. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), July 10, 2012. |
[76] | Jan Heisswolf, Ralf König, and Jürgen Becker. A scalable NoC router design providing QoS support using weighted round robin scheduling. In Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on, pages 625–632, July 2012. [ DOI ] |
[77] | Matthias Kühnle. IP-based Reconfigurable System-on-Chip Design and Synthesis. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), June 6, 2012. |
[78] | Jan Heisswolf, Aurang Zaib, Andreas Weichslgartner, Ralf König, Thomas Wild, Jürgen Teich, Andreas Herkersdorf, and Jürgen Becker. Hardware-assisted decentralized resource management for networks on chip with QoS. In Proceedings of the 2012 IEEE 26th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), pages 234–241, Shanghai, China, May 2012. [ DOI ] |
[79] | Jörg Henkel, Andreas Herkersdorf, Lars Bauer, Thomas Wild, Michael Hübner, Ravi Kumar Pujari, Artjom Grudnitsky, Jan Heisswolf, Aurang Zaib, Benjamin Vogel, Vahid Lari, and Sebastian Kobbe. Invasive manycore architectures. In Proceedings of the 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 193–200, January 2012. [ DOI ] |
[80] | Jürgen Becker, Stephanie Friederich, Jan Heisswolf, Ralf König, and David May. Hardware prototyping of novel invasive multicore architectures. In Proceedings of the 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 201–206, January 2012. [ DOI ] |
[81] | Alexander Klimm. Computing Architectures for Security Applications on Reconfigurable Hardware in Embedded Systems. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), December 22, 2011. |
[82] | Andreas Weichslgartner, Stefan Wildermann, and Jürgen Teich. Dynamic decentralized mapping of tree-structured applications on NoC architectures. In Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip (NOCS), pages 201–208, May 2011. [ DOI ] |
[83] | Jürgen Teich, Jörg Henkel, Andreas Herkersdorf, Doris Schmitt-Landsiedel, Wolfgang Schröder-Preikschat, and Gregor Snelting. Invasive computing: An overview. In Michael Hübner and Jürgen Becker, editors, Multiprocessor System-on-Chip – Hardware Design and Tool Integration, pages 241–268. Springer, Berlin, Heidelberg, 2011. [ DOI ] |
[84] | Jürgen Teich. Invasive algorithms and architectures. it - Information Technology, 50(5):300–310, 2008. |