B4: Hardware Monitoring System and Design Optimisation for Invasive Architectures
Principal Investigators:
Prof. Schmitt-Landsiedel, Prof. Schlichtmann
Scientific Researchers:
Qingqing Chen, Elisabeth Glocker, Shushanik Karapetyan, Daniel Müller-Gritschneder, Christoph Werner, Martin Wirnshofer
Abstract
Subproject B4 is dedicated to the assessment of operating conditions
of the invasive computing hardware, the communication of this information and
optimisation of the required monitoring resources. To measure these
parameters, monitor circuits are designed and their number and placement in
the invasive architecture are optimised. A parameterised model for this status
information, including power consumption, temperature and maximum possible
performance of computing blocks as well as their ageing-related degradation
status, is provided for system simulation, optimisation and emulation on the
FPGA demonstrator.
In the first funding phase, Project B4 has developed concepts for monitoring invasive computing systems (both RISC and TCPA tiles). Specifically,
concepts for monitoring power, temperature and ageing have been investigated. Communication interfaces between the monitors and higher
levels of invasive computing systems have been explored. A control loop concept has been developed. For the essential monitoring concepts,
a method has been developed to emulate them on an FPGA. The major challenge for FPGA emulation was that most monitors contain analogue circuits.
With the achieved FPGA emulation, our concepts can be evaluated in the context of an entire invasive computing system even without an ASIC hardware implementation.
Synopsis
Integrated circuits today and even more in the future are subject
to significant variations—between different manufactured components (resulting
from fluctuations in the manufacturing process) as well as over space (e.g.,
"hot spots" due to heavy local switching activity) and time (short-term resulting
from fluctuations in the operating conditions such as supply voltage
and temperature; long-term resulting from degradation effects due to ageing).
Therefore, different processing elements even on the same invasive IC
can exhibit significantly different processing capabilities and susceptibility to
degradation resulting from processing loads. This also results in differing risk
of IC failures.
Resource-aware programming as one of the most essential points of innovation
of invasive computing shall enable an application to make its decisions for
execution based on actual physical hardware properties. In order to allow
invasive algorithms to exploit the state of the invasive hardware for optimal
distribution of the load, this subproject will provide means to measure and
communicate the specific status of a processing element. This requires new
ways of hardware design optimisation specifically in view of the new capabilities
of invasion, including the design of dedicated monitor circuits. This
project considers optimisation strategies and design of corresponding circuits
and interfaces including the demonstration by simulation, emulation on the
FPGA hardware prototype platform and later on by implementation on ASIC
hardware prototypes.
This comprises classification of potential monitor types and interfacing
systems, circuit design and analysis. It also includes algorithmic analysis and
optimisation with respect to the complete invasive system, to calibrate monitors
and to optimise their number, their performance regarding accuracy and
speed, and their placement. Interfacing and information propagation have to
be optimised to ensure best possible utilisation of each processor block based
on its individual capabilities. This will also potentially reduce manufacturing
costs of invasive architecture ICs, as processor blocks can be utilised according
to their individual capabilities, rather than having to discard processors that
do not meet predefined performance requirements.
The activities in the first funding phase will also lay the groundwork to
enable optimised invasive architecture implementations in ASICs in the second
and third funding phases, by optimising power consumption and reducing
susceptibility to manufacturing variations and age-dependent degradation.
Research goal in first funding phase
With the resource awareness introduced by invasive computing, applications can explore the system and make execution decisions (e.g. the number and selection of invaded cores) based on the current state of the hardware platform, including its physical hardware properties. Realising an invasive multi-tile architecture requires a closed-loop control system between applications, the run-time support system (OctoPOS), the agent system and the underlying hardware, including the monitoring system. The monitoring system supplies the monitoring data needed to control the physical hardware conditions and to use knowledge about hardware health during resource allocation in the invade phases. This becomes even more important (especially with thousands or more processors integrated on a single chip) given that integrated circuits in modern processes differ significantly more in processing capability and susceptibility to degradation than circuits in older, more robust processes. The research goal of Project B4 for Phase I has therefore been to measure the specific status of hardware elements, preprocess these data and implement the overall monitoring system. To monitor the invasive hardware effectively, different parameters have to be monitored: temperature evolution, power consumption, and maximum and age-dependent performance capability. A foundation for this was developed in Phase I. The monitoring of the latter two parameters will be implemented in Phase II. This information is communicated at different levels of detail to other system components: higher hardware layers, the run-time support system (OctoPOS), the agent system and applications. The system can then act on the monitor information, e.g. to choose appropriate processing elements during the invade phase or to react if a critical status is detected.
In turn, these actions may influence the status of hardware elements and with that the measured monitoring data. Project B4 has considered optimisation strategies and the design of corresponding interfaces, including the demonstration by simulation and emulation on the FPGA hardware prototyping platform.
Methods
Hardware-monitoring concept and models:
In-situ delay monitoring: In [Wirnshofer, ISIC 2011] and [Wirnshofer, DDECS 2011], we have demonstrated the monitoring
of the maximum possible performance in terms of speed and frequency by in-situ delay monitoring.
Before Phase I, published performance/speed monitors were mostly critical path replicas [A. Drake et al., "A distributed critical-path timing monitor for a 65nm high-performance microprocessor", ISSCC 2007].
In [Wirnshofer, DDECS 2012] we have demonstrated in-situ delay monitors for adaptive voltage
scaling (AVS) and have evaluated the performance improvement and power saving potential. In-situ delay
monitors are enhanced flip-flops that observe the timing of the circuit. Critical, but not yet erroneous signal
transitions are detected as pre-errors. The pre-error rate is used as an indicator of whether the remaining timing
slack of the circuit is sufficient. By use of these in-situ delay monitors, all kinds of variation and ageing
effects are detected inside the real circuit and thus reliable performance information is provided. When
using this monitor type in an online AVS technique, the supply voltage can be regulated during normal
circuit operation—without a need for test intervals. In [Aryan, ARS 2012], different designs to implement in-situ
delay monitors have been presented and the reliability of the timing information as well as the power
overhead have been carefully analysed.
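The regulation idea can be illustrated with a minimal sketch. It assumes a simple two-threshold controller: the supply voltage is raised when the observed pre-error rate is too high and lowered when it is very low. The function name, thresholds, step size and voltage limits are hypothetical and are not taken from the cited publications.

```python
def adjust_supply_mv(v_mv, pre_errors, cycles,
                     low=0.001, high=0.01,
                     step_mv=10, v_min_mv=700, v_max_mv=1100):
    """Return the next supply voltage in millivolts (hypothetical controller)."""
    rate = pre_errors / cycles  # pre-error rate in the last observation window
    if rate > high:
        # Too many critical transitions: timing slack is shrinking, raise the voltage.
        return min(v_mv + step_mv, v_max_mv)
    if rate < low:
        # Hardly any pre-errors: slack is ample, lower the voltage to save power.
        return max(v_mv - step_mv, v_min_mv)
    return v_mv  # rate within the target band: keep the voltage


print(adjust_supply_mv(900, 50, 1000))  # rate 0.05 is above the upper threshold
print(adjust_supply_mv(900, 0, 1000))   # no pre-errors at all
```

Because pre-errors are flagged before real timing errors occur, such a loop can run during normal operation.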
Ageing monitoring: Before Phase I, monitors that determine more advanced system features (e.g. ageing
status) were just attracting initial research efforts in the research community.
In [Lorenz, Tech. Report, 2011], we have demonstrated an innovative approach to periodically monitor the ageing of ICs
during operation. The basic concept is to identify all paths that potentially might become critical during the
lifetime of an IC. As different paths can age at different rates, the critical path can change during the life of
an IC. Ageing depends on operating and environmental conditions and therefore cannot be determined
exactly before an IC is actually being used. But it is possible to identify a range within which the delay of a
path will always be, regardless of where specifically it resides within the manufacturing window and what
operating conditions (temperature, supply voltage, or switching activity) it will experience. It turns out
that if this window is considered, for many circuits the number of paths that can potentially become critical
is reduced significantly, often by one or two orders of magnitude. Therefore, it appears to be an option to
test these paths periodically during the operation of an IC to detect any ageing that might endanger correct
computation. This approach can be considered as a supplement or an alternative to the methods discussed
above.
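A minimal sketch of the pruning idea, assuming each path is described by a lower and an upper delay bound over its lifetime. The rule used here (a path can be dropped if its upper bound lies below the largest lower bound of any path) is an illustrative assumption and not necessarily the procedure of [Lorenz, Tech. Report, 2011]. Path names and delay values are hypothetical.

```python
def potentially_critical(paths):
    """paths maps a path name to (min_delay, max_delay) over the IC lifetime."""
    # Some path is always at least this slow, whatever the conditions are.
    threshold = max(lower for lower, _ in paths.values())
    # A path whose upper bound stays below the threshold can never be the slowest.
    return sorted(name for name, (_, upper) in paths.items() if upper >= threshold)


paths_ns = {
    "p1": (1.0, 1.4),
    "p2": (0.5, 0.9),
    "p3": (0.8, 1.1),
    "p4": (0.2, 0.6),
}
print(potentially_critical(paths_ns))
```

Only the remaining paths would then need to be tested periodically during operation.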
The research presented in [Knoth, PATMOS 2011], [Knoth, DATE 2012], [Chen, PATMOS 2011], [Li, TCAD 2013], [Li, TCAD 2012] and [Chen, IET CDS 2012] addresses
related topics. This will become especially useful for a future ASIC design of the invasive multi-tile
architectures.
In [Knoth, PATMOS 2011], SWAT, a highly optimised statistical timing analyser for digital circuits, has been presented
that combines the accuracy of a transistor-level analysis with the performance of a gate-level analysis.
SWAT is based upon a CSM (current source model) for logic cells which considers transistor ageing and
process variation and employs waveform truncation and dedicated solvers to significantly improve analysis
performance without noticeable loss of accuracy. Parameter variations and ageing can be handled by
Monte Carlo simulations and by a special sensitivity propagation mode, which expresses arrival times as a
function of local and global parameter variations. This will allow very fast, yet accurate analysis of an
ASIC design, considering variations and ageing to ensure a very robust InvasIC ASIC design. In [Knoth, DATE 2012], the
emphasis is put on power analysis instead of timing analysis.
In [Chen, PATMOS 2011], a flip-flop timing model has been presented that allows interdependency of different
computation stages to be analysed via a static timing analysis at gate level. This is done by breaking the
timing boundaries by explicitly building the functional relationship between clock-to-q delay and timing
parameters at the flip-flop data input. The ageing effects HCI (hot carrier injection) and NBTI (negative bias
temperature instability) are also considered in the modelling to pave the way for precise and realistic
ageing analysis. Application of this approach in system emulation and later on also ASIC design will
improve design performance even further.
[Li, TCAD 2013] has investigated the challenges in hierarchical timing analysis considering process variations.
With abstract statistical timing models containing interfacing constraints, this flow can reduce the
complexity of design and verification of large SoC systems effectively. For each of the three basic circuit
types (combinational, flip-flop-based and latch-controlled) methods to extract statistical timing models are
proposed to prune the unnecessary timing information from the underlying modules. With additional
methods for the reconstruction of correlation between modules and for system-level verification, the
complete framework is several times faster than analysing the flattened circuit directly, therefore providing
an efficient flow for statistical timing verification of invasive multi-tile architectures.
[Li, TCAD 2012] has evaluated the statistical timing performance of circuits with level-sensitive latches, which
are widely used in high-performance designs, such as CPUs. Circuits of this type, however, impose more
complexity in timing analysis due to latch transparency. With reduced iterations and graph transformations,
the proposed method extracts setup-time constraints at latches and across sequential loops very efficiently,
more than ten times faster than other state-of-the-art methods, while still maintaining a good accuracy in
the computed minimum clock period in a parametric form. The proposed method contributes a fast tool for
statistical timing evaluation in the optimisation iterations of invasive computing systems, in which the
aforementioned latch circuits always serve as the source of flexibility and robustness.
[Chen, IET CDS 2012] has introduced a modelling framework for the timing behaviour of a flip-flop by building a
nonlinear functional relationship between the clock-to-q delay and the data/clock alignment. The proposed
framework makes it possible to carry out static timing analyses at gate level taking into consideration the
interdependency of different computation stages. An iterative timing analysis method is developed to find
out whether a circuit can work at a given clock frequency and to determine the minimal acceptable clock
period of the circuit. The new method will be able to further improve the performance and the yield of
the ASIC design for invasive multi-tile architectures, especially when process variations and ageing are
considered.
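The interplay between clock-to-q delay and data/clock alignment can be illustrated with a toy model. The sketch below is not the method of [Chen, IET CDS 2012]: it assumes a linear pipeline, a hypothetical clock-to-q model that grows as the data arrives closer to the clock edge, and a bisection search for the smallest feasible clock period. All names and constants are assumptions.

```python
def pipeline_works(period, stage_delays, c2q_nominal=0.1, alpha=0.01):
    """Check whether every flip-flop of a linear pipeline captures its data in time."""
    c2q = c2q_nominal  # the first flip-flop is assumed to see early, stable data
    for delay in stage_delays:
        slack = period - (c2q + delay)  # time left before the capturing clock edge
        if slack <= 0:
            return False
        # Assumed model: less slack at the data input means a slower clock-to-q,
        # which in turn eats into the slack of the next stage.
        c2q = c2q_nominal + alpha / slack
    return True


def minimal_clock_period(stage_delays, tolerance=1e-6):
    """Bisection between an infeasible and a feasible clock period."""
    low, high = 0.0, 1.0
    while not pipeline_works(high, stage_delays):
        high *= 2.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if pipeline_works(mid, stage_delays):
            high = mid
        else:
            low = mid
    return high


delays_ns = [1.0, 1.2, 0.9]
period_ns = minimal_clock_period(delays_ns)
print(period_ns)
```

The bisection relies on the toy model being monotonic: a longer period never reduces the slack at any stage.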
Implementation and emulation for FPGA demonstrator platform:
Since it has been decided that there will
be no full InvasIC ASIC implementation, our focus for the first funding phase has changed somewhat:
Instead of preparing the hardware demonstrator (ASIC implementation) as originally proposed, we have
worked on the modelling, implementation and emulation of the monitors on the FPGA demonstrator
platform, to enable an FPGA emulation of an invasive multi-tile architecture in close cooperation with the
whole CRC, especially together with Project B2 and Project B3.
Before the start of Phase I, monitoring of parameters such as power consumption, temperature,
performance (in terms of speed or maximum operating frequency) was already state-of-the-art in modern
high performance microprocessors, [Duarte et al., "Temperature sensor design in a high volume manufacturing 65nm CMOS digital process", CICC 2007] and
[Tschanz et al. "Adaptive frequency and biasing techniques for tolerance to dynamic temperature-voltage variations and aging", ISSCC 2007].
These monitor data were used, e.g. for
power limitation by frequency or supply voltage control or for complete shutdown of processing elements
to prevent damage [Rotem et al. "Temperature measurement in the Intel Core™ Duo processor", 2006].
But they were not used to control and optimise the complete system. So, no
sophisticated interaction between the physical parameter level and run-time support system or application
layer was present in existing processors.
In the invasive computing architecture, hardware monitors, which are a necessary part of the resource
management feedback control loop, have been included. Consequently those hardware monitors must also
be included in the prototyping system. Hardware monitors such as processor core load, communication
link load (e.g. AHB bus load and iNoC load) and memory access (e.g. cache miss rate) monitors that
are fully digital circuits are easy to implement using the digital logic resources of an FPGA. However,
other hardware monitors that are usually realised as analogue circuits are difficult to implement in the
prototyping system, since our FPGA demonstrator platform, the Synopsys CHIPit system, is based on digital
FPGA technology without any reconfigurable analogue circuit resources. Therefore, for FPGA prototyping,
we have taken a real-time emulation approach for such analogue monitors including power monitors,
temperature monitors and subsequently will take this approach also for ageing monitors in Phase II.
The figure below shows the structure of the implemented circuit of our real-time emulation approach for
power and temperature monitoring.
Power monitoring and emulation on FPGA: Power monitors for processor cores of the RISC compute
tiles, i.e. the LEON3 cores, have been emulated using a run-time instruction-energy look-up approach:
An instruction-energy look-up table (LUT)—containing pre-characterised average energy consumption
values for each kind of processor instruction—is looked up when a new incoming instruction is executed by
a processor core. For a predefined time period (in accordance with monitoring frequency), the energy
values per instruction are accumulated, and the accumulated value is divided by time at the end of the
period to produce the power value for that period. Power monitor emulations for tightly-coupled processor
arrays (TCPAs) have taken a different approach than those for LEON3 cores, since TCPA processing
elements (PEs) are based on a VLIW architecture supporting instruction level parallelism (ILP). Therefore,
a simple energy LUT construction and fast instruction type determination at run time are not feasible,
and thus an event-counter-based energy model has been applied: Pre-characterised energy consumption
values for subprocessor modules (e.g. ALU, register file and instruction decoder) are summed up and
accumulated based on event counter status. As for LEON3 cores, the accumulated values are divided
by time to produce power values for a predefined time period. The emulated real-time power consumption
information as well as the accumulated energy data can be communicated to higher system levels
for observation purposes and for the evaluation of power and energy management strategies. The power
values are also used for the emulation of temperature monitors.
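Both emulation schemes amount to accumulating pre-characterised energies over a monitoring period and dividing by the period length. A minimal sketch follows. The energy values, instruction names and module names are hypothetical placeholders, not characterisation data from the project.

```python
# Hypothetical pre-characterised average energy per instruction type (picojoules).
INSTRUCTION_ENERGY_PJ = {"add": 100, "load": 250, "store": 230, "branch": 120}

# Hypothetical pre-characterised energy per event of a TCPA subprocessor module (picojoules).
MODULE_ENERGY_PJ = {"alu": 80, "register_file": 40, "instruction_decoder": 30}


def lut_power_mw(instructions, period_ns):
    """LEON3-style emulation: look up each executed instruction, accumulate, divide by time."""
    energy_pj = sum(INSTRUCTION_ENERGY_PJ[name] for name in instructions)
    return energy_pj / period_ns  # pJ / ns equals mW


def event_counter_power_mw(event_counts, period_ns):
    """TCPA-style emulation: weight event counters with per-module energies."""
    energy_pj = sum(MODULE_ENERGY_PJ[module] * count
                    for module, count in event_counts.items())
    return energy_pj / period_ns


print(lut_power_mw(["add", "load", "add", "store"], period_ns=10))
print(event_counter_power_mw(
    {"alu": 5, "register_file": 10, "instruction_decoder": 5}, period_ns=10))
```

In hardware, the table lookup and accumulation would map to a small memory and an adder per monitored core. The Python version only mirrors the arithmetic.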
Temperature monitoring and emulation on FPGA: For the real-time emulation of the temperature monitor
for RISC tiles, an approach that is based on the use of a power-temperature look-up table has been used.
The LUT contains the resulting steady-state temperature for all possible power consumption values (for a
predefined time period) received from the power monitor. Those temperature values are pre-characterised
based on a thermal RC model: In this approach, the input power leads to a temperature difference because
of thermal resistances (modelling steady-state behaviour) and thermal capacitances (modelling transient
behaviour) that both describe the processor architecture environment. The temperature values for the LUT
are obtained for every core (as the maximum steady-state temperature of the cores) taking all possible
average power values and all possible placements for "active" cores into account and considering not only
the core's own activity leading to a specific temperature value, but also the influence of neighbour core
activities on this temperature. These results are mapped to LUT entries and are used to obtain the resulting
steady-state temperature for every core for the predefined time period (in accordance with the monitoring
frequency). For TCPA tiles, the same approach has been applied. But the different architecture has made it
necessary to use another thermal RC model and different power consumption values.
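The pre-characterisation step can be sketched as follows, assuming a purely resistive steady-state model for three cores in a row: each core heats itself through a self-resistance and its direct neighbours through a coupling resistance. The resistances, ambient temperature and power levels are hypothetical, and the thermal capacitances that govern transient behaviour are left out.

```python
from itertools import product

T_AMBIENT_C = 45.0             # hypothetical ambient temperature
R_SELF_K_PER_W = 2.0           # hypothetical self-heating resistance
R_NEIGHBOUR_K_PER_W = 0.5      # hypothetical coupling to a direct neighbour
POWER_LEVELS_W = (0.0, 1.0, 2.0)  # quantised average power values
NUM_CORES = 3                  # cores arranged in a row


def steady_state_temps(powers_w):
    """Steady-state temperature of every core for the given per-core powers."""
    temps = []
    for i, power in enumerate(powers_w):
        temp = T_AMBIENT_C + R_SELF_K_PER_W * power
        for j in (i - 1, i + 1):
            if 0 <= j < len(powers_w):
                temp += R_NEIGHBOUR_K_PER_W * powers_w[j]
        temps.append(temp)
    return temps


def build_lut():
    """Pre-characterise the maximum temperature for all power-level combinations."""
    lut = {}
    for levels in product(range(len(POWER_LEVELS_W)), repeat=NUM_CORES):
        powers = [POWER_LEVELS_W[level] for level in levels]
        lut[levels] = max(steady_state_temps(powers))
    return lut


LUT = build_lut()


def quantise(power_w):
    """Index of the nearest pre-characterised power level."""
    return min(range(len(POWER_LEVELS_W)),
               key=lambda level: abs(POWER_LEVELS_W[level] - power_w))


def emulated_temperature_c(core_powers_w):
    """Run-time step: a single table lookup per monitoring period."""
    return LUT[tuple(quantise(power) for power in core_powers_w)]


print(emulated_temperature_c([2.0, 0.0, 2.0]))
print(emulated_temperature_c([2.0, 2.0, 0.0]))
```

All model evaluation happens offline when the table is built, so the run-time cost is one lookup per monitoring period.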
To the best of our knowledge, our approach has been the first one that deals with real-time FPGA emulation
of such a power and temperature monitoring system.
We presented our approach for temperature and power monitoring and emulation on FPGA for RISC tiles in [Glocker, RACING 2014] and [Glocker, Workshop Analogschaltungen 2014].
In-situ delay monitors on FPGA: In-situ delay monitors are novel hardware monitors which can be used to
monitor and predict the reliability of the monitored circuit. In principle, they can be implemented in
an FPGA. However, since ageing phenomena in an FPGA do not take effect within reasonable test time under
normal operating conditions (i.e. temperature and supply voltage), accelerated ageing would have to
be applied to the prototyping system, which is by no means an easy task for a CHIPit system. A simple
solution will again be "emulation". For this purpose, we have developed, in another project and in
cooperation with industry, proprietary models for ageing as a function of time and operating conditions. We intend to employ
these in the second funding phase and thus have realistic data for an integrated circuit solution available.
Integration and optimisation of individual monitor types on FPGA demonstrator platform:
Integration and optimisation of power and temperature emulation in RISC tile:
Digital hardware monitors such as processor
core load monitors and emulated analogue monitors including power and temperature monitors have been
integrated with CiC for RISC tiles in cooperation with Project B3. The number and placement of the
power and temperature monitors within the overall system have been optimised such that every core of a
tile has one power monitor and each tile has one temperature monitor covering all cores of a tile (giving a
maximum temperature for the complete core). The time period at which the monitors operate is predefined
according to the monitoring frequency.
Integration and optimisation of power and temperature emulation in TCPA tile:
For power and temperature monitoring, the monitoring system will not cover every PE
present on a tile, but rather cover PE regions to keep the size of the monitoring system as small as possible
and to still retrieve useful and sufficiently precise results. In [Glocker, ARS 2014] we presented the approach for temperature monitor modelling and emulation for TCPAs.
Optimisation of overall monitoring system on FPGA demonstrator platform: Feedback control loop of the monitoring system:
Before Phase I, the optimisation of a monitoring system in terms of circuit
types, required resolution, speed and monitoring frequency as well as the number and placement of monitors had not
been the subject of systematic research efforts. Also, we were not aware of techniques that allow the
calibration of a generic monitor to a specific design and use case.
Use of monitoring data for resource allocation: We have studied possible improvements that can be made if
monitoring data are used during resource allocation to achieve different control targets. Taking temperature
monitoring data for example, different task allocation techniques and application characteristics as well as
different physical conditions such as package types, material parameters and cooling all result in different
temperature scenarios. Also, reasonably priced processor packaging does not cover the worst case temperature
hot spot scenario anymore, which would occur without an intelligent power and temperature monitoring
and control as proposed here. So, hot spot temperatures must be avoided, e.g. by intelligent task allocation.
In [Glocker, ARS 2013], we have modelled different scenarios in a multicore system and evaluated the temperature
distribution of cores. In a multicore system, a reciprocal influence between the core temperatures of
neighbouring cores occurs, so an intelligent active core placement in a non-full-usage scenario can further
decrease the current temperature. We also evaluated different temperature limiting measures: The best
choice is either an intelligent core choice—resulting from intelligent resource allocation—combined with
lower usage-rates or lowering of the power consumption, e.g. by implementing supply voltage or frequency
scaling. Since temperature should be regulated during run time, a combined implementation of different
concepts and choosing a temperature limiting measure for the individual situation during run time appears
to be the best solution.
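The effect of active core placement can be illustrated with a small search, assuming four cores in a row and a resistive steady-state model in which an active core heats itself and its direct neighbours. The numbers are hypothetical and are not results from [Glocker, ARS 2013].

```python
from itertools import combinations

T_AMBIENT_C = 45.0          # hypothetical ambient temperature
R_SELF_K_PER_W = 2.0        # hypothetical self-heating resistance
R_NEIGHBOUR_K_PER_W = 0.5   # hypothetical coupling to a direct neighbour
ACTIVE_POWER_W = 2.0        # hypothetical power of an active core
NUM_CORES = 4               # cores arranged in a row


def peak_temperature_c(active):
    """Highest steady-state core temperature for a set of active core indices."""
    peak = T_AMBIENT_C
    for core in range(NUM_CORES):
        temp = T_AMBIENT_C
        if core in active:
            temp += R_SELF_K_PER_W * ACTIVE_POWER_W
        for neighbour in (core - 1, core + 1):
            if neighbour in active:
                temp += R_NEIGHBOUR_K_PER_W * ACTIVE_POWER_W
        peak = max(peak, temp)
    return peak


def coolest_placement(num_active):
    """Exhaustively pick the placement with the lowest peak temperature."""
    return min(combinations(range(NUM_CORES), num_active), key=peak_temperature_c)


best = coolest_placement(2)
print(best, peak_temperature_c(best))
print((0, 1), peak_temperature_c((0, 1)))
```

Exhaustive search only works for a handful of cores. At run time, a heuristic in the resource allocator would have to take its place.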
Communication of monitor data/feedback control loop of the monitoring system: Instead of communicating
monitor data of every monitor type through the whole invasive computing system, the monitoring data
is "bundled": For using monitoring data for resource allocation and monitoring of the current hardware
health, the monitoring data has to be given to the agent system—included in the run-time support
system—that handles inter-tile resource allocation. The feedback control loop is shown in the figure below for
a sample RISC compute tile.
In InvadeX10, several application classes (e.g. high performance, communication intense, high reliability) have been defined in cooperation with Project A1, Project B2, Project B3, Project C1, Project D1 and Project D3, so that an application can express to which class it belongs. This is important for realising inter-tile resource allocation that fulfils the application's needs. In every application class, monitor data of different monitor types are bundled, abstracted and weighted to fit the needs of the individual application class. For RISC compute tiles, the tile-local resource allocation is done in the CiC. For TCPA compute tiles, it is done by a Configuration & Communication Processor. Monitor data is also abstracted and weighted for tile-local resource allocation.
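Class-specific bundling can be pictured as a weighted combination of normalised monitor values. The sketch below is only an assumption about how such weighting might look: the class names follow the text, but the weights, the 0 to 100 scale (lower is better) and the function names are hypothetical.

```python
# Hypothetical weights in percent per application class. Each row sums to 100.
CLASS_WEIGHTS = {
    "high_performance": {"temperature": 20, "power": 20, "core_load": 60},
    "high_reliability": {"temperature": 50, "power": 10, "ageing": 40},
}


def bundled_score(monitor_data, app_class):
    """Collapse several monitor values (0..100, lower is better) into one score."""
    weights = CLASS_WEIGHTS[app_class]
    return sum(weight * monitor_data[name] for name, weight in weights.items()) // 100


def pick_tile(tiles, app_class):
    """Choose the tile with the lowest bundled score for the given class."""
    return min(tiles, key=lambda name: bundled_score(tiles[name], app_class))


tiles = {
    "tile0": {"temperature": 80, "power": 50, "core_load": 10, "ageing": 20},
    "tile1": {"temperature": 30, "power": 40, "core_load": 70, "ageing": 60},
}
print(pick_tile(tiles, "high_performance"))
print(pick_tile(tiles, "high_reliability"))
```

With these example values, the same two tiles rank differently depending on the application class, which is the purpose of class-specific weighting.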
Publications
[1] | Nidhi Anantharajaiah, Tamim Asfour, Michael Bader, Lars Bauer, Jürgen Becker, Simon Bischof, Marcel Brand, Hans-Joachim Bungartz, Christian Eichler, Khalil Esper, Joachim Falk, Nael Fasfous, Felix Freiling, Andreas Fried, Michael Gerndt, Michael Glaß, Jeferson Gonzalez, Frank Hannig, Christian Heidorn, Jörg Henkel, Andreas Herkersdorf, Benedict Herzog, Jophin John, Timo Hönig, Felix Hundhausen, Heba Khdr, Tobias Langer, Oliver Lenke, Fabian Lesniak, Alexander Lindermayr, Alexandra Listl, Sebastian Maier, Nicole Megow, Marcel Mettler, Daniel Müller-Gritschneder, Hassan Nassar, Fabian Paus, Alexander Pöppl, Behnaz Pourmohseni, Jonas Rabenstein, Phillip Raffeck, Martin Rapp, Santiago Narváez Rivas, Mark Sagi, Franziska Schirrmacher, Ulf Schlichtmann, Florian Schmaus, Wolfgang Schröder-Preikschat, Tobias Schwarzer, Mohammed Bakr Sikal, Bertrand Simon, Gregor Snelting, Jan Spieck, Akshay Srivatsa, Walter Stechele, Jürgen Teich, Furkan Turan, Isaías A. Comprés Ureña, Ingrid Verbauwhede, Dominik Walter, Thomas Wild, Stefan Wildermann, Mario Wille, Michael Witterauf, and Li Zhang. Invasive Computing. FAU University Press, August 16, 2022. [ DOI ] |
[2] | Marcel Mettler, Martin Rapp, Heba Khdr, Daniel Mueller-Gritschneder, Jörg Henkel, and Ulf Schlichtmann. An FPGA-based approach to evaluate thermal and resource management strategies of many-core processors. ACM Trans. Archit. Code Optim., 19(3), May 2022. |
[3] | Alexandra Listl, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. Application-aware aging analysis and mitigation for SRAM design-for-reliability. Microelectronics Reliability, 134:114548, 2022. [ DOI | http ] |
[4] | Grace Li Zhang, Bing Li, Ying Zhu, Tianchen Wang, Yiyu Shi, Xunzhao Yin, Cheng Zhuo, Huaxi Gu, Tsung-Yi Ho, and Ulf Schlichtmann. Robustness of neuromorphic computing with RRAM-based crossbars and optical neural networks. In ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, Tokyo, Japan, January 18-21, 2021, pages 853–858. ACM, 2021. [ DOI | http ] |
[5] | Marcel Mettler, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. A Distributed Hardware Monitoring System for Runtime Verification on Multi-tile MPSoCs. ACM Transactions on Architecture and Code Optimization (TACO), December 2020. |
[6] | Alexandre Truppel, Tsun-Ming Tseng, and Ulf Schlichtmann. PSION 2: Optimizing Physical Layout of Wavelength-Routed ONoCs for Laser Power Reduction. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 2020. |
[7] | Alexandre Truppel, Tsun-Ming Tseng, Davide Bertozzi, José Carlos Alves, and Ulf Schlichtmann. PSION+: Combining logical topology and physical layout optimization for Wavelength-Routed ONoCs. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020. |
[8] | Marcel Mettler, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. Runtime monitoring of inter- and intra-thread requirements on embedded MPSoCs. In Proceedings of the 33rd International Conference on VLSI Design and 19th International Conference on Embedded Systems (VLSID), January 2020. [ DOI ] |
[9] | Grace Li Zhang, Michaela Brunner, Bing Li, Georg Sigl, and Ulf Schlichtmann. Timing resilience for efficient and secure circuits. In Proceedings of the 25th Asia and South Pacific Design Automation Conference (ASP-DAC), January 2020. [ DOI ] |
[10] | Alexandra Listl, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. MAGIC: A Wear-leveling Circuitry to Mitigate Aging Effects in Sense Amplifiers of SRAMs. In 2019 IEEE 17th International New Circuits and Systems Conference (NEWCAS), July 2019. |
[11] | Ulf Schlichtmann and Li Zhang. Machine learning approaches for efficient design space exploration of application-specific nocs. Invited Talk at Xidian University, China, June 22, 2019. |
[12] | Alexandra Listl, Daniel Mueller-Gritschneder, Ulf Schlichtmann, and Sani Nassif. SRAM design exploration with integrated application-aware aging analysis. In Design, Automation, and Test in Europe (DATE), pages 1249–1252, March 2019. |
[13] | Daniel Mueller-Gritschneder. Advanced Virtual Prototyping and Communication Synthesis for Integrated System Design at Electronic System Level. Habilitation, Technical University of Munich, 2019. |
[14] | Alexandra Listl, Daniel Mueller-Gritschneder, Fabian Kluge, and Ulf Schlichtmann. Emulation of an ASIC power, temperature and aging monitor system for FPGA prototyping. In International On-Line Testing Symposium (IOLTS), July 2018. |
[15] | Grace Li Zhang, Bing Li, Jinglan Liu, Yiyu Shi, and Ulf Schlichtmann. Design-phase buffer allocation for post-silicon clock binning by iterative learning. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, volume 37, 2018. |
[16] | Grace Li Zhang, Bing Li, Yiyu Shi, Jiang Hu, and Ulf Schlichtmann. Effitest2: Efficient delay test and prediction for post-silicon clock skew configuration under process variations. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018. |
[17] | Li Zhang. Advanced Timing for High-Performance Design and Security of Digital Circuits. Dissertation, Technical University of Munich, 2018. |
[18] | E. Glocker, Q. Chen, U. Schlichtmann, and D. Schmitt-Landsiedel. Emulation of an ASIC power and temperature monitoring system (eTPMon) for FPGA prototyping. Microprocessors and Microsystems, 50:90–101, May 2017. [ DOI ] |
[19] | Shushanik Karapetyan and Ulf Schlichtmann. 20nm FinFET-based SRAM cell: Impact of variability and design choices on performance characteristics. In Int. Conf. Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2017. |
[20] | Elisabeth Glocker. Thermisches Verhalten und emuliertes online Temperatur-Monitorsystem für das FPGA-Prototyping von Multiprozessor-Architekturen. Dissertation, Technical University of Munich, 2017. |
[21] | Jinglan Liu, Yukun Ding, Jianlei Yang, Ulf Schlichtmann, and Yiyu Shi. Generative adversarial network based scalable on-chip noise sensor placement. In 30th IEEE International System-on-Chip Conference, SOCC 2017, Munich, Germany, September 5-8, 2017, pages 239–242, 2017. [ DOI ] |
[22] | Santiago Pagani, Lars Bauer, Qingqing Chen, Elisabeth Glocker, Frank Hannig, Andreas Herkersdorf, Heba Khdr, Anuj Pathania, Ulf Schlichtmann, Doris Schmitt-Landsiedel, Mark Sagi, Éricles Sousa, Philipp Wagner, Volker Wenzel, Thomas Wild, and Jörg Henkel. Dark silicon management: An integrated and coordinated cross-layer approach. it – Information Technology, 58(6):297–307, September 16, 2016. [ DOI ] |
[23] | U. Schlichtmann. The next frontier in ic design: Determining (and optimizing) robustness and resilience of integrated circuits and systems. In 2016 China Semiconductor Technology International Conference (CSTIC), pages 1–4, March 2016. [ DOI ] |
[24] | Grace Li Zhang, Bing Li, and Ulf Schlichtmann. EffiTest: Efficient delay test and statistical prediction for configuring post-silicon tunable buffers. In Proceedings of the 53rd Annual Design Automation Conference (DAC), pages 60:1–60:6. ACM, 2016. [ DOI ] |
[25] | Ulf Schlichtmann, Masanori Hashimoto, Iris Hui-Ru Jiang, and Bing Li. Reliability, adaptability and flexibility in timing: Buy a life insurance for your circuits. In IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC), pages 705–711. IEEE/ACM Press, January 2016. [ DOI ] |
[26] | Bing Li and U. Schlichtmann. Statistical timing analysis and criticality computation for circuits with post-silicon clock tuning elements. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(11):1784–1797, November 2015. [ DOI ] |
[27] | Éricles R. Sousa, Frank Hannig, Jürgen Teich, Qingqing Chen, and Ulf Schlichtmann. Runtime adaptation of application execution under thermal and power constraints in massively parallel processor arrays. In Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems (SCOPES), pages 121–124. ACM, June 2015. [ DOI ] |
[28] | E. Glocker, Q. Chen, A.M. Zaidi, U. Schlichtmann, and D. Schmitt-Landsiedel. Emulation of an ASIC power and temperature monitor system for FPGA prototyping. In 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), pages 1–8, June 2015. [ DOI ] |
[29] | Elisabeth Glocker, Qingqing Chen, Asheque M. Zaidi, Ulf Schlichtmann, and Doris Schmitt-Landsiedel. Emulated ASIC Power and Temperature Monitor System for FPGA Prototyping of an Invasive MPSoC Computing Architecture. In Proceedings of the First Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014), pages 14–15, May 2014. [ arXiv ] |
[30] | Nasim Pour Aryan, A. Listl, L. Heiss, C. Yilmaz, G. Georgakos, and D. Schmitt-Landsiedel. From an analytic NBTI device model to reliability assessment of complex digital circuits. In International On-Line Testing Symposium (IOLTS), pages 19–24, 2014. |
[31] | Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Monitoring of aging in integrated circuits by identifying possible critical paths. Microelectronics Reliability, 54:1075–1082, 2014. [ DOI ] |
[32] | Veit B. Kleeberger, Martin Barke, Christoph Werner, Doris Schmitt-Landsiedel, and Ulf Schlichtmann. A compact model for NBTI degradation and recovery under use-profile variations and its application to aging analysis of digital integrated circuits. Microelectronics Reliability, 54(6–7):1083–1089, 2014. [ DOI ] |
[33] | E. Glocker, S. Boppu, Q. Chen, U. Schlichtmann, J. Teich, and D. Schmitt-Landsiedel. Temperature modeling and emulation of an ASIC temperature monitor system for Tightly-Coupled Processor Arrays (TCPAs). Advances in Radio Science, 12:103–109, 2014. [ DOI ] |
[34] | Elisabeth Glocker, Qingqing Chen, Asheque M. Zaidi, Ulf Schlichtmann, and Doris Schmitt-Landsiedel. Emulierung eines ASIC-Leistungsverbrauchs- und Temperaturmonitorsystems für FPGA-Prototyping eines ressourcengewahren Computersystems. In 16. Workshop Analogschaltungen, Wien, Österreich, 2014. |
[35] | Elisabeth Glocker, Srinivas Boppu, Qingqing Chen, Ulf Schlichtmann, Jürgen Teich, and Doris Schmitt-Landsiedel. Temperature modeling and emulation of an ASIC temperature monitor system for Tightly-Coupled Processor Arrays (TCPAs) on FPGA. In Kleinheubacher Tagung 2013, September 2013. |
[36] | Martin Barke, Veit B. Kleeberger, Christoph Werner, Doris Schmitt-Landsiedel, and Ulf Schlichtmann. Analysis of Aging Mitigation Techniques for Digital Circuits Considering Recovery Effects. In edaWorkshop, May 2013. |
[37] | Bing Li, Ning Chen, Yang Xu, and Ulf Schlichtmann. On timing model extraction and hierarchical statistical timing analysis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 32(3):367–380, March 2013. |
[38] | Martin Wirnshofer. Variation-Aware Adaptive Voltage Scaling for Digital CMOS Circuits. Dissertation, Technical University of Munich, 2013. |
[39] | Martin Wirnshofer. Variation-Aware Adaptive Voltage Scaling for Digital CMOS Circuits, volume 41. Springer Series in Advanced Microelectronics, 2013. |
[40] | Elisabeth Glocker and Doris Schmitt-Landsiedel. Modeling of Temperature Scenarios in a Multicore Processor System. Advances in Radio Science (ARS), 11:219–225, 2013. [ DOI ] |
[41] | Martin Wirnshofer, Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, and Georg Georgakos. On-line supply voltage scaling based on in situ delay monitoring to adapt for PVTA variations. Journal of Circuits, Systems and Computers, 21(08), December 2012. [ DOI ] |
[42] | Bing Li, Ning Chen, and Ulf Schlichtmann. Statistical timing analysis for latch-controlled circuits with reduced iterations and graph transformations. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), pages 1670–1683, November 2012. |
[43] | N. Chen, B. Li, and U. Schlichtmann. Iterative timing analysis based on nonlinear and interdependent flipflop modelling. IET Circuits, Devices & Systems, 6(5):330–337, September 2012. [ DOI ] |
[44] | Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Efficiently analyzing the impact of aging effects on large integrated circuits. Microelectronics Reliability, 52:1546–1552, August 2012. [ DOI ] |
[45] | Martin Wirnshofer, Leonhard Heiss, A. N. Kakade, Nasim Pour Aryan, Georg Georgakos, and Doris Schmitt-Landsiedel. Adaptive voltage scaling by in-situ delay monitoring for an image processing circuit. In IEEE 15th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), pages 205–208, April 2012. [ DOI ] |
[46] | Sani R. Nassif, Veit B. Kleeberger, and Ulf Schlichtmann. Goldilocks failures: not too soft, not too hard. In IEEE International Reliability Physics Symposium (IRPS), April 2012. |
[47] | Christoph Knoth, Hela Jedda, and Ulf Schlichtmann. Current source modeling for power and timing analysis at different supply voltages. In Proceedings of Design, Automation and Test in Europe Conference (DATE), pages 923–928, March 2012. [ DOI ] |
[48] | Dominik Lorenz. Aging Analysis of Digital Integrated Circuits. Dissertation, Technical University of Munich, 2012. |
[49] | Christoph Knoth. Accurate Waveform-based Timing Analysis with Systematic Current Source Models. Dissertation, Technical University of Munich, 2012. |
[50] | Shailesh More. Aging Degradation and Countermeasures in Deep-submicrometer Analog and Mixed Signal Integrated Circuits. Dissertation, Technical University of Munich, 2012. |
[51] | Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, Georg Georgakos, and Martin Wirnshofer. Comparison of in-situ delay monitors for use in adaptive voltage scaling. Advances in Radio Science (ARS), 10:215–220, 2012. |
[52] | Elisabeth Glocker and Doris Schmitt-Landsiedel. Modeling of Temperature Scenarios in a Multicore Processor System. In Kleinheubacher Tagung 2012, 2012. |
[53] | Martin Wirnshofer, Leonhard Heiss, Georg Georgakos, and Doris Schmitt-Landsiedel. An energy-efficient supply voltage scheme using in-situ pre-error detection for on-the-fly adaptation to PVT variations. In International Symposium on Integrated Circuits (ISIC), pages 94–97, December 2011. [ DOI ] |
[54] | Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Finding possible critical paths for on-line monitoring of aging in integrated circuits. Technical report, Technische Universität München, December 2011. |
[55] | Christoph Knoth, Carsten Uphoff, Sebastian Kiesel, and Ulf Schlichtmann. SWAT: Simulator for waveform-accurate timing including parameter variations and transistor aging. In International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), volume 6951 of Lecture Notes in Computer Science (LNCS), pages 193–203, September 2011. |
[56] | Ning Chen, Bing Li, and Ulf Schlichtmann. Timing modeling of flipflops considering aging effects. In International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), volume 6951 of Lecture Notes in Computer Science (LNCS), pages 63–72, September 2011. |
[57] | Veit B. Kleeberger and Ulf Schlichtmann. Reliability Analysis of Digital Circuits Considering Intrinsic Noise. In Asia Symposium on Quality Electronic Design (ASQED), July 2011. |
[58] | Martin Wirnshofer, Leonhard Heiss, Georg Georgakos, and Doris Schmitt-Landsiedel. A variation-aware adaptive voltage scaling technique based on in-situ delay monitoring. In IEEE 14th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), pages 261–266, 2011. |
[59] | Jürgen Teich, Jörg Henkel, Andreas Herkersdorf, Doris Schmitt-Landsiedel, Wolfgang Schröder-Preikschat, and Gregor Snelting. Invasive computing: An overview. In Michael Hübner and Jürgen Becker, editors, Multiprocessor System-on-Chip – Hardware Design and Tool Integration, pages 241–268. Springer, Berlin, Heidelberg, 2011. [ DOI ] |
[60] | Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, Georg Georgakos, and Martin Wirnshofer. Comparison of in-situ delay monitors for use in adaptive voltage scaling. In Kleinheubacher Tagung 2011, 2011. |
[61] | Jürgen Teich. Invasive algorithms and architectures. it – Information Technology, 50(5):300–310, 2008. |