Architecture and Automation for Low-cost Safety-critical Systems
Cost pressures are driving the vendors of safety-critical systems to integrate previously distributed systems. We developed on-demand redundancy (ODR), a set of architectural techniques that leverage the tightly coupled nature of components in systems-on-chip to reduce the cost of redundancy in safety-critical systems. ODR relaxes the assumptions that traditionally segregate the execution of critical and non-critical tasks (NCTs), making resources available for critical tasks at potentially arbitrary points in both space and time, and freeing those resources to execute non-critical tasks when critical tasks are not executing. Our results demonstrate that for a wide variety of applications and architectures, ODR is more cost-effective than (a) a traditional approach that employs dedicated resources executing in lockstep and (b) an alternative low-cost redundancy technique used in industry. To support ODR, we have further adapted technologies for flexible, low-cost comparison of the execution of redundant threads, as well as response-time analysis techniques. Early related work is investigating the joint optimization of safety and lifetime in fail-operational systems. Other related work has evaluated the reliability of coarse-grained core arrays.
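The comparison of redundant threads can be illustrated with a minimal sketch. It assumes that each redundant execution exposes its stream of stores as (address, value) pairs and that a CRC32 is accumulated over fixed-size intervals; the function names, the interval length, and the use of CRC32 are illustrative assumptions, not the mechanism used in the work cited below.

```python
import zlib

def fingerprint(stores, interval):
    """Compress a list of (address, value) stores into one CRC32 per interval.

    Hypothetical helper: the 32-bit little-endian encoding and the interval
    length are illustrative choices.
    """
    prints, crc = [], 0
    for i, (addr, value) in enumerate(stores, start=1):
        data = addr.to_bytes(4, "little") + value.to_bytes(4, "little")
        crc = zlib.crc32(data, crc)
        if i % interval == 0:      # interval boundary: emit and reset
            prints.append(crc)
            crc = 0
    if len(stores) % interval:     # flush a final partial interval
        prints.append(crc)
    return prints

def first_mismatch(a, b):
    """Index of the first differing fingerprint, or None if all match.

    Assumes both executions produced the same number of fingerprints.
    """
    for idx, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return idx
    return None

# Two redundant executions; the second suffers a single bit flip in store 5.
golden = [(0x1000 + 4 * i, i) for i in range(8)]
faulty = list(golden)
faulty[5] = (faulty[5][0], faulty[5][1] ^ 1)
print(first_mismatch(fingerprint(golden, 4), fingerprint(faulty, 4)))
```

A shorter interval bounds detection latency more tightly, at the cost of more comparisons between the redundant threads.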
- Mojing Liu and Brett H. Meyer, “Bounding error detection latency in safety critical systems with enhanced execution fingerprinting,” DFT’16, September 2016.
- Zaid Al-bayati, Jonah Caplan, Brett H. Meyer, Haibo Zeng, “A Four-Mode Model for Efficient Fault-Tolerant Mixed-Criticality Systems,” in the Proceedings of the ACM/IEEE Conference on Design, Automation, and Test in Europe, DATE’16, March 2016.
- Badrun Nahar, Brett H. Meyer, “RotR: Rotational Redundant Task Mapping for Fail-operational MPSoCs,” in the Proceedings of the 28th IEEE Defect and Fault Tolerance in VLSI and Nanotechnology Systems Symposium, DFT’15, October 2015. (Best Paper Award)
- Jonah Caplan, Maria Isabel Mera, Peter Milder, Brett H. Meyer, “Trade-offs in Execution Signature Compression for Reliable Processor Systems,” in the Proceedings of the Conference on Design, Automation, and Test in Europe, DATE’14, March 2014.
- Brett H. Meyer, Benton Calhoun, John Lach, Kevin Skadron, “Cost-effective Safety and Fault Localization using Distributed Temporal Redundancy,” CASES’11, October 2011.
Neural Networks Designing Neural Networks
Artificial neural networks (ANNs), historically used in machine learning to solve classification problems, are beginning to appear in a wide variety of application domains, such as modeling, decision making, data processing, robotics, and control. Conventionally, ANNs are designed to maximize classification accuracy, but the resulting systems are often large and expensive, and therefore unsuitable for low-cost applications. We have introduced a design approach that uses one ANN to learn the relationship between cost and classification accuracy for a design space of alternative ANNs (neural networks designing neural networks). The result is implementations that achieve accuracy similar to the state of the art at a fraction of the cost.
Our framework, OPAL (Ordinary People Accelerating Learning), is now available!
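The cost-versus-accuracy trade-off can be made concrete with a small sketch of the selection step. It assumes each candidate network has already been assigned a cost and a classification error (for example, by a learned predictor, which is not shown), and keeps only the candidates that no other candidate beats on both objectives. The design names and numbers are made up for illustration, and this is not OPAL's interface.

```python
def pareto_front(designs):
    """Return the names of non-dominated designs.

    designs: list of (name, cost, error) tuples; lower is better for both.
    A design is dominated if another design is no worse on both objectives
    and strictly better on at least one.
    """
    front = []
    for name, cost, err in designs:
        dominated = any(
            c <= cost and e <= err and (c < cost or e < err)
            for _, c, e in designs
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical candidates: (name, relative cost, classification error).
candidates = [
    ("a", 1.0, 0.10),
    ("b", 2.0, 0.05),
    ("c", 2.5, 0.08),   # dominated by "b": costlier and less accurate
    ("d", 4.0, 0.04),
]
print(pareto_front(candidates))
```

The designs that remain are the ones a designer would choose among, depending on the cost budget.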
Internet Packet Classification
ISPs and network administrators use Internet Packet Classification (IPC) to categorize packets into flows (traffic sharing IP addresses, ports, and protocol), and thereby identify the application classes generating them. Distinguishing between safe and malicious traffic aids in network intrusion interception. Likewise, categorizing applications into classes is useful for managing traffic to provide better service. Traditional IPC approaches based on port numbers and payload pattern recognition are no longer effective because current applications can dynamically change port numbers and encrypt their contents. Recent machine learning (ML) IPC solutions suffer from speed-bounded accuracy and complex implementation due to their dependence on packet sizes and order of arrival. We propose a new IPC approach that uses associative memory (AM) based on a sparse-clustered network with selective decoding, dramatically reducing the memory required for hardware implementation of IPC while significantly improving classification throughput.
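The notion of a flow used above can be sketched in a few lines. The example assumes packets are represented as dictionaries with hypothetical field names (`src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`) and groups them by that 5-tuple; it illustrates only the grouping step, not the associative-memory classifier.

```python
from collections import defaultdict

def group_into_flows(packets):
    """Group packets that share IP addresses, ports, and protocol.

    Field names are illustrative assumptions about the packet records.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["dst_ip"],
               pkt["src_port"], pkt["dst_port"], pkt["proto"])
        flows[key].append(pkt)
    return dict(flows)

# Three made-up packets: the first two belong to the same flow.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5000,
     "dst_port": 443, "proto": "TCP", "size": 60},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5000,
     "dst_port": 443, "proto": "TCP", "size": 1500},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "src_port": 6000,
     "dst_port": 53, "proto": "UDP", "size": 80},
]
flows = group_into_flows(packets)
print(len(flows))
```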
- Sean C. Smithson, Guang Yang, Warren J. Gross, and Brett H. Meyer, “Neural networks designing neural networks: Multi-objective hyper-parameter optimization,” ICCAD’16, November 2016.
- Scott D. Dagondon, Warren J. Gross, and Brett H. Meyer, “Sparse-clustered network with selective decoding for internet traffic classification,” SiPS’16, October 2016.
- Sean C. Smithson, Kaushik Boga, Arash Ardakani, Brett H. Meyer, and Warren J. Gross, “SS-stochastic: Stochastic computing can improve upon digital spiking neural networks,” SiPS’16, October 2016. (Invited paper)
Lifetime and Yield Optimization
Design-space Exploration for Embedded MPSoCs
Embedded system designers rely on automation to meet time-to-market constraints, but to date few tools have been developed to assist with optimizing system cost and lifetime. To this end, we have developed (a) techniques to accelerate lifetime estimation and (b) lifetime-aware design space exploration. Earlier, we developed a system synthesis approach that, given an application, hardware/software partitioning, and communication architecture, selects and organizes system resources, allocating and distributing slack to jointly optimize system cost and lifetime. Additional research has explored the effect of task mapping on lifetime in this context. Recently, we have developed novel models for abstracting behavior observed in atomistic models of negative-bias temperature instability to the level of standard cell libraries. Other work explores new approaches to quickly estimating system lifetime using multi-armed bandits.
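To illustrate how a multi-armed bandit can reduce the sampling effort of lifetime estimation, the sketch below treats each candidate design as an arm and each Monte Carlo lifetime sample as a pull. It uses the standard UCB1 rule as a stand-in, since the specific bandit algorithm is not described here; the sampler functions, the budget, and the `scale` parameter are hypothetical.

```python
import math
import random

def ucb_allocate(samplers, budget, scale=1.0):
    """Spend `budget` lifetime samples across designs using UCB1.

    samplers: dict mapping design name -> zero-argument function that
    returns one sampled lifetime. Returns (best_design, samples_per_design).
    """
    names = list(samplers)
    if budget < len(names):
        raise ValueError("budget must cover one sample per design")
    counts = {n: 0 for n in names}
    totals = {n: 0.0 for n in names}
    for t in range(1, budget + 1):
        untried = [n for n in names if counts[n] == 0]
        if untried:
            choice = untried[0]            # sample every design once first
        else:
            choice = max(names, key=lambda n: totals[n] / counts[n]
                         + scale * math.sqrt(2 * math.log(t) / counts[n]))
        totals[choice] += samplers[choice]()
        counts[choice] += 1
    means = {n: totals[n] / counts[n] for n in names}
    return max(means, key=means.get), counts

# Made-up designs whose sampled lifetimes (in years) do not overlap.
rng = random.Random(0)
designs = {
    "A": lambda: rng.uniform(9, 11),
    "B": lambda: rng.uniform(4, 6),
    "C": lambda: rng.uniform(1, 3),
}
best, counts = ucb_allocate(designs, budget=60)
print(best, counts)
```

Compared with sampling every design equally, the bandit concentrates samples on the designs that look most promising, so poor designs are abandoned after few evaluations.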
Yield Improvement for Parallel Architectures
Recent research has suggested that as more processor cores are incorporated on single chips, the appropriate granularity of redundancy for the purpose of failure and defect mitigation is at the system level. We leverage this fact above, but have observed that some systems benefit from a combination of system-level and microarchitectural redundancy. In this project, we investigate the relationship between parallel application, parallel architecture (and single-instruction, multiple-thread architectures in particular), and redundancy allocation, based on the observation that as the demand for types of parallel resources changes (e.g., from many narrow cores to few wide cores), so ought the mix of redundant components (e.g., from redundant cores to cores with redundant lanes).
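The effect of spare components on yield can be sketched with a textbook k-of-n model. It assumes components fail independently with the same probability, which is a simplification and not the evaluation method of the work cited below; the component yield of 0.9 is a made-up number.

```python
from math import comb

def k_of_n_yield(n, k, p):
    """Probability that at least k of n components are defect-free.

    Assumes independent defects with per-component yield p.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p = 0.9                                 # hypothetical per-component yield
no_spare = k_of_n_yield(4, 4, p)        # 4 components needed, none spare
one_spare = k_of_n_yield(5, 4, p)       # 4 components needed, 1 spare
print(round(no_spare, 4), round(one_spare, 5))
```

The same function applies whether the components are whole cores or lanes within a wide core; what changes is the area cost of each spare, which is why the best mix depends on the architecture.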
- Calvin Ma, Aditya Mahajan, Brett H. Meyer, “Multi-Armed Bandits for Efficient Lifetime Estimation in MPSoC design,” DATE’17, March 2017.
- S. Hasan Mozafari and Brett H. Meyer, “Efficient performance evaluation of multi-core SIMT processors with hot redundancy,” IEEE Transactions on Emerging Topics in Computing (TETC), pp. 1–12, July 2016. (In press)
- Dimitrios Stamoulis, Simone Corbetta, Dimitrios Rodopoulos, Pieter Weckx, Peter Debacker, Brett H. Meyer, Ben Kaczer, Praveen Raghavan, Dimitrios Soudris, Francky Catthoor, and Zeljko Zilic, “Capturing true workload dependency of BTI-induced degradation in CPU components,” GLSVLSI’16, May 2016.
- Brett H. Meyer, Adam S. Hartman, Donald E. Thomas, “Cost-effective Lifetime and Yield Optimization for NoC-based MPSoCs,” in ACM Transactions on Design Automation of Electronic Systems (TODAES), 19(2), April 2014.
- Adam S. Hartman, Donald E. Thomas, Brett H. Meyer, “A Case for Lifetime-aware Task Mapping in Embedded Chip Multiprocessors,” CODES+ISSS’10, October 2010.
VoltSpot: Power-delivery Network Modeling and Optimization
In future CMOS technology nodes, threshold and supply voltages are not scaling down as fast as device density is increasing. Higher current density and total current place greater demands on the power-delivery network (PDN); current-related chip phenomena such as electromigration (EM), resistive (IR) drop, and inductive transient (Ldi/dt) noise all get worse with higher current and larger current swings. VoltSpot is an architecture-level model of the on-chip PDN, including C4 pads, with a simple interface for use in other architecture-level tools. VoltSpot, when integrated with a performance simulator (such as gem5) and a power estimation tool (such as McPAT), provides architects with the tools necessary to explore the effect of PDN design, including PDN metal width and the allocation of C4 pads to VDD, GND, and I/O. VoltSpot also supports the exploration of run-time IR drop and Ldi/dt noise prediction, avoidance, and mitigation. Recent work has begun to explore design techniques for system lifetime and 3D-ICs, as well as simulation techniques that accelerate solving for on-chip voltage noise.
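The two noise terms named above can be illustrated with a lumped back-of-the-envelope calculation. The sketch assumes that all power pads act as identical resistances and inductances in parallel, which ignores the spatial effects that a grid model such as VoltSpot captures; the function name and every numeric value are hypothetical, and this is not VoltSpot's interface.

```python
def pdn_noise(current_a, di_dt, n_power_pads, r_pad, l_pad):
    """Return (IR drop, L*di/dt noise) in volts for a lumped PDN.

    current_a:    total supply current in amperes
    di_dt:        current slew in amperes per second
    n_power_pads: number of pads allocated to power delivery
    r_pad, l_pad: per-pad resistance (ohms) and inductance (henries)
    """
    r_eff = r_pad / n_power_pads   # identical pads in parallel
    l_eff = l_pad / n_power_pads
    return current_a * r_eff, l_eff * di_dt

# Made-up numbers: 10 A total, a 5 A swing in 1 ns, 10 mOhm and 100 pH per pad.
for pads in (10, 20):
    ir, ldidt = pdn_noise(10.0, 5e9, pads, r_pad=0.01, l_pad=1e-10)
    print(pads, ir, ldidt)
```

Under this simple model, doubling the number of power pads halves both terms, which is why pads given up to I/O directly trade against supply noise.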
- Marco T. Kassis, Yaswanth R. Akaveeti, Brett H. Meyer, and Roni Khazaka, “Parallel transient simulation of power delivery networks using model order reduction,” EPEPS’15, October 2016.
- Runjie Zhang, Brett H. Meyer, Ke Wang, Mircea R. Stan, Kevin Skadron, “Tolerating the Consequences of Multiple EM-induced C4 Bump Failures,” IEEE Transactions on VLSI (TVLSI), 24(6), June 2016.
- Runjie Zhang, Kaushik Mazumdar, Brett H. Meyer, Ke Wang, Kevin Skadron, Mircea Stan, “Transient Voltage Noise in Charge-Recycled Power Delivery Networks for Many-Layer 3D-IC,” in the Proceedings of the International Symposium on Low Power Electronics Design, ISLPED’15, July 2015.
- Runjie Zhang, Ke Wang, Brett H. Meyer, Mircea Stan, Kevin Skadron, “Architecture Implications of Pads as a Scarce Resource,” in the Proceedings of the ACM/IEEE International Symposium on Computer Architecture, ISCA’14, June 2014.
- Ke Wang, Brett H. Meyer, Runjie Zhang, Kevin Skadron, Mircea Stan, “Managing C4 Placement for Transient Voltage Noise Minimization,” in the Proceedings of the Design Automation Conference, DAC’14, June 2014.
ArchFP: Architectural Floorplanning for Early Design Analysis
ArchFP is a simple, easy-to-use, architect-directed floorplanning tool. Floorplanning tools grew out of a need to automate the placement of standard cells and modules in large, complex designs. Floorplans are often needed in order to estimate design area, performance, power, temperature, and therefore reliability. System architects need a way to generate floorplans for the same reason, but system-level floorplans, which often consist of only a handful of blocks placed in some regular way (e.g., tiled cores), are poorly handled by tools designed to manage the complexity of thousands of blocks. ArchFP gives a system architect a tool that leverages their knowledge of the system and quickly produces a floorplan that can be used for further analysis.
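A regular, architect-directed layout such as tiled cores can be described in a few lines, as the sketch below suggests. The function and the tuple layout (name, width, height, left x, bottom y) are hypothetical illustrations and do not reflect ArchFP's actual interface.

```python
def tile_cores(rows, cols, width, height):
    """Place rows x cols identical core blocks on a regular grid.

    Returns a list of (name, width, height, left_x, bottom_y) tuples,
    with the origin at the bottom-left corner of the die.
    """
    blocks = []
    for r in range(rows):
        for c in range(cols):
            blocks.append((f"core_{r}_{c}", width, height,
                           c * width, r * height))
    return blocks

# A 2x2 array of 1.0 mm x 2.0 mm cores (made-up dimensions).
for block in tile_cores(2, 2, 1.0, 2.0):
    print(block)
```

The resulting block list is the kind of input that area, power, and thermal estimators typically consume.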
- Gregory G. Faust, Runjie Zhang, Kevin Skadron, Mircea R. Stan and Brett H. Meyer, “ArchFP: Rapid Prototyping of pre-RTL Floorplans,” VLSI-SOC’12, October 2012.