Advanced search

chapter

Fault-tolerant iterative solvers with adaptive reliability

Aaditya Shukla, Yue Wu, Saman Zonouz, Maryam Mehri Dehnavi

2016 IEEE Conference on Electromagnetic Field Computation (CEFC) > 1

2016 IEEE Conference on Electromagnetic Field Computation (CEFC)

Soft errors are increasing in modern computers. These faults can corrupt the results of scientific simulations. This work studies error propagation by a bit flip in conjugate gradient (CG) methods. We will also introduce adaptivity to selective reliable fault-tolerant (SRFT) solvers. Our study reduces the compute-intensive reliability steps in SRFT solvers.

chapter

A Fault-Tolerant Java Virtual Machine Using Fast Rejuvenation for Soft-Error-Prone Systems

Qi Ao, Longbing Zhang, Shuai Chen, Jie Fu, more

2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems > 463 - 469

2015 IEEE 17th International Conference on High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS) and 2015 IEEE 12th International Conf on Embedded Software and Systems (ICESS)

To achieve better performance, computer designers employ advanced techniques that shrink feature sizes, lower supply voltage, and increase clock rates and memory capacity. Meanwhile, modern computers become increasingly vulnerable to soft errors caused by energetic particles, such as alpha particles and neutrons. Therefore, fault tolerance evolves into one of the most significant design objectives,...

chapter

Resilient Virtual Clusters

Michael Le, Israel Hsu, Yuval Tamir

2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing > 214 - 223

2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing (PRDC)

Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a virtual cluster of VMs. A key difficulty with virtualization is that the failure...

chapter

An Autonomous Decentralized Public Key Infrastructure

L C Coronado-García, C Pérez-Leguízamo

2011 Tenth International Symposium on Autonomous Decentralized Systems > 409 - 414

2011 Tenth International Symposium on Autonomous Decentralized Systems (ISADS)

A main goal of a Public Key Infrastructure (PKI) is the management of digital certificates in order to bind public keys to their respective user identities, assuring the uniqueness of these public keys. A PKI must guarantee the reliability of its services, assuring the timeliness of its responses and the continuity of the service despite the growth in the number of users and the presence of hardware...

chapter

Research and Implementation of Fault-Tolerant Computer Interlocking System

Chen GuangWu, Fan DuoWang, Yang JuHua

2010 International Conference on Computational Intelligence and Software Engineering > 1 - 4

2010 International Conference on Computational Intelligence and Software Engineering (CiSE 2010)

A new signal control system for railway stations, a fault-tolerant all-electronic computer interlocking control system, is proposed, in which the computer-based interlocking system layer is built from electronic security units that replace the relays, and all-electronic fault-tolerant control of the whole system is achieved through two-out-of-three fault-tolerant computer...

chapter

An Approach for Evaluating Availability and Performability of Data Processing Center in the Internet of Things Environment

Yi Feng, Zhan Zhang, Hao Liu, De-Cheng Zuo, more

2010 First International Conference on Pervasive Computing, Signal Processing and Applications > 1035 - 1038

2010 First International Conference on Pervasive Computing, Signal Processing and Applications (PCSPA 2010)

With the development of pervasive computing technology, especially the emergence of the Internet of Things, a tremendous amount of real-time or OLTP information will be generated, which requires data centers with high availability and performability. This paper proposes a hierarchical approach to model and evaluate the availability and performability of the transaction processing data center using throughout...

chapter

Topological Properties of a New Fault Tolerant Interconnection Network for Parallel Computer

S.P. Mohanty, B.N.B. Ray, S.N. Patro, A.R. Tripathy

2008 International Conference on Information Technology > 36 - 40

2008 International Conference on Information Technology

In this paper we introduce a new interconnection network, the extended varietal hypercube with cross connection, denoted by EVHC(n,k). This network has a hierarchical structure and overcomes the poor fault-tolerance properties of the extended varietal hypercube. It has low diameter, constant degree connectivity and low message traffic density.

chapter

FPGA On-Board Computer design based on hierarchical fault tolerance

Lei Xing, Zhaowei Sun, Guodong Xu

2008 2nd International Symposium on Systems and Control in Aerospace and Astronautics > 1 - 5

ISSCAA 2008. 2nd International Symposium on Systems and Control in Aerospace and Astronautics

Safety is a crucial requirement of On-Board Computer (OBC) design for a satellite, especially for the new type of OBC that uses an FPGA as its central processor. This paper therefore proposes an FPGA OBC design and adds a hierarchical fault-tolerance concept to enhance the reliability of the OBC system. The fault-tolerant architecture can be divided into three hierarchical levels, comprising single-CPU reconfiguration,...

chapter

Redefining terms related to dependability

G. Kemnitz, H.A. Ramadan, C. Giesemann

2008 11th International Biennial Baltic Electronics Conference > 195 - 198

2008 International Biennial Baltic Electronics Conference

Dependability is an integrative concept that encompasses the following attributes: availability (readiness for correct service), reliability (continuity of correct service) and safety (absence of catastrophic consequences for the user(s) and the environment). In this paper we redefine these attributes. We are looking at them not only as concepts but as quantities. That makes it possible to measure...

chapter

Dynamic Testing of an SRAM-Based FPGA by Time-Resolved Laser Fault Injection

V. Pouget, A. Douin, G. Foucard, P. Peronnard, more

2008 14th IEEE International On-Line Testing Symposium > 295 - 301

14th IEEE International On-Line Testing Symposium

This paper presents principles and results of dynamic testing of an SRAM-based FPGA using time-resolved fault injection with a pulsed laser. The synchronization setup and experimental procedure are detailed. Fault injection results obtained with a DES crypto-core application implemented on a Xilinx Virtex II are discussed.

chapter

Verification and Analysis of Self-Checking Properties through ATPG

M. Hunger, S. Hellebrand

2008 14th IEEE International On-Line Testing Symposium > 25 - 30

14th IEEE International On-Line Testing Symposium

Present and future semiconductor technologies are characterized by increasing parameter variations as well as an increasing susceptibility to external disturbances. Transient errors during system operation are no longer restricted to memories but also affect random logic, and a robust design becomes mandatory to ensure reliable system operation. Self-checking circuits rely on redundancy to detect...

chapter

Budget-Dependent Control-Flow Error Detection

R. Vemu, J.A. Abraham

2008 14th IEEE International On-Line Testing Symposium > 73 - 78

14th IEEE International On-Line Testing Symposium

The problem of detecting control-flow errors in software has been studied extensively in the literature, and many detection techniques have been proposed. These techniques typically have high memory and performance overheads and hence are unusable for real-time embedded systems, which have tight memory and performance budgets. This paper presents two algorithms by which the overheads associated with...

chapter

A BISR Architecture for Embedded Memories

K. Pekmestzi, N. Axelos, I. Sideris, N. Moshopoulos

2008 14th IEEE International On-Line Testing Symposium > 149 - 154

14th IEEE International On-Line Testing Symposium

In this paper a BISR architecture for embedded memories is presented. The proposed scheme utilises a multiple bank cache-like memory for repairs. Statistical analysis is used for minimisation of the total resources required to achieve a very high fault coverage. Simulation results show that the proposed BISR scheme is characterised by high efficiency and low area overhead, even for high defect densities...

chapter

Reliability in Application Specific Mesh-Based NoC Architectures

F. Refan, H. Alemzadeh, S. Safari, P. Prinetto, more

2008 14th IEEE International On-Line Testing Symposium > 207 - 212

14th IEEE International On-Line Testing Symposium

Networks on chips (NoCs) provide a mechanism for handling complex communications in the next generation of integrated circuits. At the same time, lower yield in nanotechnology makes self-repairing communication channels a necessity in the design of digital systems. This paper proposes a reliable NoC architecture based on a specific application mapped onto an NoC. This architecture is capable of recovering...

chapter

A Fault-Tolerant Attitude Determination System Based on COTS Devices

R.O. Duarte, L.S. Martins-Filho, G.F.T. Knop, R.S. Prado

2008 14th IEEE International On-Line Testing Symposium > 85 - 90

14th IEEE International On-Line Testing Symposium

In this paper we present a low-cost fault-tolerant attitude determination system for a scientific satellite using COTS devices. We relate our experience in developing the attitude determination system, where we combine proven fault tolerance techniques to protect the whole system, composed only of COTS devices, from the effects produced by transient faults. We detail the failure cases and the detection, reconfiguration...

chapter

Responsive Fault-Tolerant Computing in the Era of Terascale Integration State of Art Report

P. Ezhilchelvan

2008 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC) > 492 - 496

11th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing (ISORC '08)

Scaling in the hardware integration process results in IC-process geometry reductions, lower operating voltages and increased clock speeds. This paper first surveys the reliability obstacles these developments give rise to and then points out that computing systems can no longer be safely assumed to fail only by crashing. Yet this assumption is at the core of primary-backup replication which the literature...

chapter

Introduction to the Proceedings of the EDOC 2007 Workshop Middleware for Web Services (MWS) 2007

V. Tosic, K.M. Goschka, A. van Moorsel, R. Wong

2007 Eleventh International IEEE EDOC Conference Workshop > 69 - 72

2007 11th IEEE International Enterprise Distributed Object Computing Conference Workshops (EDOC Workshops)

This is an introduction to the proceedings of the MWS 2007 workshop held at EDOC 2007. It first explains the motivation for and background of the workshop. Then, it contains a short description of the keynote, each long and short peer-reviewed paper, and the discussion session "impact of various execution environments on middleware for web services". After the closing statements, MWS 2007...

chapter

BFT-WS: A Byzantine Fault Tolerance Framework for Web Services

Wenbing Zhao

2007 Eleventh International IEEE EDOC Conference Workshop > 89 - 96

2007 11th IEEE International Enterprise Distributed Object Computing Conference Workshops (EDOC Workshops)

Many Web services are expected to run with a high degree of security and dependability. To achieve this goal, it is essential to use a Web-services-compatible framework that tolerates not only crash faults but Byzantine faults as well, due to the untrusted communication environment in which the Web services operate. In this paper, we describe the design and implementation of such a framework, called...

chapter

Research on Triple Modular Redundancy Dynamic Fault-Tolerant System Model

Zhe Zhang, Daxin Liu, Zhengxian Wei, Changsong Sun

First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS'6) > 1 > 572 - 576

First International Multi-Symposiums on Computer and Computational Sciences

With the increasing number of distributed computing systems applied in a wide range of critical domains, the requirement for high reliability and high availability of distributed computing systems is becoming more and more urgent, and the study of distributed fault-tolerant systems is becoming more significant. This paper provides a model of a triple-modular-redundancy dynamic fault-tolerant system, and the reliability...

article

Multiple error correction and additive overflow detection with magnitude indices in residue code

Saroj Kaushik

1985 IEEE 7th Symposium on Computer Arithmetic (ARITH) > 1985 > 278 - 284

1985 IEEE 7th Symposium on Computer Arithmetic (ARITH)

A new approach for correcting multiple errors and detecting an additive overflow in the Residue Number System (RNS) is suggested. It works with the code whose redundancy is in the form of magnitude indices. The residue representation of a number with magnitude index is reconsidered. The RNS with magnitude index was first studied by Sasaki [16] and Rao [15] and then by Barsi and Maestrini [5, 6]. The range...

INFONA - science communication portal
