Soft errors are increasing in modern computers. These faults can corrupt the results of scientific simulations. This work studies error propagation by a bit flip in conjugate gradient (CG) methods. We will also introduce adaptivity to selective reliable fault-tolerant (SRFT) solvers. Our study reduces the compute-intensive reliability steps in SRFT solvers.
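As a rough illustration of the kind of fault this abstract refers to, the sketch below flips a single bit in the IEEE 754 double-precision representation of a value. It is not taken from the paper: the helper name `flip_bit` and the bit-numbering convention (bit 0 is the least significant mantissa bit, bit 63 the sign) are assumptions made for this example.

```python
import struct


def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its 64-bit IEEE 754 encoding inverted.

    Hypothetical helper for illustration; bit 0 is the least significant
    mantissa bit, bits 52-62 are the exponent, bit 63 is the sign.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    # Reinterpret the double as an unsigned 64-bit integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", x))
    # Invert the chosen bit and reinterpret the result as a double.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


# The effect depends strongly on which bit is hit.
low_mantissa = flip_bit(1.0, 0)    # tiny relative perturbation
low_exponent = flip_bit(1.0, 52)   # value is halved
high_exponent = flip_bit(1.0, 62)  # value becomes infinity
sign = flip_bit(1.0, 63)           # sign is inverted
```

A flip in a low mantissa bit is often absorbed by an iterative solver, whereas a flip in an exponent bit can change a vector entry by orders of magnitude, which is why the position of the flipped bit matters when studying propagation.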
To achieve better performance, computer designers employ advanced techniques that shrink feature sizes, lower supply voltage, and increase clock rates and memory capacity. At the same time, modern computers are becoming increasingly vulnerable to soft errors caused by energetic particles, such as alpha particles and neutrons. Therefore, fault tolerance evolves into one of the most significant design objectives,...
Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a virtual cluster of VMs. A key difficulty with virtualization is that the failure...
A main goal of a Public Key Infrastructure (PKI) is the management of digital certificates in order to bind public keys to their respective user identities, ensuring the uniqueness of these public keys. A PKI must guarantee the reliability of its services, ensuring the timeliness of its responses and the continuity of the service despite the growth in the number of users and the presence of hardware...
A new signal control system for railway stations, a fault-tolerant all-electronic computer interlocking control system, is proposed. The computer-based interlocking system layer is built by replacing relays with electronic security units, and all-electronic fault-tolerant control of the whole system is achieved through two-out-of-three fault-tolerant computer...
With the development of pervasive computing technology, and especially the emergence of the Internet of Things, ever larger volumes of real-time or OLTP information will be generated, requiring data centers with high availability and performability. This paper proposes a hierarchical approach to model and evaluate the availability and performability of the transaction processing data center using throughout...
In this paper we introduce a new interconnection network, the extended varietal hypercube with cross connection denoted by EVHC(n,k). This network has hierarchical structure and it overcomes the poor fault tolerant properties of extended varietal hypercube. This network has low diameter, constant degree connectivity and low message traffic density.
Safety is a crucial requirement in the design of a satellite On-Board Computer (OBC), especially for the new type of OBC that uses an FPGA as its central processor. This paper therefore proposes an FPGA-based OBC design and adds a hierarchical fault-tolerance concept to enhance the reliability of the OBC system. The fault-tolerant architecture is divided into three hierarchical levels, including single-CPU reconfiguration,...
Dependability is an integrative concept that encompasses the following attributes: availability (readiness for correct service), reliability (continuity of correct service) and safety (absence of catastrophic consequences for the user(s) and the environment). In this paper we redefine these attributes. We are looking at them not only as concepts but as quantities. That makes it possible to measure...
This paper presents principles and results of dynamic testing of an SRAM-based FPGA using time-resolved fault injection with a pulsed laser. The synchronization setup and experimental procedure are detailed. Fault injection results obtained with a DES crypto-core application implemented on a Xilinx Virtex II are discussed.
Present and future semiconductor technologies are characterized by increasing parameter variations as well as an increasing susceptibility to external disturbances. Transient errors during system operation are no longer restricted to memories but also affect random logic, and a robust design becomes mandatory to ensure reliable system operation. Self-checking circuits rely on redundancy to detect...
The problem of detecting control flow errors in software has been studied extensively in the literature, and many detection techniques have been proposed. These techniques typically have high memory and performance overheads and hence are unusable for real-time embedded systems, which have tight memory and performance budgets. This paper presents two algorithms by which the overheads associated with...
In this paper a BISR architecture for embedded memories is presented. The proposed scheme utilises a multiple bank cache-like memory for repairs. Statistical analysis is used for minimisation of the total resources required to achieve a very high fault coverage. Simulation results show that the proposed BISR scheme is characterised by high efficiency and low area overhead, even for high defect densities...
Networks on chips (NoCs) provide a mechanism for handling complex communications in the next generation of integrated circuits. At the same time, lower yield in nanotechnology makes self-repairing communication channels a necessity in the design of digital systems. This paper proposes a reliable NoC architecture based on a specific application mapped onto an NoC. This architecture is capable of recovering...
In this paper we present a low-cost fault-tolerant attitude determination system for a scientific satellite using COTS devices. We relate our experience in developing the attitude determination system, where we combine proven fault tolerance techniques to protect the whole system, composed only of COTS devices, from the effects produced by transient faults. We detail the failure cases and the detection, reconfiguration...
Scaling in the hardware integration process results in IC-process geometry reductions, lower operating voltages and increased clock speeds. This paper first surveys the reliability obstacles these developments give rise to and then points out that computing systems can no longer be safely assumed to fail only by crashing. Yet this assumption is at the core of primary-backup replication which the literature...
This is an introduction to the proceedings of the MWS 2007 workshop held at EDOC 2007. It first explains the motivation for and background of the workshop. Then, it contains a short description of the keynote, each long and short peer-reviewed paper, and the discussion session "impact of various execution environments on middleware for web services". After the closing statements, MWS 2007...
Many Web services are expected to run with a high degree of security and dependability. To achieve this goal, it is essential to use a Web-services compatible framework that tolerates not only crash faults, but Byzantine faults as well, due to the untrusted communication environment in which the Web services operate. In this paper, we describe the design and implementation of such a framework, called...
As distributed computing systems are applied in a growing range of critical domains, the demand for high reliability and high availability becomes increasingly urgent, and the study of distributed fault-tolerant systems becomes more significant. This paper provides a model of a triple-modular-redundancy dynamic fault-tolerant system, and the reliability...
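For readers unfamiliar with triple modular redundancy (TMR), the sketch below shows the basic idea in its classic static form: three replicas compute the same result and a voter returns the majority value. It does not reproduce the dynamic model of the paper above. The function names are hypothetical, and the reliability formula assumes independent, identical modules and a perfect voter.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Return the value that at least two of the three replicas agree on.

    Hypothetical helper; raises ValueError when all three outputs differ,
    since no majority exists in that case.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three replicas disagree")
    return value


def tmr_reliability(r: float) -> float:
    """Reliability of static TMR built from modules of reliability r.

    The system works if all three modules work (r^3) or exactly two work
    (3 * r^2 * (1 - r)), which simplifies to 3r^2 - 2r^3. Assumes
    independent module failures and a perfect voter.
    """
    return 3 * r**2 - 2 * r**3


# One faulty replica is masked by the other two.
result = majority_vote(42, 42, 7)

# TMR improves on a single module only when r is above 0.5.
improved = tmr_reliability(0.9)   # about 0.972
unchanged = tmr_reliability(0.5)  # equal to the single-module value
```

Under these assumptions, redundancy pays off only when the individual modules are already more reliable than a coin toss, which is one reason dynamic schemes that reconfigure after a detected failure are studied as an alternative.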
A new approach for correcting multiple errors and detecting an additive overflow in the Residue Number System (RNS) is suggested. It works with the code whose redundancy is in the form of magnitude indices. The residue representation of a number with magnitude index is reconsidered. The RNS with magnitude index was first studied by Sasaki [16] and Rao [15] and then by Barsi and Maestrini [5, 6]. The range...