We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming...
Flash memory-based storage devices are used in a wide range of systems, from small mobile devices to large-scale system servers. The performance demand from applications and the technology of flash memory vary widely from one system to another, making it difficult to design a universal flash memory scheduler for all systems. In this paper, we present a framework for efficient and flexible flash memory scheduling...
A central processing unit (CPU) and peripheral devices are discussed for which all data processing and data transfer are uniquely time tagged using a timestamp generated by the embedded processing system master clock. The Time Aware Processor (TAP) introduces time into the processor computing language to relate data to temporal events, including the processor's own internal functions.
Reliability evaluation is a critical task in computing systems. On the one hand, the results must be accurate enough not to under- or overestimate the overall system reliability (thus resulting in either a non-reliable system or a system for which overly expensive solutions have been adopted). On the other hand, the time required for the analysis should be kept to a minimum. This paper presents some...
This paper discusses the implementation of a low-area, high-performance elliptic curve cryptographic processor. The proposed architecture comprises a full-precision multiplier with two-stage segmented pipelining, which reduces the number of clock cycles by avoiding data dependencies. The multiplier uses a modified Montgomery algorithm for point multiplication. The simulation results are obtained using...
Recent activity in near-data processing has built or proposed systems that can exploit technologies such as 3D stacks, in-situ computing, or dataflow devices. However, little effort has been applied to exploit the natural parallelism and throughput of DRAM. This article details research from Micron Technology in the area of processing in memory as a form of memory-centric computing. In-Memory Intelligence...
Pairing-based cryptography has received a lot of attention in recent years, since the proposal of the tripartite key exchange. The best type of pairing is the optimal ate pairing over Barreto-Naehrig curves, which is based on two steps: the Miller Loop and the final exponentiation. Most research has focused on the Miller Loop. In this paper, we present the different methods for computing the hard part of...
A brief review of Protected Execution Mode (PEM) for user-space applications, featured in the Elbrus architecture, is given first. Then AddressSanitizer, a well-known utility by Google Inc, is considered as an example of a pure software technique of memory control. A comparative analysis of these solutions is given with respect to performance flaws, applicability, and boundary violation detection quality.
The paper proposes an approach to instruction stream generation for verification of microprocessor designs. The approach is based on using formal specifications of the instruction set architecture as a source of knowledge about the design under verification. This knowledge is processed with generic engines implementing an extensible set of generation strategies to produce stimuli in the form of instruction...
HW/SW co-designed processors are currently attracting renewed interest due to their capability to boost performance without running into the power and complexity walls. By employing a software layer that performs dynamic binary translation and applies aggressive optimizations through exploiting the runtime application behavior, these hybrid architectures provide better performance/watt. However, a poorly...
Hardware companies conduct extensive testing and verification during the processor design process to reduce the number of errata that persist to the final product. These processes rely on a specification against which to test or verify the design; as a result, they will fail to catch vulnerabilities stemming from errors in the specification itself. In this work we present a model-checking based approach...
The main goal of this paper is to expose the community to past achievements and possible future uses of Instruction Set Extension (ISE) in security applications. Processor customization has proven to be an effective way to achieve high performance with limited area and energy overhead for several applications, ranging from signal processing to graphical computation. Concerning cryptographic algorithms,...
Developing new methods to evaluate software reliability at an early design stage of the system can save design cost and effort, and will positively impact product time-to-market. This paper introduces a new approach to evaluate, at an early design phase, the reliability of a computing system running software. The approach can be used when the hardware architecture is not completely defined...
Current ultra-high-performance computers execute instructions at the rate of roughly 10 PFLOPS (10 quadrillion floating-point operations per second) and dissipate power in the range of 10 MW. The next generation will need to execute instructions at EFLOPS rates, 100×...
This paper presents four different architectures for the hardware acceleration of axis-parallel, oblique and non-linear decision tree ensemble classifier systems. Hardware architectures for the implementation of a number of ensemble combination rules are also presented. The proposed architectures are optimized for size, making them particularly interesting for embedded applications where the size...
Conventional oscilloscope software is developed in a "top-down" design mode, namely, the overall software architecture is designed first. In such a mode, the software modules are divided at the operation and display level and do not correspond to hardware modules. Therefore, the software cannot be ported together with the hardware modules during porting. In this paper, the author puts forward...
Heterogeneous dynamic computing platforms are one of the big trends in today's electronic world. These platforms typically feature different general-purpose processors (GPPs) combined with accelerators on a reconfigurable layer. However, this necessitates specialized programming models and an Operating System (OS) for dealing with the dynamicity. To allow the early development of the system software,...
Traditional microprocessors have long benefited from the transistor density gains of Moore's law. Diminishing transistor speeds and practical energy limits, however, have created new challenges in technology, where the exponential performance improvements we have been accustomed to from previous computing generations are slowly coming to an end. These factors signify that while transistors continue to...
We describe a tool and methodology for extracting short and effective functional tests from long-running commercial programs and manufacturing system tests for testing microprocessors and SOCs. The tool combines a fast Instruction Set Architecture (ISA) simulator with the Design for Test (DFT) capabilities of the microprocessor to enable tracing of long-running workloads. The trace is then converted into...
Data-parallel architectures must provide efficient support for complex control-flow constructs to support sophisticated applications coded in modern single-program multiple-data languages. As these architectures have wide data paths that process a single instruction across parallel threads, a mechanism is needed to track and sequence threads as they traverse potentially divergent control paths through...