With the advent of the big data era, the trend of unsustainable power consumption and large memory-bandwidth demands in massively parallel multicore systems has prompted alternative computation paradigms that exploit heterogeneity, specialization, processor-in-memory, and approximation. Approximate Computing is being touted as a viable solution for high performance computation by relaxing...
Caches are traditionally organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, since working sets settle at the smallest (i.e., fastest and most energy-efficient) level they fit in. However, rigid hierarchies also add overheads, because each level adds latency and energy even...
There has been a growing trend in recent years to outsource various aspects of the semiconductor design and manufacturing flow to different parties spread across the globe. Such outsourcing increases the risk of adversaries adding malicious logic, referred to as hardware Trojans, to the original design. In this paper, we introduce a run-time hardware Trojan detection method for microprocessor cores...
Multicore architectures are increasingly becoming prone to transient faults. In this paper we present Shield, a middleware to provide transactional applications with resiliency to those faults that can happen anytime during the execution of a processor but do not cause any hardware interruption. Shield is inspired by the state machine replication approach, where computational resources are partitioned,...
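The state machine replication idea behind Shield can be illustrated with a minimal sketch (ours, not the paper's implementation): a deterministic operation runs on several logical replicas, and a majority vote on the results masks a transient fault that silently corrupts one replica's output. The `fault_injector` hook is a hypothetical device for modeling such a fault.

```python
# Minimal sketch of state machine replication with majority voting.
# Not the Shield middleware; an illustration of the general approach.
from collections import Counter

def replicated_apply(op, state, replicas=3, fault_injector=None):
    """Run op(state) on `replicas` logical copies and majority-vote.

    fault_injector(i, result) optionally corrupts replica i's result,
    modeling a transient fault that flips an output without crashing.
    """
    results = []
    for i in range(replicas):
        r = op(state)
        if fault_injector is not None:
            r = fault_injector(i, r)
        results.append(r)
    winner, votes = Counter(results).most_common(1)[0]
    if votes <= replicas // 2:
        raise RuntimeError("no majority: too many faulty replicas")
    return winner

# A transient bit-flip in one replica is outvoted by the other two.
flip_one = lambda i, r: r ^ 0x1 if i == 1 else r
print(replicated_apply(lambda s: s * 2, 21, fault_injector=flip_one))  # 42
```

The key property, as in the abstract, is that the fault never causes a hardware interruption: it is detected purely by comparing replica outputs.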
Today, the rapid improvement of process technology and the arrival of new embedded systems with high-performance requirements have shifted the prevailing trend in processor manufacturing from single-core to multi-core processors. This trend has raised several challenges for reliability in safety-critical systems that operate in high-risk environments, making them more vulnerable...
In today's high performance computing (HPC) environments, analyzing and predicting the performance of multiple-processor systems (clusters of cores) on critical workloads remains a challenge, as a result of the many metrics that influence system behavior. Bursty arrivals in HPCs demand either a shared-memory parallel architecture or a pipelined dataflow architecture. At present, a processor model...
Virtualization technology is well established in the server and desktop spaces, and has been spreading across the embedded systems market. This technology allows the coexistence and execution of multiple operating systems on top of the same hardware platform, with proven technological and economic benefits. Hardware extensions for easing virtualization have been added to several commercial off-the-shelf...
Traditionally GPUs focused on streaming, data-parallel applications, with little data reuse or sharing and coarse-grained synchronization. However, the rise of general-purpose GPU (GPGPU) computing has made GPUs desirable for applications with more general sharing patterns and fine-grained synchronization, especially for recent GPUs that have a unified address space and coherent caches. Prior work...
High performance computing systems will need to operate with certain power budgets while maximizing performance in the exascale era. Such systems are built with power aware components, whose collective peak power may exceed the specified power budget. Cluster level power bounded computing addresses this power challenge by coordinating power among components within compute nodes and further adjusting...
The need for faster and more energy efficient computing has led us to the multicore era with distributed shared memory hierarchies. The primary goal is to distribute parallel tasks onto multiple processing elements to collectively achieve shorter execution times at lower frequencies and supply voltages when compared to a single-core architecture. Major challenges of this approach are how to achieve...
Embedded multi-core processors improve performance significantly and are desirable in many application-fields. This in particular includes safety-critical real-time systems, which typically require a deterministic temporal behavior. However, even tasks without dependencies running on different cores can interfere due to, sometimes hidden, shared hardware resources, such as common memories or buses...
In real-time and safety-critical systems, the move towards multicores is becoming unavoidable in order to keep pace with the increasing required processing power and to meet the high integration trend while maintaining a reasonable power consumption. However, multicore platforms may not deliver the expected benefit, and real-time constraints can easily be violated. Indeed, an efficient...
In embedded systems there is a class of Multicore System on Chip devices (MSoC devices) in which not all the computing elements (processor cores) are equal. The differences between the cores of these devices range from different hardware architectures sharing the same instruction set to completely different processors working together inside the same device. These SoCs are called “Asymmetric Multi Processing...
We describe our approach to extend the BEAGLE library for high-performance statistical phylogenetic inference (maximum likelihood estimation and Bayesian analysis) in order to support a wider range of modern accelerators and multicore CPUs, and present the corresponding performance results from these platforms. Our solution includes a shared code design providing a uniform interface for a variety...
In this study, we develop a thermal-aware job scheduling strategy called tDispatch tailored for MapReduce applications running on Hadoop clusters. The scheduling idea of tDispatch is motivated by a profiling study of CPU-intensive and I/O-intensive jobs from the perspective of thermal efficiency. More specifically, we investigate the thermal behaviors of these two types of jobs running on a Hadoop...
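The thermal-aware dispatch idea can be sketched as follows. This is a hedged illustration under our own assumptions, not the tDispatch algorithm: CPU-intensive jobs, which heat a node quickly, go to the coolest eligible node, while I/O-intensive jobs go to the warmest node still under a temperature cap, evening out cluster temperatures.

```python
# Hypothetical thermal-aware dispatcher (names and policy are ours):
# route "cpu" jobs to the coolest node and "io" jobs to the warmest
# node that is still below the temperature cap.
def dispatch(job_kind, node_temps, cap=75.0):
    """Return the index of the chosen node, or None if all are too hot."""
    eligible = [i for i, t in enumerate(node_temps) if t < cap]
    if not eligible:
        return None                       # throttle: no node may take work
    if job_kind == "cpu":
        return min(eligible, key=lambda i: node_temps[i])  # coolest node
    return max(eligible, key=lambda i: node_temps[i])      # warmest node

temps = [68.0, 72.5, 61.2]
print(dispatch("cpu", temps))  # 2 (coolest node)
print(dispatch("io", temps))   # 1 (warmest eligible node)
```

A real scheduler would profile jobs to classify them, as the abstract describes, rather than take the job kind as a given label.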
Power is a critical factor that limits the performance and scalability of modern high performance computer systems. Considering power as a first-order constraint and a scarce system resource, power-bounded computing represents a new perspective to address the power challenge in HPC. In this work we present an application-aware, multi-dimensional power allocation framework to support power-bounded parallel...
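The flavor of power-bounded allocation can be shown with a minimal sketch (our own simplification, not the paper's framework): each component first receives its minimum power, and the remaining budget is spread by weight, clipped at each component's peak.

```python
# Simplified power-budget allocation across node components.
# Assumed policy, not the paper's: minimum first, then weighted share,
# clipped at each component's peak power.
def allocate_power(budget, components):
    """components: list of (min_w, peak_w, weight) tuples; returns watts."""
    alloc = [c[0] for c in components]
    remaining = budget - sum(alloc)
    assert remaining >= 0, "budget below sum of minimum power levels"
    total_weight = sum(c[2] for c in components)
    for i, (mn, peak, w) in enumerate(components):
        alloc[i] = min(peak, mn + remaining * w / total_weight)
    return alloc

# CPU, GPU, DRAM as (min, peak, weight) under a 200 W node budget.
print(allocate_power(200, [(40, 120, 2), (30, 100, 1), (20, 40, 1)]))
```

Note that power freed by clipping (the DRAM hits its 40 W peak here) is not redistributed in this sketch; an application-aware framework would reassign it to the components that benefit most.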
Recent developments in multicore technology have enabled processors with hundreds or thousands of cores. However, on such multicore processors, an efficient hardware cache coherence scheme becomes very complex and expensive to develop. This paper proposes a parallelizing-compiler-directed software coherence scheme for shared memory multicore systems without hardware cache coherence control...
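The general mechanism of software coherence can be sketched in a few lines (a hedged illustration with our own names, not the paper's scheme): without hardware coherence, the compiler inserts explicit write-backs before a synchronization release and cache self-invalidations after an acquire, so cores observe each other's updates at synchronization points.

```python
# Toy model of software-managed coherence: each core has a private,
# incoherent cache; write-back on release and self-invalidate on acquire
# stand in for the operations a parallelizing compiler would insert.
class Core:
    def __init__(self, shared):
        self.shared = shared   # main memory: addr -> value
        self.cache = {}        # private cache, never kept coherent

    def load(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.shared[addr]
        return self.cache[addr]

    def store(self, addr, value):
        self.cache[addr] = value           # stays private until release

    def release(self):                     # inserted before an unlock
        self.shared.update(self.cache)     # write back dirty lines

    def acquire(self):                     # inserted after a lock
        self.cache.clear()                 # self-invalidate stale lines

mem = {"flag": 0}
producer, consumer = Core(mem), Core(mem)
consumer.load("flag")                # consumer caches the old value, 0
producer.store("flag", 1)
producer.release()
consumer.acquire()                   # without this, the stale 0 survives
print(consumer.load("flag"))         # 1
```

The compiler's job, per the abstract, is to place these operations only where the parallelization analysis shows sharing actually occurs.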
In this paper, we explore the pessimistic voltage guardbands of two multicore x86-64 microprocessor chips that belong to different microarchitectures (one ultra-low power and one high-performance microprocessor), when programs are executed on individual cores of the CPU chips. We also examine the energy and temperature gains as positive effects of lowering the voltage in both chips while preserving...
The architectures of large-scale Internet servers are becoming more complex each year in order to store and process a large amount of Internet data (Big Data) as efficiently as possible. One consequence of this continually growing complexity is that individual servers consume a significant amount of energy even when they are idle. In this paper we experimentally investigate the scope and usefulness...
Transactional memory (TM) promises to make parallel programming easier. Many hardware implementations of transactional memory (HTM) have been proposed to improve performance, but they still incur overheads when a transaction either commits or aborts. We have therefore been developing a novel HTM design, called Delayed-Committing TM (DCTM), which enables transactions of arbitrary size...
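The commit/abort costs the abstract refers to can be seen in a simple software sketch of buffered (lazy-versioning) transactions. DCTM itself is a hardware design and is not shown here; this is only an illustration of why commit must publish a write buffer atomically while abort merely discards it.

```python
# Toy lazy-versioning transaction: writes go to a private buffer and
# are published at commit; abort discards the buffer with no rollback.
class Transaction:
    def __init__(self, memory):
        self.memory = memory      # shared memory: addr -> value
        self.writes = {}          # private write buffer

    def read(self, addr):
        # Read-your-own-writes, else fall through to shared memory.
        return self.writes.get(addr, self.memory.get(addr))

    def write(self, addr, value):
        self.writes[addr] = value

    def commit(self):
        self.memory.update(self.writes)   # publish buffered writes
        self.writes = {}

    def abort(self):
        self.writes = {}                  # nothing in memory to undo

mem = {"x": 1}
t = Transaction(mem)
t.write("x", 2)
print(mem["x"])   # 1: the write is still buffered
t.commit()
print(mem["x"])   # 2: published at commit
```

In hardware the publish step is the expensive part for large transactions, which is presumably what a delayed-committing design targets.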