CPU-FPGA heterogeneous platforms offer a promising solution for high-performance and energy-efficient computing systems by providing specialized accelerators with post-silicon reconfigurability. To unleash the power of FPGA, however, the programmability gap has to be filled so that applications specified in high-level programming languages can be efficiently mapped and scheduled on FPGA. The above...
The deceleration of transistor feature size scaling has motivated growing adoption of specialized accelerators implemented as GPUs, FPGAs, ASICs, and more recently new types of computing such as neuromorphic, bio-inspired, ultra low energy, reversible, stochastic, optical, quantum, combinations, and others unforeseen. There is a tension between specialization and generalization, with the current state...
The stencil pattern is important in many scientific and engineering domains, spurring great interest from researchers and industry. In recent years, various optimizations have been proposed for parallel stencil applications running on GPUs. However, most of the runtime systems that execute those applications often fail to fully utilize the parallelism of modern heterogeneous systems. In this paper,...
To improve the effective utilisation of its supercomputing platforms, the New Zealand eScience Infrastructure (NeSI) offers, in addition to user support and the installation of a comprehensive software stack, a consultancy service to some of its users. Here we present lessons learned from this work and how additional improvements can be made to further enhance productivity of researchers on computing...
This paper presents tools the author uses to enhance student motivation in a microcontroller-based embedded systems course. The course is offered as part of a requirement in a computer engineering degree program. In the traditional lecture-based teaching and learning process, information typically flows in one direction with very little active involvement by the students. There are a number of techniques...
Research on a new solution supporting low latency, high reliability, and scalability is required to deal with ever-increasing demand for wireless communication. However, it is often restricted due to the fact that setting up experiments needs a considerable amount of effort and cost. To alleviate such difficulties, the WiSHFUL project has been established, which proposes an architecture for flexible...
Programming Micron's Automata Processor (AP) requires expertise in both automata theory and the AP architecture, as programmers have to manually manipulate state transition elements (STEs) and their transitions with a low-level Automata Network Markup Language (ANML). When the required STEs of an application exceed the hardware capacity, multiple reconfigurations are needed. However, most previous...
This design provides a solution for communication between two CPUs. A basic simulation circuit is built in Proteus and loaded with a HEX file generated by Keil. The design uses the RS232 interface standard and has a simple circuit structure, which avoids the complicated wiring of parallel communication. It is suited to short-range, low-rate communication.
This work-in-progress paper presents the design and implementation of Midroid, an open mobile platform for microcontroller education. The platform has been designed to address both the problems detected and the educational gaps associated with acquiring the algorithmic thinking needed to conceptualize and structure a given program or algorithm. At the technical level, the platform...
Today's dominant hardware description languages (HDLs), namely Verilog and VHDL, tightly couple design functionality with timing requirements and target device constraints. As hardware designs and device architectures have become increasingly complex, these dominant HDLs yield verbose and unportable code. To raise the level of abstraction, several high-level synthesis (HLS) tools were introduced,...
Field-Programmable Gate Arrays (FPGAs) are gaining considerable momentum in mainstream high-performance systems in recent years due to their flexibility and low power consumption. Still, FPGAs remain largely unavailable to software programmers due to programming and debugging difficulties that are inherent to standard Hardware Description Languages. The performance that hardware-oblivious software...
This paper evaluates the resurgence of FPGAs for hardware acceleration, applied to the back-projection operator used in iterative reconstruction algorithms for computed tomography. We focus on the tools developed by FPGA manufacturers, in particular the Intel FPGA SDK for OpenCL, which promises a new level of hardware abstraction from the developer's perspective, allowing a...
The Thread-Level Speculation (TLS) mechanism has been extensively studied owing to its capability to simplify parallel programming and achieve effective performance speedup. In this paper, we investigate how to improve current TLS models for high efficiency on present multi-core architectures. In particular, we propose a new TLS model called Cache Copy-on-Write (CCoW). The main features of our...
Temporal predictability is a crucial requirement for hard real-time applications. Thus, deterministic software execution flows are commonly sought to meet that requirement. However, as an apparently unavoidable contradiction to this approach in today's embedded systems, both IRQs and concurrently running tasks are also required to react to dynamic environments and to allow the modular composition...
A library design is presented that aims to standardize C language programming on microcontroller-based platforms. This library defines a simplified Application Programming Interface (API) that abstracts the most common modes of use of typical peripherals found in microcontrollers currently on the market. In this way, it becomes possible to program them without needing to know details of the underlying...
Transactional Memory (TM) promises both to provide a scalable mechanism for synchronization in concurrent programs, and to offer ease-of-use benefits to programmers. The most straightforward use of TM in real-world programs is in the form of Transactional Lock Elision (TLE). In TLE, critical sections are attempted as transactions, with a fall-back to a lock if conflicts manifest. Thus TLE expects...
The home-grown SW26010 many-core processor enabled the production of China’s first independently developed number-one ranked supercomputer – the Sunway TaihuLight. Its limited off-chip memory bandwidth, however, renders the SW26010 a highly memory-bound processor. To compensate for this limitation, the processor was designed with a unique hardware feature, "Register Level Communication"...
This paper presents an overview of a storage system software stack designed specifically for extreme scale data centric computing. The storage system software stack is designed to work for next generation storage system architectures that will need to incorporate multiple data storage device technologies. We also envision such data centric storage systems to have in-storage compute capability to make...
Heterogeneous computing platforms containing a wide range of computing resources, from CPUs to specialized hardware accelerators, are the trend today, driven by the physical limits on processor speed and the increasing demand for computing performance. Hence many optimization strategies are studied to achieve better throughput and lower energy consumption in heterogeneous systems. Various memory...
Two issues are addressed in this paper: 1) the formal definitions of the concepts relevant to program faults, and 2) the comparison and classification of program fault-tolerant abilities. We first analyze the subtle differences among these basic concepts (faults, errors, and failures) and give their formal definitions using the state-based theory of program behavior; and then we propose...