The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Hardware designers and engineers typically need to explore a multi-parametric design space in order to find the best configuration for their designs using simulations that can take weeks to months to complete. For example, designers of special purpose chips need to explore parameters such as the optimal bit width and data representation. This is the case for the development of complex algorithms such...
Next generation video standards have strict and increasing performance demands due to real-time requirements and the trend towards higher frame resolutions and bit rates. Leveraging the advantages of reconfigurable logic and emerging multi-core processor architectures to exploit all levels of parallelism of such workloads is necessary to achieve real time functionality at a reasonable cost.
Accelerators, such as field programmable gate arrays (FPGAs) and graphics processing units (GPUs), are special purpose processors designed to speed up compute-intensive sections of applications. FPGAs are highly customizable, while GPUs provide massive parallel execution resources and high memory bandwidth. In this paper, we compare the performance of these architectures, presenting a performance...
The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this paper, we use OpenCL, an industry supported standard for writing programs that execute on multicore platforms and accelerators such as GPUs. Our architectural synthesis tool, SOpenCL (Silicon-OpenCL), adapts OpenCL into a novel...
This paper presents the implementation of Advanced Audio and Video Standard Part 2: Video (AVS P2), the Chinese video standard, to Diamond 388VDO Video Processor, a heterogeneous dual core Tensilica SIMD processor. Through the process of mapping AVS video decoder to 388VDO we aim to explore and exploit the different forms of parallelism inherent in a video application in order to speedup AVS decoding...
The problem of automatically generating hardware modules from a high level representation of an application has been at the forefront of EDA research in the last few years. Such an EDA methodology would potentially enable the large pool of software engineers and algorithm IP experts without architectural and hardware expertise to design and implement platform systems, thus dramatically reducing time...
Fisheye lenses are often used in scientific or virtual reality applications to enlarge the field of view of a conventional camera. Fisheye lens distortion correction is an image processing application which transforms the distorted fisheye images back to the natural-looking perspective space. This application is characterized by non-linear streaming memory access patterns that make main memory bandwidth...
Hardware accelerators are increasingly used to extend the computational capabilities of baseline scalar processors to meet the growing performance and power requirements of embedded applications. The challenge to the designer is the extensive human effort required to identify the appropriate kernels to be mapped to gates and to implement a network of accelerators to execute the kernels. In this paper,...
As digital imaging becomes more prevalent in consumer products, the industry strives to reduce the cost and the complexity of imaging solutions and, at the same time, to improve the color quality and resolution of the images. The proliferation of imaging in mobile phones, digital cameras, webcams, toys, etc. creates the need for a low cost, small footprint image sensor. Nowadays, sensor optics account...
In this paper the authors propose a pre-synthesis register queue size estimation technique for an unscheduled streaming DFG (sDFG) for pipelined synthesis. Our estimation method first designates a minimum queue size to each communication edge of the sDFG based on the ALAP value of the source node and ASAP value of the sink node of that edge. Our aim is to further refine this initial minimum queue...
One of the major challenges in automated synthesis of reconfigurable accelerators is to create efficient designs that conform to the resource capacity of the target device. This work concerns estimation of the hardware cost before actually attempting the synthesis of streaming accelerators on reconfigurable platforms. Specifically, our proposed framework tackles the problem of presynthesis estimation...
Modern FPGA platforms provide the hardware and software infrastructure for building a bus-based system on chip (SoC) that meet the applications requirements. The designer can customize the hardware by selecting from a large number of pre-defined peripherals and fixed IP functions and by providing new hardware, typically expressed using RTL. Hardware accelerators that provide application-specific extensions...
In this paper, we focus on low-power design techniques for high-performance processors at the architectural and compiler levels. We focus mainly on developing methods for reducing the energy dissipated in the on-chip caches. Energy dissipated in caches represents a substantial portion in the energy budget of today's processors. Extrapolating current trends, this portion is likely to increase in the...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.