Decision tree ensembles are commonly used in a wide range of applications and are becoming the de facto algorithm for decision-tree-based classifiers. Different trees in an ensemble can be processed in parallel during tree inference, making them a suitable use case for FPGAs. Large tree ensembles, however, require careful mapping of trees to on-chip memory and management of memory accesses. As a result,...
Relational databases provide a wealth of functionality to a wide range of applications. Yet, there are tasks for which they are less than optimal, for instance when processing becomes more complex (e.g., regular expression evaluation, data analytics) or the data is less structured (e.g., text or long strings). With the increasing amount of user-generated data stored in relational databases, there...
Accelerating relational databases in general and SQL in particular has become an important topic given the challenges arising from large data collections and increasingly complex workloads. Most existing work, however, has been focused on either accelerating a single operator (e.g., a join) or in data reduction along the data path (e.g., from disk to CPU). In this paper we focus instead on the system...
The data bus is a major contributor to high power consumption in small-process high-performance systems and in system-on-chip (SoC) designs. This paper presents an analysis of different state-of-the-art techniques for reducing the power of the off-chip memory bus interface and proposes an approach that overcomes some limitations of the state-of-the-art methods. More precisely, the paper introduces a...
Carry chains facilitate the implementation of adders and improve the performance of arithmetic circuits in FPGAs. The latest version of the commonly used open-source Verilog-to-Routing (VTR) CAD flow now enables modelling carry chains in FPGA architectures. However, one of the shortcomings of the existing flow lies in its inability to identify arithmetic operations when described as gate-level circuits...
Field Programmable Gate Arrays (FPGAs) can be customized into application-specific architectures to achieve high performance and energy-efficiency. Unfortunately, they are yet to gain significant adoption by application developers due to their low-level programming model. Moreover, to obtain good performance in an FPGA design, one often needs to correctly parallelize computation and balance the computational...
The proliferation of heterogeneous computing platforms presents the parallel computing community with new challenges. One such challenge entails evaluating the efficacy of such parallel architectures and identifying the architectural innovations that ultimately benefit applications. To address this challenge, we need benchmarks that capture the execution patterns (i.e., dwarfs or motifs) of applications,...
In large-scale datapaths, complex interconnection requirements limit resource utilization and often dominate critical path delay. A variety of scheduling and binding algorithms have been proposed to reduce routing requirements by clustering frequently used sets of operations to avoid longer, inter-operational interconnects. In this paper we introduce a grammar induction approach for datapath synthesis...
The proliferation of heterogeneous computing systems presents the parallel computing community with the challenge of porting legacy and emerging applications to multiple processors with diverse programming abstractions. OpenCL is a vendor-agnostic and industry-supported programming model that offers code portability on heterogeneous platforms, allowing applications to be developed once and deployed...
The problem of automatically generating hardware modules from high-level application representations has been at the forefront of EDA research during the last few years. In this paper, we introduce a methodology to automatically synthesize hardware accelerators from OpenCL applications. OpenCL is a recent industry-supported standard for writing programs that execute on multicore platforms and accelerators...