Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
State-of-the-art studies show that FPGA-based hardware merge sorters (HMSs) can achieve superior performance compared with optimized algorithms on CPUs and GPUs. The performance of any HMS is proportional to its operating frequency (F) and the number of records that can be output each cycle (E). However, all existing HMSs have a problem that F drops significantly with increasing E due to the increase...
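The trade-off described in the abstract above can be illustrated with simple arithmetic. The sketch below assumes throughput is exactly F × E records per second (the abstract only says performance is proportional to both), and the frequencies and widths used are hypothetical values chosen for illustration, not results from the paper.

```python
def hms_throughput(freq_hz: float, records_per_cycle: int) -> float:
    """Records per second, assuming throughput = F * E (proportionality constant of 1)."""
    return freq_hz * records_per_cycle


# Hypothetical design points: doubling E while F drops from 200 MHz to 120 MHz.
narrow = hms_throughput(200e6, 8)   # E = 8 at F = 200 MHz
wide = hms_throughput(120e6, 16)    # E = 16 at F = 120 MHz

speedup = wide / narrow
print(f"narrow: {narrow:.3g} rec/s, wide: {wide:.3g} rec/s, speedup: {speedup:.2f}x")
```

With these assumed numbers, doubling E yields only about a 1.2x gain because of the drop in F, which is the kind of loss the abstract refers to.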
Markov Chain Monte Carlo (MCMC) based methods have been the main tool for Bayesian inference for some years now, and they have recently found increasing application in modern statistics and machine learning. Nevertheless, with the availability of large datasets and the increasing complexity of Bayesian models, MCMC methods are becoming prohibitively expensive for real-world problems. At the heart of these...
Sorting is one of the most fundamental and useful applications in computer science, and continues to be an important tool in analyzing large datasets. An important and challenging subclass of sorting problems involves sorting terabyte-scale datasets with hundreds of billions of records. The conventional method of sorting such large amounts of data is to distribute the data and computation over a cluster...
In this paper, an approach to synthesize compressor trees in High-level Synthesis (HLS) for FPGAs is proposed. Our approach utilizes the bit-level information to improve the compressor tree synthesis. To obtain the bit-level information targeting compressor tree synthesis, a modified bitmask analysis technique based on prior work is proposed. A series of experimental results show that, compared to...
In this paper, we introduce SWiF – Simplified Workload-intuitive Framework – a workload-centric, application programming framework designed to simplify the large-scale deployment of FPGAs in end-to-end applications. SWiF can intelligently mediate access to shared resources by orchestrating the distribution and scheduling of tasks across a heterogeneous mix of FPGA and CPU resources in order to improve...
FPGAs will play a crucial role in the space of customizable accelerators over the next few years. A chief limiting factor is that FPGA CAD tools are cumbersome and time-consuming for most application developers. Routing is the most complex step in the FPGA design flow and is an NP-complete problem. The PathFinder routing algorithm is in dominant use in FPGA CAD research. However, PathFinder is sequential in nature...
Convolutional neural networks (CNNs) have recently broken many performance records in image recognition and object detection problems. The success of CNNs, to a great extent, is enabled by the fast scaling-up of the networks that learn from a huge volume of data. The deployment of big CNN models can be both computation-intensive and memory-intensive, leaving severe challenges to hardware implementations...
Field programmable gate arrays (FPGAs) have been adopted in various fields due to their design flexibility and customizability. Different applications have different requirements in performance, hardware resources and cost, leading to demand for diverse FPGA architectures. Delay is an important metric for evaluating different alternatives during FPGA architecture development. The existing analytical delay...
Field-Programmable Gate Arrays (FPGAs) are susceptible to radiation-induced Single Event Upsets (SEUs). A common technique for dealing with SEUs is Triple Modular Redundancy (TMR) combined with Module-based configuration memory Error Recovery (MER). By triplicating components and voting on their outputs, TMR helps localize the configuration memory errors, and by reconfiguring the faulty component,...
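The voting step mentioned in the abstract above can be sketched in software as a bitwise majority vote over three module outputs. This is a generic illustration of TMR voting, not the paper's implementation; the function name `tmr_vote` and the integer encoding of module outputs are assumptions made for the example.

```python
from typing import Optional, Tuple


def tmr_vote(a: int, b: int, c: int) -> Tuple[int, Optional[int]]:
    """Return the bitwise majority of three outputs and the index of the
    single disagreeing module, or None if zero or several modules disagree."""
    # A bit is set in the result when at least two of the three inputs set it.
    voted = (a & b) | (a & c) | (b & c)
    disagreeing = [i for i, out in enumerate((a, b, c)) if out != voted]
    faulty = disagreeing[0] if len(disagreeing) == 1 else None
    return voted, faulty


# Module 2 has one flipped bit; the vote masks it and points at the module
# that would be the candidate for reconfiguration.
voted, faulty = tmr_vote(0b1010, 0b1010, 0b1110)
print(bin(voted), faulty)
```

Returning the index of the disagreeing module reflects the localization role the abstract attributes to TMR: the voter masks the error, and the identified module is the one to repair.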
Convolutional neural networks (CNNs) have achieved great success in many applications. Recently, various FPGA-based accelerators have been proposed to improve the performance of CNNs. However, most current FPGA-based methods use a single bit-width for all CNN layers, which leads to very low resource utilization efficiency and makes further performance improvement difficult. In this paper, we propose...
We can build lightweight bit-serial FPGA NoC routers that cost 20 LUT, 17 FF per router and operate at 800–900 MHz speeds. Each bit-serial router implements deflection-routing on a unidirectional torus topology requiring 1b-wide connection per port. The key ideas that enable this implementation are (1) reformulation of the dimension-ordered routing (DOR) function using compact 1 LUT, 1 FF streaming pattern...