Profile guided optimisation is a common technique used by compilers and runtime systems to shorten execution runtimes and to optimise locality aware scheduling and memory access on heterogeneous hardware platforms. Some profiling tools trace the execution of low level code, whilst others are designed for abstract models of computation to provide rich domain-specific context in profiling reports. We...
Polar decoders are well suited for high-speed software implementations. In this work, we present a framework for generating fully-unrolled software polar decoders with branchless data flow. We discuss the memory layout of data in these decoders and show the optimization techniques used. At 335 Mbps, when decoding a (2048, 1707) polar code, the resulting decoder has more than twice the speed of the...
Quasi-cyclic low-density parity-check (QC-LDPC) codes are used in numerous digital communication and storage systems. Layered LDPC decoding converges faster. To further increase the throughput, multiple block rows of the QC parity check matrix can be included in a layer. However, the maximum achievable clock frequency of the prior multi-block-row layered decoder is limited by the long critical path...
Real-time performance is critical for the successful realization of next generation radar systems. Among these are advanced Active Electronically Scanned Array (AESA) radars in which complex calculations are to be performed on huge sets of data in real-time. Manycore architectures are designed to provide flexibility and high performance essential for such streaming applications. This paper deals with...
Full use of the parallel computation capabilities of present and expected CPUs and GPUs requires use of vector extensions. Yet many actors in data flow systems for digital signal processing have internal state (or, equivalently, an edge that loops from the actor back to itself) that imposes serial dependencies between actor invocations, making vectorization across actor invocations impossible. Ideally,...
This paper introduces a novel multicore scheduling method that leverages a parameterized dataflow Model of Computation (MoC). This method, which we have named Just-In-Time Multicore Scheduling (JIT-MS), aims to efficiently schedule Parameterized and Interfaced Synchronous DataFlow (PiSDF) graphs on multicore architectures. This method exploits features of PiSDF to find locally static regions that exhibit...
Algorithmic noise-tolerance (ANT) is an effective statistical error compensation (SEC) technique for designing energy-efficient digital signal processing systems. A conventional ANT system employs an explicit estimator block to compensate for the large magnitude errors in the main block. The estimator presents area and power overheads, as large as 40% of the main block, to the system. In this paper,...
This paper presents a framework for designing a class of distributed, asynchronous optimization algorithms, realized as signal processing architectures utilizing various conservation principles. The architectures are specifically based on stationarity conditions pertaining to primal and dual variables in a class of generally nonconvex optimization problems. The stationarity conditions, which are closely...
This paper provides examples of various synchronous and asynchronous signal processing systems for performing optimization, utilizing the framework and elements developed in a preceding paper. The general strategy in that paper was to perform a linear transformation of stationarity conditions applicable to a class of convex and nonconvex optimization problems, resulting in algorithms that operate...
Many varied domain experts use LabVIEW as a graphical system design tool to implement DSP algorithms on myriad target architectures. In this paper, we introduce the latest LabVIEW FPGA compiler that enables domain experts with minimum hardware knowledge to quickly implement, deploy, and verify their domain-specific applications on FPGA hardware. We present two compiler techniques that we use to 1)...
The automated computation of muscle volume from MRI of human legs is an open problem in the biomedical imaging community. Such automation has the potential to provide an objective measure of effectiveness of pre- and post-surgery treatments. In this paper, we take a step toward automation by proposing a framework for user interactive segmentation of MRI of human leg muscles. Our framework is built...
Modern nanoscale processes exhibit stochastic behavior that can no longer be ignored. Statistical error compensation (SEC) has shown significant benefits in achieving energy efficiency and error resiliency by embracing the stochastic nature of the underlying process. Approximate computing (AC), on the other hand, employs deterministic designs that produce imprecise results to achieve energy efficiency...
Merging-based sorting networks are an important family of sorting networks. Most merge sorting networks are based on 2-way or multi-way merging algorithms using 2-sorters as basic building blocks. An alternative is to use n-sorters, instead of 2-sorters, as the basic building blocks so as to greatly reduce the number of gates as well as the latency. Based on a modified Leighton's columnsort algorithm,...
Medical imaging has historically been very successful at exposing the patient's anatomy beyond external visibility, thus allowing more efficient and accurate treatments. The field of medicine continues to search for new techniques in order to increase accuracy, reduce complications, enable real-time feedback, allow early detection and reduce human errors. Various medical imaging techniques exist but...
We present a data driven method for high efficiency in a neuro-inspired vision pipeline. Our goal is to reduce low-utility computation arising from duplicated processing. In this paper, we examine two forms of redundant information in image data, spatiotemporal redundancy and channel redundancy. To maximize efficiency, the paper presents a dynamic, configurable approach that limits the computational...