The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. As the performance density of traditional CPU-centric architectures stagnates, advancing compute capabilities necessitates novel architectural approaches. Near-memory processing (NMP) architectures are reemerging as promising candidates...
Non-Volatile Memories (NVMs) can significantly improve the performance of data-intensive applications. A popular form of NVM is battery-backed DRAM, which is available and in use today with DRAM latency and without the endurance problems of emerging NVM technologies. Modern servers can be provisioned with up to 4 TB of DRAM, and provisioning battery backup to write out such large memories is hard...
Caches are traditionally organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, since working sets settle at the smallest (i.e., fastest and most energy-efficient) level they fit in. However, rigid hierarchies also add overheads, because each level adds latency and energy even...
Heterogeneous memory management combined with server virtualization in datacenters is expected to increase the software and OS management complexity. State-of-the-art solutions rely exclusively on the hypervisor (VMM) for expensive page hotness tracking and migrations, limiting the benefits from heterogeneity. To address this, we design HeteroOS, a novel application-transparent OS-level solution for...
Dedicated hardware accelerators enable energy-efficient implementations of radio and imaging basebands. Multistandard, multi-mode radio basebands require an on-the-fly reconfigurable fast Fourier transform (FFT) accelerator that implements many different FFT sizes. An instance of a runtime-reconfigurable 2^n·3^m·5^k FFT accelerator was generated by a custom hardware generator to meet the requirements of...
High Efficiency Video Coding (HEVC) is the new video compression standard. A novel optimized architecture of Integer Motion Estimation (IME) for HEVC processing 8K video is presented in this paper. This architecture achieves 8K (7680×4320) video in real time at 43 fps (frames per second) with a frequency of 142 MHz and a latency of 402 clock cycles. The proposed design has been synthesized and simulated...
Image compression is of great importance in multimedia system applications because it drastically reduces the bandwidth needed for transmission and the memory needed for storage. Image compression algorithms, such as JPEG2000, utilize the Forward Discrete Wavelet Transform (FDWT) and Inverse Discrete Wavelet Transform (IDWT). The main problems faced by researchers in the hardware implementation of the FDWT/IDWT are storage memory,...
This work presents a hardware implementation of the morphological reconstruction algorithm for biomedical image analysis. The morphological reconstruction algorithm is based on Sequential Reconstruction (SR). In this case, a hardware architecture has been developed and implemented by mapping the SR algorithm onto an Altera Cyclone IV E FPGA-based platform, including a NIOS II processor. The developed...
This paper focuses on the systematic design and development of an optimized embedded system platform from the ground up to handle the special real-time requirements of control applications. An embedded system is usually embedded as a part of a complete device including hardware and mechanical parts, and both the hardware system and the software system are tailored and optimized to realize specific functions...
We present a miniaturized universal hardware module for acoustic pattern recognition in various types of multichannel sensor signals. The module implements configurable signal analysis (signal transforms, filter banks, statistical transforms) and a GMM-HMM recognizer. The main hardware components are an XC7A75T FPGA performing almost all the computations, a TMS320C6746 digital signal processor organizing...
In this paper, we propose a 2-D grouping FIFO-based FFT hardware architecture supporting the 36 different FFT sizes defined in 3GPP-LTE systems. The key design foundation is a hybrid-radix computing kernel engine with 4 configuration types. In a design implementation in TSMC 90-nm CMOS technology, the reconfigurable FFT chip occupies a core area of only 1.51 mm², dissipating...
Universal Filtered Orthogonal Frequency Division Multiplexing (UF-OFDM) is considered one of the main waveform candidates to overcome the challenges facing the next generation of mobile communication systems. Due to its spectral properties, it can support relaxed synchronization, low-latency communications and flexible time transmission interval. Nevertheless, the available recent literature addresses...
Designing wireless industrial communication for factory automation is a serious task because novel radio systems have to compete with well-established wired fieldbus solutions. The requirements to achieve equivalent reliability and latencies impose challenging demands on design and implementation. In this paper, we propose a hardware accelerator for highly adaptive and parallel medium access allowing...
Data compression is a necessary technology in the age of big data. Compared with software compression techniques, hardware compression techniques can improve speed and reduce power consumption. LZMA is a lossless compression technology, and its hardware implementation has broad application prospects. This paper proposes a novel high-performance implementation of the LZMA compression algorithm...
DRAM is a crucial component in computing systems, and is expected to be even more important as data-intensive applications become more prominent. A key challenge in advancing DRAM technology is the growing cost of refresh operations, which can have a large impact on the energy efficiency of DRAM modules. Existing refresh mitigation techniques all require hardware modifications, which may be undesirable...
Network virtualization offers flexibility by decoupling the virtual network from the underlying physical network. Software-Defined Networking (SDN) can make use of the virtual network. For example, in Software-Defined Networks, the entire network can be run on commodity hardware and operating systems that use virtual elements. However, this could present new challenges for data plane performance. In this paper,...
Modern multi-core systems employ a shared-memory architecture, entailing problems related to the main memory such as row-buffer conflicts, time-varying hot-spots across memory channels, and superfluous switches between reads and writes originating from different cores. There have been proposals to solve these problems by partitioning main memory across banks and/or channels such that a DRAM bank is...
The VMware ESXi hypervisor attracts a wide range of customers and is deployed in domains ranging from desktop computing to server computing. While software systems are increasingly moving towards consolidation, hardware has already transitioned to multi-socket Non-Uniform Memory Access (NUMA)-based systems. The marriage of increasing consolidation and multi-socket-based systems warrants low-overhead,...
Over the past few years we have articulated theory that describes ‘encrypted computing’, in which data remains in encrypted form while being worked on inside a processor, by virtue of a modified arithmetic. The last two years have seen research and development on a standards-compliant processor that shows that near-conventional speeds are attainable via this approach. Benchmark performance with the...
This paper introduces the principle of a graphic collection system based on the PCI9054 and a DSP, intended to collect external information rapidly and in real time. It describes the communication between the collection card and the host computer through the PCI9054 interface chip, and introduces a programmable logic module that simplifies the design of the digital logic circuit and implements the conversion of the graphics format,...