The iterative property of the inverse butterfly permutation network makes it possible to implement shift operations with a simple routing algorithm, which has high application value in cryptography, digital image processing and other fields. Based on the inverse butterfly network, this paper proposes a subword shift unit, which integrates the operations of subword rotation shift, subword logical shift and...
Decimal arithmetic hardware research has accelerated phenomenally in the last decade with the introduction of decimal floating-point formats in IEEE 754–2008. Addition, being one of the primitive arithmetic operations, has attracted numerous proposals in the literature involving the standard 8421 BCD code as well as nonstandard decimal digit representation codes (4221, 5211, etc.). This paper concentrates on Fixed...
This paper outlines the feasibility of detecting epilepsy through low-cost and low-energy dedicated hardware with bit-serial processing. The concept of a novel bit-serial data processing unit (DPU) is presented which implements the functionality of a complete neuron. The proposed approach has been tested using various network configurations and compared with related work. The proposed DPU uses only...
This paper presents the design of a Coarse-grained Reconfigurable Architecture (CGRA), called MUSRA (Multimedia Specific Reconfigurable Architecture). The MUSRA is proposed to exploit multi-level parallelism of the computation-intensive loops in multimedia processing applications. To solve the huge bandwidth requirement of parallel processing arrays, the proposed architecture focuses on the exploitation...
We have been experiencing two very important movements in computing. On the one hand, a tremendous amount of resources has been invested in innovative applications such as first-principle-based methods, deep learning and cognitive computing. On the other hand, the industry has been taking a technological path where application performance and energy efficiency vary by more than two orders of magnitude...
Making intelligent embedded systems available to assist classification applications has been a great challenge in the machine learning field over the last few decades. Extreme Learning Machine (ELM) is one of the best learning methods for such implementations due to its classification accuracy and speed. The main computational effort of ELM is computing the pseudo-inverse of the hidden layer output. This work presents...
Many-core systems are increasingly popular in embedded systems due to their high performance and flexibility to execute different workloads. These many-core systems provide a rich processing fabric but lack the flexibility to accelerate critical operations with dedicated hardware cores. Modern Field Programmable Gate Arrays (FPGAs) have evolved into more than reconfigurable devices, providing embedded hard-core...
Today, both the rapid improvement of process technology and the arrival of new embedded systems with high-performance requirements have shifted the trend in processor manufacturing from single-core processors to multi-core processors. This trend has raised several challenges for reliability in safety-critical systems that operate in high-risk environments, making them more vulnerable...
Modern multi-core systems employ a shared-memory architecture, entailing problems related to the main memory such as row-buffer conflicts, time-varying hot-spots across memory channels, and superfluous switches between reads and writes originating from different cores. There have been proposals to solve these problems by partitioning main memory across banks and/or channels such that a DRAM bank is...
Many software mechanisms for geophysical exploration in the oil and gas industry are based on wave propagation simulation. To perform such simulations, state-of-the-art HPC architectures are employed, generating results faster and with more accuracy at each generation. The software must evolve to support the new features of each design to keep performance scaling. Furthermore, it is important to understand...
Virtualization technology is well established in the server and desktop spaces, and has been spreading across the embedded systems market. This technology allows for the coexistence and execution of multiple operating systems on top of the same hardware platform, with proven technological and economic benefits. Hardware extensions for easing virtualization have been added into several commercial off-the-shelf...
The work describes a flexible framework built to generate various (parallel) software versions and to benchmark them. The framework is written in Python, with some support from the gnuplot plotting program. An example of the use of this tool shows the tuning of a matrix factorization on different architectures (Intel Haswell and Intel Knights Corner) with various parameters...
As an alternative to adding more and more instructions to CPU cores in order to address a wide range of applications, this paper examines the use of a mixed-grained CPU interlay fabric to provide reconfigurable instruction set extensions. Specifically, we examine replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA fabric. We show that by applying a set of optimizations,...
FPGAs are well known for their ability to perform non-standard computations not supported by classical microprocessors. Many libraries of highly customizable application-specific IPs have exploited this capability. However, using such IPs usually requires handcrafted HDL, hence significant design effort. High Level Synthesis (HLS) lowers the design effort thanks to the use of C/C++ dialects for programming...
While the memory footprints of cloud and HPC applications continue to increase, fundamental issues with DRAM scaling are likely to prevent traditional main memory systems, composed of monolithic DRAM, from greatly growing in capacity. Hybrid memory systems can mitigate the scaling limitations of monolithic DRAM by pairing together multiple memory technologies (e.g., different types of DRAM, or DRAM...
Large-scale graph processing is attracting more and more attention and has been widely applied in many application domains. The FPGA is a promising platform for implementing graph processing algorithms with high power efficiency and parallelism. In this paper, we propose OmniGraph, a scalable hardware accelerator for graph processing. OmniGraph can process graphs with different sizes adaptively and is adaptable...
Diversity and subdiversity-oriented systems applied in safety-critical industry systems are analyzed through the use of the classification scheme described in standard NUREG7007. This classification is specified considering diversity of hardware and FPGA designs. In particular, diversity of hard logic and soft processors, interfaces and buses, self-diagnostic means, etc., is described. Impact of...
Achieving system fairness is a major design concern in current multicore processors. Unfairness arises due to contention in the shared resources of the system, such as the LLC and main memory. To address this problem, many research works have proposed novel cache partitioning policies aimed at addressing system fairness without harming performance. Unfortunately, existing proposals targeting fairness...
Open source hardware projects are becoming more and more common. OpenRISC SOC, one of the most prominent of these projects, has become quite popular with the support of volunteer developers. In this work, we have demonstrated the design of a DES (Data Encryption Standard) based system, which can be used in security applications, on ORPSoC-v2 (OpenRISC Reference Platform System-on-Chip). Additionally, we...
This paper presents an integrated design environment (IDE) for embedded fault-tolerant processor systems. It takes in a processor core IP and the embedded software which is to be executed on the given processor, and turns them into a fault-tolerant system with various hardware and software mechanisms, subject to the designer's selection. The hardware options include dual redundancy for the processor core,...