The article describes the hardware architecture of computational units for the construction of GLONASS / GPS navigation user equipment. A possible route to improving some architectures based on field programmable gate arrays (FPGAs) is described.
Modular multiplication, addition, and subtraction are the core operations of elliptic curve public-key (ECC) systems, and reducing area and merging structures have been hot topics in recent years. This paper first analyzes the difference between the multiplication type and the addition type of modular multiplier. Then, combined with the structural characteristics of the modular adder, and mixing modular...
Research on decimal arithmetic hardware has accelerated phenomenally in the last decade with the introduction of Decimal Floating Point formats in IEEE 754-2008. Addition, being one of the primitive arithmetic operations, has attracted numerous proposals in the literature involving the standard 8421 BCD code as well as nonstandard decimal digit representation codes (4221, 5211, etc.). This paper concentrates on Fixed...
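As a point of reference for the 8421 BCD code mentioned above, the following is a minimal sketch of the textbook digit-serial BCD addition with the usual "+6" correction. It is not the method of the paper, whose abstract is truncated here; the function name `bcd_add` and the least-significant-digit-first list layout are assumptions made for illustration.

```python
def bcd_add(a_digits, b_digits):
    """Add two equal-length BCD numbers given as lists of digits (0..9),
    least-significant digit first. Returns (sum_digits, carry_out).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(a_digits) != len(b_digits):
        raise ValueError("operands must have the same number of digits")
    result = []
    carry = 0
    for da, db in zip(a_digits, b_digits):
        s = da + db + carry          # 4-bit binary sum, range 0..19
        if s > 9:
            s = (s + 6) & 0xF        # skip the six unused 4-bit codes
            carry = 1
        else:
            carry = 0
        result.append(s)
    return result, carry
```

For example, adding 58 and 47 as `bcd_add([8, 5], [7, 4])` yields the digits of 05 with a carry out of 1, that is, 105.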
This paper presents a new low-complexity architecture for a least-mean-square (LMS) adaptive filter using distributed arithmetic (DA). The DA-based LMS adaptive filter requires lookup tables (LUTs) for the filtering and weight-updating operations, whose complexities grow exponentially with filter order. In the proposed technique, the complexity of the LUT for the DA-based LMS adaptive filter is reduced by two new serial...
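To illustrate why the LUT grows exponentially with filter order, here is a minimal sketch of a conventional bit-serial DA inner product. It is not the reduced-complexity scheme proposed in the paper; the names `build_da_lut` and `da_inner_product`, integer weights, and two's-complement samples of a fixed bit width are all assumptions for illustration.

```python
def build_da_lut(weights):
    """Precompute every partial sum of the weights.

    Entry `addr` holds the sum of weights[k] for each bit k set in addr,
    so a filter of order N needs 2**N entries.
    """
    n = len(weights)
    return [
        sum(w for k, w in enumerate(weights) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(lut, samples, bits):
    """Compute sum(w[k] * x[k]) without multipliers.

    Samples are assumed to be two's-complement integers that fit in `bits`
    bits. One LUT read and one shifted accumulate are done per bit plane.
    """
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] << b
        # The sign bit of two's complement carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc
```

With four weights the table already has 16 entries, and each additional tap doubles it, which is the growth the abstract refers to.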
In this paper, we propose an energy-efficient approximate multiplier design approach. Fundamental to this approach is configurable lossy logic compression, coupled with low-cost error mitigation. The logic compression is aimed at reducing the number of product rows using progressive bit significance, and thereby decreasing the number of reduction stages in Wallace-tree accumulation. This accounts...
This paper presents a comparison between custom fixed-point (FxP) and floating-point (FlP) arithmetic, applied to bidimensional K-means clustering algorithm. After a discussion on the K-means clustering algorithm and arithmetic characteristics, hardware implementations of FxP and FlP arithmetic operators are compared in terms of area, delay and energy, for different bitwidth, using the ApxPerf2.0...
This paper presents area-efficient building blocks for computing the fast Fourier transform (FFT): multiplierless processing elements for computing radix-3 and radix-5 butterflies, and a reconfigurable processing element supporting mixed radix-2/3/4/5 FFT algorithms. The proposed processing elements are based on the Winograd Fourier transform algorithm. However, multiplication is performed by constant...
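For orientation, a radix-3 butterfly in the Winograd style can be sketched as below. This is a generic floating-point illustration and not the multiplierless hardware of the paper; the function names are hypothetical, and the two constants (1.5 and √3/2) are exactly the ones a hardware design would replace with shift-and-add constant multipliers.

```python
import cmath
import math


def winograd_dft3(x0, x1, x2):
    """3-point DFT using two constant multiplications (Winograd style)."""
    t = x1 + x2                       # sum of the two rotated inputs
    d = x1 - x2                       # difference of the two rotated inputs
    X0 = x0 + t
    c = X0 - 1.5 * t                  # equals x0 - t/2
    m = -1j * (math.sqrt(3) / 2) * d  # imaginary rotation term
    return X0, c + m, c - m


def direct_dft3(x0, x1, x2):
    """Reference 3-point DFT evaluated directly from the definition."""
    w = cmath.exp(-2j * cmath.pi / 3)
    xs = (x0, x1, x2)
    return tuple(sum(x * w ** (k * n) for n, x in enumerate(xs)) for k in range(3))
```

Comparing `winograd_dft3` with `direct_dft3` on arbitrary complex inputs is a quick way to check the butterfly before mapping the constants to adders.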
Advances in convolutional neural networks (CNNs) have aroused great interest all over the world. Despite the fact that the number of convolutions in CNNs is proportional to the number of layers, people tend to pursue more remarkable performance by exploiting a deep convolutional neural network (DCNN), leading to large area occupation. With the deeper process involved in large-scale integrated circuits, circuit...
How to effectively cultivate students' practical ability and innovative spirit is the subject of Computer Science in Colleges and Universities, especially for the first-year or second-year undergraduate students. This paper introduces the experimental teaching reform trial of the Digital Logic courses, and sums up the experience of how to stimulate students' awareness of innovation in the hardware...
Large number addition is the fundamental operation in cryptography algorithms. In this paper, we accelerate large addition in hardware design by introducing non-least-positive form, which is beneficial to parallel processing. An implementation of 256-bit signed array accumulator with our method shows an improvement of 18% in speed and 15% in area-delay product compared with traditional design.
Energy efficiency has become a primary concern in the design of multimedia digital systems, particularly when targeting mobile devices. Approximate computing is a highly promising approach to address this challenge. This paper presents an architectural exploration of a variable block size motion estimation (VBSME) architecture using imprecise Lower-Part-OR Adders (LOA). These adders were applied to...
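The behaviour of a Lower-Part-OR Adder can be modelled in a few lines. The sketch below follows the commonly described LOA structure (bitwise OR on the lower bits, an exact adder on the upper bits, and a carry-in formed by ANDing the most significant bits of the two lower parts); the function name `loa_add` and its `width` and `lower` parameters are assumptions for illustration and say nothing about the configuration used in the paper.

```python
def loa_add(a, b, width=8, lower=4):
    """Behavioural model of a Lower-Part-OR Adder for unsigned operands.

    `width` is the operand width and `lower` the number of approximated
    low-order bits. With lower=0 the result is the exact sum.
    """
    a &= (1 << width) - 1
    b &= (1 << width) - 1
    if lower == 0:
        return a + b
    low_mask = (1 << lower) - 1
    # Lower part: OR replaces addition, so no carry chain is needed.
    low = (a | b) & low_mask
    # Carry into the upper part: AND of the top bits of the lower parts.
    carry_in = (a >> (lower - 1)) & (b >> (lower - 1)) & 1
    # Upper part: exact addition.
    high = (a >> lower) + (b >> lower) + carry_in
    return (high << lower) | low
```

For instance, `loa_add(5, 3)` returns 7 rather than 8, which shows the kind of small error traded for a shorter carry chain.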
This paper reviews the hardware requirements of generalised single-photon ADC-less receiver circuits for visible light communications. A receiver based on parallel banks of pulse combiners and pipelined adders is shown to provide over an order of magnitude of circuit area reduction compared with the generalised structure.
The benefits of customising the precision throughout an FPGA design according to a design tolerance are well known. However, customising the precision of a design at runtime has the potential for an even greater performance impact. In this paper, we add the ability to dynamically choose the internal precision of a datapath. This enables a result that is at least as accurate as the worst-case under...
The unpredictable value of the carry bit in the summation of the carry-sum form is an annoying issue that prevents carry-sum from being applied to designs with sign extension. In this paper, we propose a methodology to determine the carry bit of the carry-sum form output generated by a Booth-encoded multiplier without final addition. We discover that this carry bit is a constant "1" in traditional unsigned Modified...
FPGAs are well known for their ability to perform non-standard computations not supported by classical microprocessors. Many libraries of highly customizable application-specific IPs have exploited this capability. However, using such IPs usually requires handcrafted HDL, hence significant design effort. High Level Synthesis (HLS) lowers the design effort thanks to the use of C/C++ dialects for programming...
Magnetic field calculations are by far the most computationally demanding part of a micromagnetic simulation — there are significant efforts to use hardware accelerators (such as GPUs) to speed up calculations. Dedicated hardware, such as FPGAs could offer even higher performance, and flexibility / reprogrammability is usually not a requirement at this level of the computation. In this paper we present...
Future Video Coding (FVC) is a new international video compression standard offering much better compression efficiency than previous video compression standards at the expense of much higher computational complexity. In this paper, an FPGA implementation of FVC 2D transform is proposed. The proposed FVC 2D transform hardware can perform 2D DCT-II, DCT-V, DCT-VIII, DST-I, DST-VII operations for 4×4...
Elliptic curve digital signature algorithm (ECDSA) is similar to Digital signature algorithm (DSA) except that the operations in the former are defined using points on Elliptic curve. This paper presents implementation details of ECDSA in prime fields over NIST recommended field sizes starting from 192 to 521 bits. The implementation uses a hardware-software co-design approach on reconfigurable hardware...
A new technique for computing the truncated cube of an operand at length of power two is proposed, implemented, analyzed, and compared to existing techniques. The new proposed method is comparable to previously proposed methods that compute the cube of an operand in parallel. Post layout results are presented in a 65nm Application Specific Integrated Circuit implementation and are compared against...
This paper presents a novel runtime-reconfigurable, mixed-radix core for computing 2-, 3-, and 4-point fast Fourier transforms (FFT). The proposed architecture is based on the radix-3 Winograd Fourier transform; however, multiplication is performed by constant multiplication instead of a general multiplier. The complexity is equal to multiplierless 3-point FFT in terms of adders/subtractors with the exception...