Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called BUCKWILD! that uses both asynchronous execution and low-precision...
Cryptographic chips are widely used in government, military, finance, business, and other fields, so their security requirements are very high. The globalization of integrated circuit supply chains has promoted the rapid development of the industry, but it has also left chips vulnerable to malicious modification by attackers, namely the implantation of hardware Trojans. The paper proposed a hardware Trojan efficient...
In this work, we propose an efficient architecture for the hardware realization of deep neural networks on reconfigurable computing platforms such as FPGAs. The proposed neural network architecture employs a single physical computing layer to perform the whole computational fabric of fully-connected feedforward deep neural networks with a customizable number of layers, number of neurons per layer...
Modular multiplication, addition, and subtraction are the core operations of elliptic curve public-key (ECC) systems, and reducing area and merging structures have been hot topics in recent years. This paper first analyzes the difference between the multiplication type and the addition type of modular multiplier. Then, combined with the structural characteristics of the modular adder, and mixing modular...
The iterative property of the inverse butterfly permutation network makes it possible to implement shift operations with a simple routing algorithm, which has high application value in cryptography, digital image processing, and other fields. Based on the inverse butterfly network, this paper proposes a subword shift unit, which integrates the operations of subword rotation shift, subword logical shift and...
Considering the high precision and small load of the 3D printer motion control system, this paper proposed a method of coordinated control for multi-dimensional stepper motors. The overall design of the control system is given; the Bresenham algorithm is extended from two dimensions to multiple dimensions, realizing cooperative movement of the multidimensional stepper motor;...
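The abstract above is truncated and does not show the authors' formulation, so the following is only a minimal sketch of one common way to extend Bresenham's line algorithm to N axes. The function name `bresenham_nd` and the choice of driving every axis from the axis with the largest travel are assumptions made for illustration, not the paper's method. Each yielded point differs from the previous one by at most one step per axis, which is how it would map to one pulse per stepper motor.

```python
from typing import Iterator, Sequence, Tuple


def bresenham_nd(start: Sequence[int], end: Sequence[int]) -> Iterator[Tuple[int, ...]]:
    """Yield integer grid points on the line from start to end, inclusive.

    Hypothetical helper: the axis with the largest travel advances on every
    iteration, and each other axis keeps its own integer error term.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same dimension")
    n = len(start)
    deltas = [abs(e - s) for s, e in zip(start, end)]
    steps = [1 if e >= s else -1 for s, e in zip(start, end)]
    major = max(range(n), key=lambda i: deltas[i])  # driving axis
    dmax = deltas[major]
    # Classic Bresenham decision variable, one per axis.
    errors = [2 * d - dmax for d in deltas]
    point = list(start)
    yield tuple(point)
    for _ in range(dmax):
        for i in range(n):
            if i == major:
                continue
            if errors[i] > 0:
                point[i] += steps[i]      # this axis takes a step now
                errors[i] -= 2 * dmax
            errors[i] += 2 * deltas[i]
        point[major] += steps[major]      # driving axis always steps
        yield tuple(point)
```

For example, `list(bresenham_nd((0, 0, 0), (5, -3, 2)))` produces six points that start at the origin and end exactly at `(5, -3, 2)`, using integer additions and comparisons only.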
This work presents a low-area scalable architecture for the Depth Modelling Mode 1 (DMM-1) encoder of the 3D High Efficiency Video Coding (3D-HEVC) standard, removing the refinement stage. This simplification causes a small BD-rate increase (0.09%) but a significant reduction in memory usage of 30%. The scalable architecture can support different block sizes. Synthesis results for ST 65 nm Standard...
This work presents a hardware implementation of the morphological reconstruction algorithm for biomedical image analysis. The morphological reconstruction algorithm is based on the Sequential Reconstruction (SR). In this case, a hardware architecture has been developed and implemented by mapping the SR algorithm onto an Altera Cyclone IV E FPGA-based platform, including a NIOS II processor. The developed...
This paper presents a complete on-chip ADC BIST solution based on a segmented stimulus error identification algorithm known as USER-SMILE. By adapting the algorithm for efficient hardware realization, the solution is implemented for a 1 Msps 12-bit SAR ADC on a 28 nm CMOS automotive microcontroller. While sufficient test accuracy is demonstrated, the solution is further extended to correct linearity...
In this paper we focus on the issues of hardware implementation of genetic algorithms (GA). In their classic implementation, genetic algorithms search for a global minimum or maximum of a multidimensional function called the fitness function. If the problem, i.e. the fitness function, is too complex for a brute force search, we can look for a solution based on GA. In this situation...
Advances in convolutional neural networks (CNNs) have aroused great interest all over the world. Despite the fact that the number of convolutions in CNNs is proportional to the number of layers, people tend to pursue more remarkable performance by exploiting a deep convolutional neural network (DCNN), leading to large area occupation. With the deeper process involved in large-scale integrated circuits, circuit...
In this paper, belief propagation (BP) detection based on the max-sum (MS) algorithm for massive multiple-input multiple-output (MIMO) systems is proposed to reduce the computational complexity of general belief propagation. Because MS relies on an approximation strategy, its complexity reduction comes at the expense of detection performance. Based on the MS algorithm, two effective approaches are...
Data compression is a necessary technology in the age of big data. Compared with software compression techniques, hardware compression techniques can improve speed and reduce power consumption. LZMA is a lossless compression technology, and its hardware implementation has broad application prospects. This paper proposes a novel high-performance implementation of the LZMA compression algorithm...
In this paper, a low-cost accelerator for the η_T pairing in characteristic three over supersingular elliptic curves is designed. As the critical operations of the η_T pairing, the cubing and sparse multiplications over GF(3^{6m}) in Miller's algorithm are merged, and their arithmetic is modified and scheduled to reduce the overhead related to intermediate data. With these optimizations, the Miller's...
Large number addition is a fundamental operation in cryptographic algorithms. In this paper, we accelerate large-number addition in hardware by introducing a non-least-positive form, which is beneficial to parallel processing. An implementation of a 256-bit signed array accumulator with our method shows an improvement of 18% in speed and 15% in area-delay product compared with a traditional design.
In this paper, we propose two different hardware structures for the SHA-3 hash algorithm, targeting different circuit interface widths. Both support the four SHA-3 functions SHA3-224/256/384/512. The padding unit of our design is also implemented in hardware instead of software. In addition, a 3-round-in-1 structure is proposed to increase the throughput of our circuit. We conduct an implementation...
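As a software reference for what a padding unit has to produce, the sketch below applies the SHA-3 padding rule from FIPS 202 (domain suffix plus pad10*1, byte-aligned messages only). It is an illustration, not the paper's circuit; the helper names `sha3_rate_bytes` and `sha3_pad` are hypothetical. The four variants mentioned above differ here only in their rate, which is 200 bytes minus twice the digest length.

```python
import hashlib


def sha3_rate_bytes(digest_bits: int) -> int:
    """Rate (block size) in bytes for SHA3-224/256/384/512."""
    if digest_bits not in (224, 256, 384, 512):
        raise ValueError("unsupported SHA-3 digest size")
    # Keccak-f[1600] state is 200 bytes; capacity is twice the digest size.
    return 200 - 2 * (digest_bits // 8)


def sha3_pad(message: bytes, digest_bits: int) -> bytes:
    """Pad a byte-aligned message to a whole number of rate-sized blocks."""
    rate = sha3_rate_bytes(digest_bits)
    pad_len = rate - (len(message) % rate)  # always between 1 and rate
    padding = bytearray(pad_len)
    padding[0] ^= 0x06    # SHA-3 domain bits followed by the first pad bit
    padding[-1] ^= 0x80   # final pad bit; yields 0x86 when pad_len == 1
    return message + bytes(padding)


# The standard library exposes the same four functions for cross-checking
# digest sizes against a hardware model.
REFERENCE = {
    224: hashlib.sha3_224,
    256: hashlib.sha3_256,
    384: hashlib.sha3_384,
    512: hashlib.sha3_512,
}
```

A message whose length is one byte short of a block boundary receives the single padding byte `0x86`, a corner case that a hardware padding unit has to handle separately from the general two-byte pattern.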
Information systems have become a necessary component of the modern world. They encompass all areas of human activity. Each area imposes its own individual specifics, so each system must solve a number of problems unique to it. Fuzzy methods offer an optimal way to overcome the difficulties that arise in any information system.
The article is devoted to the search for ways to protect data privacy in the Internet of Things. Different technologies and algorithms for privacy protection are considered, and their computing and energy intensity are investigated. Prototyping of the Internet of Things is performed on an FPGA. Searching for ways to minimize energy consumption and computing costs of the Internet of things with algorithms...
The general system properties of distributed computer systems implemented in the global computing environment are analyzed. The reasons for the reproduction of heterogeneity in this environment, and for its vulnerability to unauthorized exposure of executable programs, are revealed. The principles of the formation in it of the universally programmable and cybersecurity algorithmic space for distributed computing by network...
The Toom-Cook algorithm is a well-known method for large integer multiplication. In this paper, we propose a 272-bit multiplier based on the Toom-Cook algorithm and complete its hardware implementation. Synthesized with Synopsys Design Compiler in the SMIC 65nm CMOS process, the design based on Toom-Cook can achieve at least 22.9% less on area and 43.4% less on...
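The abstract does not say which Toom-Cook variant or evaluation points the design uses, so the sketch below assumes one level of Toom-3 with the points 0, 1, -1, -2 and infinity and Bodrato's interpolation sequence. The function name `toom3_multiply` is hypothetical, and the pointwise products fall back on Python's built-in multiplication instead of recursing.

```python
def toom3_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers with a single level of Toom-3."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    # Limb width: each operand is split into three k-bit pieces.
    k = max(1, -(-max(a.bit_length(), b.bit_length()) // 3))
    mask = (1 << k) - 1

    def split(x):
        return x & mask, (x >> k) & mask, x >> (2 * k)

    def evaluate(x0, x1, x2):
        # Values of x0 + x1*t + x2*t^2 at t = 0, 1, -1, -2, infinity.
        return (x0, x0 + x1 + x2, x0 - x1 + x2, x0 - 2 * x1 + 4 * x2, x2)

    pa = evaluate(*split(a))
    pb = evaluate(*split(b))
    # Five limb-sized products instead of the nine of the schoolbook method.
    r_0, r_1, r_m1, r_m2, r_inf = (u * v for u, v in zip(pa, pb))

    # Interpolation (Bodrato's sequence); every division below is exact.
    c0 = r_0
    c4 = r_inf
    c3 = (r_m2 - r_1) // 3
    c1 = (r_1 - r_m1) // 2
    c2 = r_m1 - r_0
    c3 = (c2 - c3) // 2 + 2 * r_inf
    c2 = c2 + c1 - c4
    c1 = c1 - c3

    # Recombine the five product coefficients at their limb offsets.
    return (c0 + (c1 << k) + (c2 << (2 * k))
            + (c3 << (3 * k)) + (c4 << (4 * k)))
```

With 272-bit operands, this split gives 91-bit limbs, so the five pointwise multiplications operate on values only a few bits wider than that; the savings in multiplier width are what a hardware design would trade against the extra adders and the exact divisions by 2 and 3.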