Search results

chapter

Understanding and optimizing asynchronous low-precision stochastic gradient descent

Christopher De Sa, Matthew Feldman, Christopher Re, Kunle Olukotun

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 561 - 574

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA)

Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called BUCKWILD! that uses both asynchronous execution and low-precision...

chapter

An efficient triggering method of hardware Trojan in AES cryptographic circuit

Xin Chuan, Yingjian Yan, Yilun Zhang

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM) > 91 - 95

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM)

Cryptographic chips are widely used in government, military, finance, business and other fields, so their security requirements are very high. The globalization of integrated circuit supply chains has promoted the rapid development of the industry, but it also leaves chips vulnerable to malicious modification by attackers, namely the implantation of hardware Trojans. The paper proposed a hardware Trojan efficient...

chapter

Deep neural network accelerator based on FPGA

Thang Viet Huynh

2017 4th NAFOSTED Conference on Information and Computer Science > 254 - 257

2017 4th NAFOSTED Conference on Information and Computer Science

In this work, we propose an efficient architecture for the hardware realization of deep neural networks on reconfigurable computing platforms like FPGA. The proposed neural network architecture employs only one single physical computing layer to perform the whole computational fabric of fully-connected feedforward deep neural networks with customizable number of layers, number of neurons per layer...

chapter

Research and design of add-based length-scalable dual-field modular multiplication-addition-subtraction

Jiamin Li, Zibin Dai, Wei Li, Suwen Yi, more

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM) > 48 - 52

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM)

Modular multiplication, addition, and subtraction are the core operations of elliptic curve public-key (ECC) systems, so reducing area and merging structures have been hot topics in recent years. This paper first analyzes the difference between multiplication-type and addition-type modular multipliers. Then, combined with the structural characteristics of the modular adder, and mixing modular...

chapter

Research and design of subword shift unit based on inverse butterfly network

Pengfei Hou, Zibin Dai, Junwei Li, Chao Ma

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM) > 330 - 334

2017 2nd IEEE International Conference on Integrated Circuits and Microsystems (ICICM)

The iterative property of the inverse butterfly permutation network makes it possible to implement shift operations with a simple routing algorithm, which has high application value in cryptography, digital image processing and other fields. Based on the inverse butterfly network, this paper proposes a subword shift unit, which integrates the operations of subword rotation shift, subword logical shift and...

chapter

Design of multi-step stepper motor coordinated control system based on Bresenham algorithm

Min Dai, Yuan Chen, Chaoqiang Zheng, Guo Yiming

2017 24th International Conference on Mechatronics and Machine Vision in Practice (M2VIP) > 1 - 5

2017 24th International Conference on Mechatronics and Machine Vision in Practice (M2VIP)

Considering the high precision and small load of the 3D printer motion control system, this paper proposes a method for coordinated control of multi-dimensional stepper motors. The overall design of the control system is given; the Bresenham algorithm is extended from two dimensions to multiple dimensions, realizing cooperative movement of the multidimensional stepper motor;...
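
The abstract above is truncated and does not give the authors' formulation, so the following is only a minimal sketch of one common way to generalize the Bresenham line algorithm from two axes to N axes. The function name `bresenham_nd` and its interface are hypothetical and are not taken from the paper. The axis with the largest travel acts as the driving axis; every axis keeps an integer error term that decides on each tick whether that axis advances by one step.

```python
from typing import List, Sequence, Tuple


def bresenham_nd(start: Sequence[int], end: Sequence[int]) -> List[Tuple[int, ...]]:
    """Return the integer points on the line from start to end, inclusive.

    Illustrative N-axis generalization of Bresenham's line algorithm
    (hypothetical helper, not the paper's implementation).
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same number of axes")

    deltas = [abs(e - s) for s, e in zip(start, end)]        # travel per axis
    steps = [1 if e >= s else -1 for s, e in zip(start, end)]  # direction per axis
    n = max(deltas, default=0)  # tick count, set by the longest (driving) axis

    # One integer error term per axis, as in the 2D algorithm (2*dy - dx).
    errors = [2 * d - n for d in deltas]

    point = list(start)
    path = [tuple(point)]
    for _ in range(n):
        for axis, d in enumerate(deltas):
            if errors[axis] > 0:
                point[axis] += steps[axis]   # this axis advances on this tick
                errors[axis] -= 2 * n
            errors[axis] += 2 * d
        path.append(tuple(point))
    return path


if __name__ == "__main__":
    for p in bresenham_nd((0, 0, 0), (4, -2, 1)):
        print(p)
```

In a motor-control setting, each tick of the outer loop would correspond to one pulse period, and an axis advancing on a tick would correspond to a step pulse on that motor; that mapping is an assumption made here for illustration only.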

chapter

Low-area scalable hardware architecture for DMM-1 encoder of 3D-HEVC video coding standard

Gustavo Sanchez, Luciano Agostini, Filipo Mor, Cesar Marcon

2017 30th Symposium on Integrated Circuits and Systems Design (SBCCI) > 36 - 40

2017 30th Symposium on Integrated Circuits and Systems Design (SBCCI)

This work presents a low-area scalable architecture for the Depth Modelling Mode 1 (DMM-1) encoder of the 3D High Efficiency Video Coding (3D-HEVC) standard, removing the refinement stage. This simplification causes a small BD-rate increase (0.09%) but a significant reduction in memory usage of 30%. The scalable architecture can support different block sizes. Synthesis results for ST 65 nm Standard...

chapter

Efficient hardware implementation of morphological reconstruction based on sequential reconstruction algorithm

Oscar Anacona-Mosquera, Gustavo Vinhal, Renato C. Sampaio, George Teodoro, more

2017 30th Symposium on Integrated Circuits and Systems Design (SBCCI) > 162 - 167

2017 30th Symposium on Integrated Circuits and Systems Design (SBCCI)

This work presents a hardware implementation of the morphological reconstruction algorithm for biomedical image analysis. The morphological reconstruction algorithm is based on the Sequential Reconstruction (SR). In this case, a hardware architecture has been developed and implemented by mapping the SR algorithm into an Altera Cyclone IV E FPGA based platform, including a NIOS II processor. The developed...

chapter

An on-chip ADC BIST solution and the BIST enabled calibration scheme

Xiankun Jin, Tao Chen, Mayank Jain, Arun Kumar Barman, more

2017 IEEE International Test Conference (ITC) > 1 - 10

2017 IEEE International Test Conference (ITC)

This paper presents a complete on-chip ADC BIST solution based on a segmented stimulus error identification algorithm known as USER-SMILE. By adapting the algorithm for efficient hardware realization, the solution is implemented towards a 1Msps 12-bit SAR ADC on a 28nm CMOS automotive microcontroller. While sufficient test accuracy is demonstrated, the solution is further extended to correct linearity...

chapter

Selected aspects and tradeoffs in transistor level implementation of genetic algorithms

Slawomir Jezewski, Rafal Dlugosz

2017 IEEE 30th International Conference on Microelectronics (MIEL) > 235 - 238

2017 IEEE 30th International Conference on Microelectronics (MIEL)

In this paper we focus on the issues of implementing genetic algorithms (GA) in hardware. In their classic implementation, genetic algorithms search for a global minimum or maximum of a multidimensional function called the fitness function. If the problem, i.e. the fitness function, is too complex for a brute force search, we can look for a solution based on GA. In this situation...

chapter

Efficient fast convolution architecture based on stochastic computing

Runing Xu, Bo Yuan, Xiaohu You, Chuan Zhang

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 6

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP)

Advances in convolutional neural networks (CNNs) have aroused great interest all over the world. Despite the fact that the amount of convolutions in CNNs is proportional to that of layers, people tend to pursue more remarkable performance by exploiting a deep convolution neural network (DCNN), leading to large area occupation. With the deeper process involved in large-scale integrated circuits, circuit...

chapter

Belief propagation detection based on max-sum algorithm for massive MIMO systems

Yaping Zhang, Lulu Ge, Xiaohu You, Chuan Zhang

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 6

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP)

In this paper, belief propagation (BP) detection based on the max-sum (MS) algorithm for massive multiple-input multiple-output (MIMO) systems is proposed to reduce the computational complexity of general belief propagation. Because MS employs an approximation strategy, its complexity reduction comes at the expense of detection performance loss. Based on MS algorithm, two effective approaches are...

chapter

Implementation of the LZMA compression algorithm on FPGA

Xia Zhao, Bing Li

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC)

Data compression technology is the necessary technology in the age of big data. Compared with software compression techniques, hardware compression techniques can improve speed and reduce power consumption. LZMA is a lossless compression technology, and its hardware implementation has broad application prospects. This paper proposes a novel high-performance implementation of the LZMA compression algorithm...

chapter

A 0.3mm² 280MHz GF(3^m) η_T pairing accelerator for lightweight system

Xusheng Wang, Xiangyu Li

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC)

In this paper, a low-cost accelerator for the η_T pairing in characteristic three over super-singular elliptic curves is designed. As the critical operations of the η_T pairing, the cubing and sparse multiplications over GF(3^6m) in Miller's algorithm are merged, and their arithmetic is modified and scheduled to reduce the overhead related to intermediate data. With these optimizations, the Miller's...

chapter

A trick for parallel accumulation of signed array

Jinnan Ding, Shuguo Li

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC)

Large number addition is the fundamental operation in cryptography algorithms. In this paper, we accelerate large addition in hardware design by introducing non-least-positive form, which is beneficial to parallel processing. An implementation of 256-bit signed array accumulator with our method shows an improvement of 18% in speed and 15% in area-delay product compared with traditional design.

chapter

High throughput design and implementation of SHA-3 hash algorithm

Xufan Wu, Shuguo Li

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC)

In this paper, we propose two different hardware structures for the SHA-3 hash algorithm, targeting different circuit interface widths. They both support the four functions SHA3-224/256/384/512 of the SHA-3 algorithm. The padding unit of our design is also implemented in hardware instead of software. Besides, a 3-round-in-1 structure is proposed to increase the throughput of our circuit. We conduct an implementation...

chapter

The solution of the information tasks using fuzzy methods

Olga A. Kozlova, Ludmila P. Kozlova

2017 IEEE II International Conference on Control in Technical Systems (CTS) > 367 - 369

2017 IEEE II International Conference on Control in Technical Systems (CTS)

Information systems have become a necessary component of the modern world. They encompass all areas of human activity. Each area imposes its own specifics, which requires solving a number of problems unique to each system. Fuzzy methods offer an optimal way to overcome the difficulties that arise in any information system.

chapter

Research and implementation of the algorithm for data de-identification for Internet of Things

Dmitriy I. Kaplun, Denis V. Gnezdilov, George A. Efimenko, Aleksandr M. Sinitca, more

2017 IEEE II International Conference on Control in Technical Systems (CTS) > 363 - 366

2017 IEEE II International Conference on Control in Technical Systems (CTS)

The article is devoted to the search for ways to protect the data privacy for Internet of things. Different technologies and algorithms of privacy protection are considered, their computing and energy intensity is investigated. Prototyping is performed on the FPGA of the Internet of things. Searching for ways to minimize energy consumption and computing costs of the Internet of things with algorithms...

chapter

Cybersecurity in the mathematically uniform algorithmic space of the distributed computing

Yu. S. Zatuliveter, E. A. Fishchenko

2017 Tenth International Conference Management of Large-Scale System Development (MLSD) > 1 - 4

2017 Tenth International Conference "Management of Large-Scale System Development" (MLSD)

The general system properties of distributed computer systems realized in the global computing environment are analyzed. The reasons for the reproduction of heterogeneity in it and its vulnerability to unauthorized exposure of executable programs are revealed. The principles of the formation in it of the universally programmable and cybersecurity algorithmic space for distributed computing by network...

chapter

A low area ASIC implementation of 272 bit multiplier

Ruirui Liu, Shuguo Li

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC)

The Toom-Cook algorithm is a well-known method for large integer multiplication. In this paper, we propose an implementation of a 272 bit multiplier based on the Toom-Cook algorithm and finish the hardware implementation. Synthesizing with Synopsys Design Compiler in the SMIC 65nm CMOS process, the result shows that the design based on Toom-Cook can achieve at least 22.9% less on area and 43.4% less on...
