Search results

Items from 1 to 20 out of 399 results

chapter

Body bias optimization for variable pipelined CGRA

Takuya Kojima, Naoki Ando, Hayate Okuhara, Ng. Anh Vu Doan, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Variable Pipeline Cool Mega Array (VPCMA) is a low-power Coarse Grained Reconfigurable Architecture (CGRA) based on the concept of CMA (Cool Mega Array). It implements a pipeline structure that can be configured depending on performance requirements, and uses silicon on thin buried oxide (SOTB) technology, which allows its body bias voltage to be controlled to balance performance and leakage power. In this...

chapter

Exploiting half precision arithmetic in Nvidia GPUs

Nhut-Minh Ho, Weng-Fai Wong

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

With the growing importance of deep learning and energy-saving approximate computing, half precision floating point arithmetic (FP16) is fast gaining popularity. Nvidia's recent Pascal architecture was the first GPU that offered FP16 support. However, when actual products were shipped, programmers soon realized that a naïve replacement of single precision (FP32) code with half precision led to disappointing...

chapter

A 142MOPS/mW integrated programmable array accelerator for smart visual processing

Satyajit Das, Davide Rossi, Kevin J. M. Martin, Philippe Coussy, more

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Due to increasing demand for low-power computing and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy-efficient programmable accelerators. This paper proposes an Integrated Programmable-Array accelerator (IPA) architecture based on an innovative execution model, targeted to accelerate both data and control-flow parts of deeply embedded...

chapter

Arithmetic and Comparison Operations for Ladder Diagram Based Programmable Controller

Shobha S., K.R. Rekha, K.R. Nataraj

2017 International Conference on Recent Advances in Electronics and Communication Technology (ICRAECT) > 315 - 317

The programmable controller is a prominent technology used for automation of industrial process control. Ladder diagrams are specialized schematics, predominantly used in industry. Tasks performed by ladder diagram programming for programmable controllers include Boolean logic, timing, counting, shifting, arithmetic and comparison operations. This paper proposes arithmetic and comparison operations...

chapter

A static-placement, dynamic-issue framework for CGRA loop accelerator

Zhongyuan Zhao, Weiguang Sheng, Weifeng He, ZhiGang Mao, more

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 > 1348 - 1353

This paper presents a static-placement, dynamic-issue (SPDI) framework for the coarse-grained reconfigurable architecture (CGRA) in order to tackle the inefficiencies of the static-issue, static-placement (SISP) CGRA. The framework includes a compiler that statically places the operations and a hardware design, an SPDI CGRA, that automatically schedules the operations. We stress on introducing the...

chapter

Redundancy elimination revisited

Keith Cooper, Jason Eckhardt, Ken Kennedy

2008 International Conference on Parallel Architectures and Compilation Techniques (PACT) > 12 - 21

This work proposes and evaluates improvements to previously known algorithms for redundancy elimination.

chapter

Variable pipeline structure for Coarse Grained Reconfigurable Array CMA

Naoki Ando, Koichiro Masuyama, Hayate Okuhara, Hideharu Amano

2016 International Conference on Field-Programmable Technology (FPT) > 217 - 220

Cool mega-array (CMA) is a coarse grained reconfigurable architecture (CGRA) that has demonstrated ultra-low-power computation. However, as CMA completely eliminates clock trees and registers, the performance improvement has been limited. In this paper, we introduce a variable pipeline structure to CMA with the minimum essential registers to provide a wider trade-off between performance...

chapter

Building honey-based territorial identity for the Formosa Monte through information exploitation using intelligent systems

G. Cayu, G. A. Aguero, G. P. Balbarrey, M. M. Cabrera, more

2016 IEEE Congreso Argentino de Ciencias de la Informática y Desarrollos de Investigación (CACIDI) > 1 - 4

The territorial valorization of food products is tightly related to quality attributes and is currently the basis of typification processes. In the construction of territorial identity based on honey from the Apis mellifera bee, studies integrating melissopalynological and sensory analysis, and physical-chemical parameters have a significant weight in the definition of the botanical origin and allow...

chapter

Floating-Point Shadow Value Analysis

Michael O. Lam, Barry L. Rountree

2016 5th Workshop on Extreme-Scale Programming Tools (ESPT) > 18 - 25

Real-valued arithmetic has a fundamental impact on the performance and accuracy of scientific computation. As scientific application developers prepare their applications for exascale computing, many are investigating the possibility of using either lower precision (for better performance) or higher precision (for more accuracy). However, exploring alternative representations often requires significant...

chapter

Batched Cholesky factorization for tiny matrices

Florian Lemaitre, Lionel Lacassagne

2016 Conference on Design and Architectures for Signal and Image Processing (DASIP) > 130 - 137

Many linear algebra libraries, such as the Intel MKL, Magma or Eigen, provide fast Cholesky factorization. These libraries are suited for big matrices but perform slowly on small ones. Even though state-of-the-art studies are beginning to take an interest in small matrices, they usually feature a few hundred rows. Fields like Computer Vision or High Energy Physics use tiny matrices. In this paper we show...

chapter

A high speed shuffle bus for VLSI arrays

Wen-Tai Lin, Jyh-Ping Hwang

1987 Symposium on VLSI Circuits > 41 - 42

Due to concerns about two-dimensional layout and structural modularity, interprocessor data transfers for VLSI arrays, such as systolic/wavefront processors, are normally achieved by way of neighborhood communication. Although interconnection networks are designed to enhance global communication for non-systolic types of processing, it is not feasible to incorporate the processors and global interconnections...

chapter

The MIT database accelerator: 2K-TRIT circuit design

Jon P. Wade, Peter J. Osler, Richard E. Zippel, Charles G. Sodini

1987 Symposium on VLSI Circuits > 39 - 40

The computational speed of a conventional von Neumann computer architecture is limited primarily by the data path bottleneck between the memory and the central processing unit (CPU). One solution to this problem is to eliminate the separation between the CPU and memory by moving the processor functions directly into the memory. The Database Accelerator (DBA) chip is an attempt to merge these two functions...

chapter

Body bias grain size exploration for a coarse grained reconfigurable accelerator

Yusuke Matsushita, Hayate Okuhara, Koichiro Masuyama, Yu Fujita, more

2016 26th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

This paper explores the body bias domain grain size of an energy-efficient coarse grained reconfigurable array called CMA (Cool Mega Array). Using a genetic-algorithm-based body bias assignment method, the leakage reduction for various grain sizes was evaluated. As a result, a domain with 2×1 PEs achieved about 40% power reduction with a 6% area overhead.

chapter

Design & verification of ONFI complient high performance NAND flash controller

Akash Kumar, Jitendrabhai Ardeshana, Santosh Jagtap

2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT) > 942 - 945

We live in a world surrounded by electronic gadgets that are expected to be ever more compact and to perform ever better. NAND flash memory is one of the key components in meeting those expectations. NAND flash provides higher density and faster operation; however, its array structure is somewhat complex, which demands efficient control. In this paper we present a generic NAND flash controller...

chapter

Online soft-error vulnerability estimation for memory arrays

Arunkumar Vijayan, Abhishek Koneru, Mojtaba Ebrahimi, Krishnendu Chakrabarty, more

2016 IEEE 34th VLSI Test Symposium (VTS) > 1 - 6

Radiation-induced soft errors are a major reliability concern in circuits fabricated at advanced technology nodes. Online soft-error vulnerability estimation offers the flexibility of exploiting dynamic fault-tolerant mechanisms for cost-effective reliability enhancement. We propose a generic run-time method with low area and power overhead to predict the soft-error vulnerability of on-chip memory...

chapter

An energy-efficient nonvolatile microprocessor considering software-hardware interaction for energy harvesting applications

Tsai-Kan Chien, Lih-Yih Chiou, Chang-Chia Lee, Yao-Chun Chuang, more

2016 International Symposium on VLSI Design, Automation and Test (VLSI-DAT) > 1 - 4

Normally-off computing (NoC) systems have constantly-off and instantly-on characteristics, leading to considerably lower idle power consumption than other low-power systems. This paper proposes a software procedure and two system hardware design optimization methods, namely a programmable restore entry decision for increasing system recovery correctness and nonvolatile (NV) storage reduction with...

chapter

Design and implementation of 16 bit systolic multiplier using modular shifting algorithm

S. Jayarajkumar, K. Sivanandam

2016 Second International Conference on Science Technology Engineering and Management (ICONSTEM) > 532 - 537

Finite field multipliers with high throughput and low latency have attracted considerable attention in recent cryptographic systems and coding theory, but such multipliers over the Galois field GF(2^m) for National Institute of Standards and Technology (NIST) pentanomials are not plentiful. We introduce two pairs of low-latency, high-throughput bit-parallel and digit-serial systolic multipliers...

chapter

Design and synthesis of reconfigurable control-flow structures for CGRA

Zoltan Endre Rakossy, Axel Acosta-Aponte, Tobias G. Noll, Gerd Ascheid, more

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig) > 1 - 8

Coarse-Grained Reconfigurable Architectures (CGRA) promise both low power and high performance coupled with flexibility; however, automatic mapping of applications to such platforms remains a great research challenge. Efficient manual mapping of the data-centric kernels of applications yields great results; however, these kernels internally contain control-flow-specific tasks, which introduce mapping irregularities...

chapter

A 297MOPS/0.4mW ultra low power coarse-grained reconfigurable accelerator CMA-SOTB-2

Koichiro Masuyama, Yu Fujita, Hayate Okuhara, Hideharu Amano

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig) > 1 - 6

Cool mega array-SOTB-2 (CMA-SOTB-2) is an ultra-low energy coarse grained reconfigurable architecture (CGRA) for advanced sensor networks, the Internet of Things, and wearable computing. It uses a large processing element (PE) array with combinatorial circuits and a micro-controller for data transfer between data memory and the PE array. To improve the energy efficiency of the previous prototype,...

chapter

High Performance OpenSHMEM Strided Communication Support with InfiniBand UMR

Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Jie Zhang, more

2015 IEEE 22nd International Conference on High Performance Computing (HiPC) > 244 - 253

Exchanging data on noncontiguous user buffers has been a dominant communication pattern in many scientific applications. The OpenSHMEM specification introduces a new set of communication routines to support strided data communication. Most high performance implementations of the OpenSHMEM specification support strided data communication by either packing/unpacking or multiple reads/writes based scheme,...

Keywords:
ARRAYS
REGISTERS
