Variable Pipeline Cool Mega Array (VPCMA) is a low-power Coarse Grained Reconfigurable Architecture (CGRA) based on the concept of CMA (Cool Mega Array). It implements a pipeline structure that can be configured depending on performance requirements, and uses silicon on thin buried oxide (SOTB) technology, which allows its body bias voltage to be controlled to balance performance and leakage power. In this...
With the growing importance of deep learning and energy-saving approximate computing, half precision floating point arithmetic (FP16) is fast gaining popularity. Nvidia's recent Pascal architecture was the first GPU that offered FP16 support. However, when actual products were shipped, programmers soon realized that a naïve replacement of single precision (FP32) code with half precision led to disappointing...
Due to the increasing demand for low-power computing and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy-efficient programmable accelerators. This paper proposes an Integrated Programmable-Array accelerator (IPA) architecture based on an innovative execution model, targeted to accelerate both data and control-flow parts of deeply embedded...
The programmable controller is a prominent technology used to automate industrial process control. Ladder diagrams are specialized schematics used predominantly in industry. The tasks performed by ladder diagram programming for programmable controllers include Boolean logic, timing, counting, shifting, arithmetic and comparison operations. This paper proposes arithmetic and comparison operations...
This paper presents a static-placement, dynamic-issue (SPDI) framework for the coarse-grained reconfigurable architecture (CGRA) in order to tackle the inefficiencies of the static-issue, static-placement (SISP) CGRA. This framework includes a compiler that statically places the operations and a hardware design, an SPDI CGRA, that automatically schedules the operations. We emphasize introducing the...
Cool mega-array (CMA) is a kind of coarse grained reconfigurable architecture (CGRA) that has demonstrated ultra-low-power computation. However, as CMA completely eliminates clock trees and registers, the performance improvement has been limited. In this paper, we introduce a variable pipeline structure to CMA with the minimum essential registers to provide a wider trade-off between performance...
The territorial valorization of food products is closely tied to quality attributes and currently forms the basis of typification processes. In the construction of a territorial identity based on honey from the Apis mellifera bee, studies integrating melissopalynological and sensory analysis with physicochemical parameters carry significant weight in defining the botanical origin and allow...
Real-valued arithmetic has a fundamental impact on the performance and accuracy of scientific computation. As scientific application developers prepare their applications for exascale computing, many are investigating the possibility of using either lower precision (for better performance) or higher precision (for more accuracy). However, exploring alternative representations often requires significant...
Many linear algebra libraries, such as the Intel MKL, Magma or Eigen, provide fast Cholesky factorization. These libraries are suited for big matrices but perform slowly on small ones. Even though state-of-the-art studies have begun to take an interest in small matrices, they usually feature a few hundred rows. Fields like Computer Vision or High Energy Physics use tiny matrices. In this paper we show...
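For readers unfamiliar with the operation named in this abstract, the following is a minimal sketch of textbook Cholesky factorization on a tiny matrix. It is not the method of the paper, whose details are elided in the listing; the function name `cholesky` and the list-of-lists matrix representation are assumptions made for illustration only.

```python
import math


def cholesky(a):
    """Return the lower-triangular L with L * L^T == a.

    Textbook (Cholesky-Banachiewicz) algorithm, illustrative only.
    `a` is assumed to be a symmetric positive-definite matrix given
    as a list of rows; no pivoting or error checking is done.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already computed parts of rows i and j.
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)   # diagonal entry
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]  # below-diagonal entry
    return l


# A 3x3 example of the "tiny matrix" size the abstract mentions.
A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
L = cholesky(A)
```

For a matrix this small, the loop and call overhead is comparable to the arithmetic itself, which is one plausible reason general-purpose library routines tuned for large matrices can be slow on tiny ones.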
Due to the concerns of two-dimensional layout and structural modularity, interprocessor data transfers for VLSI arrays, such as systolic/wavefront processors, are normally achieved by way of neighborhood communication. Although interconnection networks are designed to enhance global communication for non-systolic types of processing, it is not feasible to incorporate the processors and global interconnections...
The computational speed of a conventional von Neumann computer architecture is limited primarily by the data path bottleneck between the memory and the central processing unit (CPU). One solution to this problem is to eliminate the separation between the CPU and memory by moving the processor functions directly into the memory. The Database Accelerator (DBA) chip is an attempt to merge these two functions...
This paper explores the body bias domain granularity of an energy-efficient coarse grained reconfigurable array called CMA (Cool Mega Array). Using a genetic-algorithm-based body bias assignment method, the leakage reduction for various grain sizes was evaluated. As a result, a domain with 2×1 PEs achieved about 40% power reduction with a 6% area overhead.
We live in a world surrounded by electronic gadgets, with the expectation of ever more compact size and higher performance. NAND flash memory is one of the key components in meeting those expectations. NAND flash provides higher density and faster operation; however, its array structure is somewhat complex, which demands efficient control. In this paper we present a generic NAND flash controller...
Radiation-induced soft errors are a major reliability concern in circuits fabricated at advanced technology nodes. Online soft-error vulnerability estimation offers the flexibility of exploiting dynamic fault-tolerant mechanisms for cost-effective reliability enhancement. We propose a generic run-time method with low area and power overhead to predict the soft-error vulnerability of on-chip memory...
Normally-off computing (NoC) systems have constantly-off and instantly-on characteristics, leading to considerably lower idle power consumption than other low-power systems. This paper proposes a software procedure and two system hardware design optimization methods, namely a programmable restore entry decision for increasing system recovery correctness and nonvolatile (NV) storage reduction with...
Finite field multipliers with high throughput and low latency have attracted considerable attention in recent cryptographic systems and coding theory, but such multipliers over the Galois field GF(2^m) for National Institute of Standards and Technology (NIST) pentanomials are scarce. We introduce two pairs of low latency and high throughput bit-parallel and digit-serial systolic multipliers...
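As background for the operation this abstract targets, here is a minimal software sketch of multiplication in GF(2^m) with reduction by an irreducible pentanomial. It is a plain bit-serial shift-and-add routine, not the systolic hardware design the paper proposes; the function name `gf2m_mul` and the integer encoding (bit i holds the coefficient of x^i) are assumptions made for illustration.

```python
def gf2m_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m) (illustrative bit-serial version).

    `poly` is the irreducible polynomial including its x^m term;
    `a` and `b` are assumed to be already reduced (less than 2**m).
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a        # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1                # multiply a by x
        if (a >> m) & 1:
            a ^= poly          # reduce modulo the field polynomial
    return result


# Small check with the AES field polynomial x^8 + x^4 + x^3 + x + 1,
# which is a pentanomial (but not one of the NIST curve fields).
AES_POLY = 0x11B
product = gf2m_mul(0x57, 0x83, 8, AES_POLY)  # FIPS-197 example: 0xC1

# NIST pentanomial for the 163-bit binary field:
# x^163 + x^7 + x^6 + x^3 + 1
NIST_163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1
```

The loop performs m dependent iterations, one per coefficient of b, so latency grows linearly with the field size. Bit-parallel and digit-serial designs of the kind the abstract mentions process all or several coefficients per step to shorten that chain.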
Coarse-Grained Reconfigurable Architectures (CGRA) promise both low power and high performance coupled with flexibility; however, automatic mapping of applications to such platforms remains a great research challenge. Efficient manual mapping of the data-centric kernels of applications yields great results; these kernels, however, internally contain control-flow-specific tasks, which introduce mapping irregularities...
Cool mega array-SOTB-2 (CMA-SOTB-2) is an ultra-low energy coarse grained reconfigurable architecture (CGRA) for advanced sensor networks, the Internet of Things, and wearable computing. It uses a large processing element (PE) array with combinatorial circuits and a micro-controller for data transfer between data memory and the PE array. To improve the energy efficiency of the previous prototype,...
Exchanging data on noncontiguous user buffers has been a dominant communication pattern in many scientific applications. The OpenSHMEM specification introduces a new set of communication routines to support strided data communication. Most high performance implementations of the OpenSHMEM specification support strided data communication using either a packing/unpacking scheme or a scheme based on multiple reads/writes,...