Search results

chapter

Jenga: Software-defined cache hierarchies

Po-An Tsai, Nathan Beckmann, Daniel Sanchez

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 652 - 665

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA)

Caches are traditionally organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, since working sets settle at the smallest (i.e., fastest and most energy-efficient) level they fit in. However, rigid hierarchies also add overheads, because each level adds latency and energy even...

chapter

An on-chip ADC BIST solution and the BIST enabled calibration scheme

Xiankun Jin, Tao Chen, Mayank Jain, Arun Kumar Barman, more

2017 IEEE International Test Conference (ITC) > 1 - 10

2017 IEEE International Test Conference (ITC)

This paper presents a complete on-chip ADC BIST solution based on a segmented stimulus error identification algorithm known as USER-SMILE. By adapting the algorithm for efficient hardware realization, the solution is implemented towards a 1Msps 12-bit SAR ADC on a 28nm CMOS automotive microcontroller. While sufficient test accuracy is demonstrated, the solution is further extended to correct linearity...

chapter

System-on-chip-based hardware acceleration for human detection in 2D/3D scenes

Amin Safaei, Q. M. Jonathan Wu, Thangarajah Akilan

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 1041 - 1045

2017 IEEE International Conference on Systems, Man and Cybernetics (SMC)

A system-on-chip field-programmable gate array (FPGA)-based video processing platform for human detection in complex scenes is presented. This study details the hardware-based implementation of a human detection algorithm in 2D/3D scenes, including the capture, video processing, and display stages. The proposed method is implemented by extending a previously proposed method that uses features extracted...

chapter

One size does not fit all: Implementation trade-offs for iterative stencil computations on FPGAs

Gael Deest, Tomofumi Yuki, Sanjay Rajopadhye, Steven Derrien

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 8

2017 27th International Conference on Field Programmable Logic and Applications (FPL)

Iterative stencils are kernels in various application domains, such as numerical simulations and medical imaging, that merit FPGA acceleration. The best architecture depends on many factors such as the target platform, off-chip memory bandwidth, problem size, and performance requirements. We generate a family of FPGA stencil accelerators targeting emerging System on Chip platforms (e.g., Xilinx Zynq...

chapter

Accelerating in-system FPGA debug of high-level synthesis circuits using incremental compilation techniques

Pavan Kumar Bussa, Jeffrey Goeders, Steven J. E. Wilton

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

2017 27th International Conference on Field Programmable Logic and Applications (FPL)

High-Level Synthesis has emerged as a promising technology for improving FPGA designer productivity, but will only be successful if it is accompanied by a debug ecosystem. Recent efforts have presented in-system debug techniques which allow a designer to debug an implementation, running on an FPGA, in the context of the original source code. These techniques typically store a history of all user variables...

chapter

OmniGraph: A Scalable Hardware Accelerator for Graph Processing

Chongchong Xu, Chao Wang, Lei Gong, Yuntao Lu, more

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 623 - 624

2017 IEEE International Conference on Cluster Computing (CLUSTER)

Large-scale graph processing attracts more and more attention, and it has been widely applied in many application domains. FPGA is a promising platform to implement graph processing algorithms with high power-efficiency and parallelism. In this paper, we propose OmniGraph, a scalable hardware accelerator for graph processing. OmniGraph can process graphs with different sizes adaptively and is adaptable...

chapter

Quantifying the Potential Benefits of On-chip Near-Data Computing in Manycore Processors

Jagadish B. Kotra, Diana Guttman, Nachiappan Chidamabaram N., Mahmut T. Kandemir, more

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) > 198 - 209

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)

Increasing data set sizes motivate a shift of focus from computation-centric systems to data-centric systems, where data movement is treated as a first-class optimization metric. An example of this emerging paradigm is in-situ computing in large-scale computing systems. Observing that data movement costs are increasing at an exponential rate even at a node level (as a node itself is fast-becoming...

chapter

SELF: A High Performance and Bandwidth Efficient Approach to Exploiting Die-Stacked DRAM as Part of Memory

Yuhua Guo, Qing Liu, Weijun Xiao, Ping Huang, more

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) > 187 - 197

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)

Die-stacked DRAM (a.k.a., on-chip DRAM) provides much higher bandwidth and lower latency than off-chip DRAM. It is a promising technology to break the "memory wall". Die-stacked DRAM can be used either as a cache (i.e., DRAM cache) or as a part of memory (PoM). A DRAM cache design would suffer from more page faults than a PoM design as the DRAM cache cannot contribute towards capacity of...

chapter

A lightweight X-masking scheme for IoT designs

Daniel Tille, Benedikt Gottinger, Ulrike Pfannkuchen

2017 International Test Conference in Asia (ITC-Asia) > 77 - 82

2017 International Test Conference in Asia (ITC-Asia)

The emerging Internet-of-Things (IoT) paradigm creates a new market for very small and cost-sensitive chips. Design costs must be as low as possible in order to be competitive. In this context, the 1-pin test has proven to be a beneficial way to significantly reduce test costs. However, the incorporated signature generation requires an X-free design, which is not always possible (e.g. due to timing...

chapter

OCEAN: An on-chip incremental-learning enhanced processor with gated recurrent neural network accelerators

Chixiao Chen, Hongwei Ding, Huwan Peng, Haozhe Zhu, more

ESSCIRC 2017 - 43rd IEEE European Solid State Circuits Conference > 259 - 262

ESSCIRC 2017 - 43rd IEEE European Solid State Circuits Conference (ESSCIRC)

A deep learning processor with 8 gated recurrent neural network (RNN) accelerators is proposed in this paper. It features on-chip incremental learning by numerical and local gradient computation enhancement. Extra precision of training is obtained without extending the bit-width. Tri-mode weight access (DMA/FIFO/RAM) improves the throughput during incremental learning. The number of multipliers and activation...

chapter

Design and implementation of an OpenRISC system-on-chip with an encryption peripheral

Latif Akcay, Mehmet Tukel, Berna Ors

2017 European Conference on Circuit Theory and Design (ECCTD) > 1 - 4

2017 European Conference on Circuit Theory and Design (ECCTD)

Open source hardware projects are becoming more and more common. OpenRISC SoC, one of the most prominent of these projects, has become quite popular with the support of volunteer developers. In this work, we have demonstrated the design of a DES (Data Encryption Standard) based system that can be used in security applications, on ORPSoC-v2 (OpenRISC Reference Platform System-on-Chip). Additionally, we...

chapter

Algorithm and hardware co-optimized solution for large SpMV problems

Fazle Sadi, Larry Pileggi, Franz Franchetti

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

2017 IEEE High Performance Extreme Computing Conference (HPEC)

Sparse Matrix-Vector multiplication (SpMV) is a fundamental kernel for many scientific and engineering applications. However, SpMV performance and efficiency are poor on commercial off-the-shelf (COTS) architectures, especially when the data size exceeds on-chip memory or last level cache (LLC). In this work we present an algorithm co-optimized hardware accelerator for large SpMV problems. We start...

chapter

Introduction to hardware-oriented security for MPSoCs

Ilia Polian, Francesco Regazzoni, Johanna Sepulveda

2017 30th IEEE International System-on-Chip Conference (SOCC) > 102 - 107

2017 30th IEEE International System-on-Chip Conference (SOCC)

Novel applications demand computational resources that are provided by multiprocessor systems-on-chip (MPSoCs). At the same time, they increasingly process sensitive data and incorporate security-relevant functions like encryption or authentication. This paper discusses the implications of the MPSoC technology on security. It provides an overview of hardware-oriented techniques to enhance security...

chapter

On the security evaluation of the ARM TrustZone extension in a heterogeneous SoC

El Mehdi Benhani, Cedric Marchand, Alain Aubert, Lilian Bossuet

2017 30th IEEE International System-on-Chip Conference (SOCC) > 108 - 113

2017 30th IEEE International System-on-Chip Conference (SOCC)

As the complexity of Systems-on-Chip (SoCs) and the reuse of third-party IP continue to grow, the security of a heterogeneous SoC has become a critical issue. To increase the software security of such SoCs, ARM has proposed the TrustZone technology. Nevertheless, many SoCs embed non-trusted third-party Intellectual Property (IP) trying to take the benefits...

chapter

Packet Classification with Limited Memory Resources

Michal Kekely, Jan Korenek

2017 Euromicro Conference on Digital System Design (DSD) > 179 - 183

2017 Euromicro Conference on Digital System Design (DSD)

Network security and monitoring devices use packet classification to match packet header fields in a set of rules. Many hardware architectures have been designed to accelerate packet classification and achieve wire-speed throughput for 100 Gbps networks. The architectures are designed for high throughput even for the shortest packets. However, FPGA SoC and Intel Xeon with FPGA have limited resources...

chapter

Towards a Mobile Health Platform with Parallel Processing and Multi-sensor Capabilities

Florian Glaser, Philipp Schonle, Pascale Meier, Jonathan Bosser, more

2017 Euromicro Conference on Digital System Design (DSD) > 462 - 469

2017 Euromicro Conference on Digital System Design (DSD)

We present ongoing work on a platform for mobile health and implantable telemetry devices with powerful point-of-contact processing capabilities based on our VivoSoC multi-sensor medical instrumentation SoC, a custom power management IC, and only a few additional components - allowing the realisation of sub-ccm devices. We detail the powerful yet efficient acquisition and parallel processing capabilities...

chapter

Using On-chip cryptographic units for security in wireless sensor networks

Janhavi Kulkarni, Karan Nair, Aditya Pappu, Sarthak Gadre, more

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 1252 - 1255

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)

This paper explores the use of on-chip cryptographic units for implementing security in low-cost wireless sensor networks (WSNs). The objective of this research is to reduce the deployment time and computational complexity of security protocols in WSNs, whilst keeping security-related performance parameters on par with the current state of the art. A method is proposed to continue using simple radio transceiver...

chapter

A Scalable Parameterized NoC Emulator Built Upon Xilinx Virtex-7 FPGA

Ming Zhu, Yingtao Jiang, Mei Yang, Louie De Luna

2017 25th International Conference on Systems Engineering (ICSEng) > 287 - 290

2017 25th International Conference on Systems Engineering (ICSEng)

A number of critical design decisions, such as network topology, buffer sizes, flow control mechanism, and so forth, have to be evaluated in any NoC design. Designs and verifications of NoCs are based on either software simulations, which are extremely slow and inaccurate for complex models, or hardware emulations using low/mid-class FPGAs, where the scalability of the NoC system is intensively...

chapter

Optimizations of Two Compute-Bound Scientific Kernels on the SW26010 Many-Core Processor

James Lin, Zhigeng Xu, Akira Nukada, Naoya Maruyama, more

2017 46th International Conference on Parallel Processing (ICPP) > 432 - 441

2017 46th International Conference on Parallel Processing (ICPP)

The home-grown SW26010 many-core processor enabled the production of China’s first independently developed number-one ranked supercomputer – the Sunway TaihuLight. The design of the limited off-chip memory bandwidth, however, renders the SW26010 a highly memory-bound processor. To compensate for this limitation, the processor was designed with a unique hardware feature, "Register Level Communication"...

chapter

In-Place Irregular Computation for Message-Passing Chip-Multiprocessors

Zhang Youhui, Zhang Youyang, Li Yanhua, Fei Xiang, more

2017 46th International Conference on Parallel Processing Workshops (ICPPW) > 69 - 76

2017 46th International Conference on Parallel Processing Workshops (ICPPW)

With the increase of CMP (Chip-Multiprocessor) scale, moving data to computation on chip becomes more expensive. Accordingly, moving computation to data has potential to improve efficiency. We propose an in-place computation co-design of many-simple-core CMP for irregular applications. The computing paradigm is that an application's critical irregular data (or part of them) is partitioned into on-chip...

INFONA - science communication portal

