2011 IEEE 9th Symposium on Application Specific Processors (SASP)

Items from 1 to 20 out of 23 results

Title pages

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > i - vii

How sensitive is processor customization to the workload's input datasets?

Maximilien Breughe, Zheng Li, Yang Chen, Stijn Eyerman, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 1 - 7

Hardware customization is an effective approach for meeting application performance requirements while achieving high levels of energy efficiency. Application-specific processors achieve high performance at low energy by tailoring their designs towards a specific workload, i.e., an application or application domain of interest. A fundamental question that has remained unanswered so far though is to...

TARCAD: A template architecture for reconfigurable accelerator designs

Muhammad Shafiq, Miquel Pericas, Nacho Navarro, Eduard Ayguade

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 8 - 15

In the race towards computational efficiency, accelerators are achieving prominence. Among the different types, accelerators built using reconfigurable fabric, such as FPGAs, have a tremendous potential due to the ability to customize the hardware to the application. However, the lack of a standard design methodology hinders the adoption of such devices and makes the portability and reusability across...

Customized MPSoC synthesis for task sequence

Liang Chen, Nicolas Boichat, Tulika Mitra

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 16 - 21

Multiprocessor System-on-Chip (MPSoC) platforms have become increasingly popular for high-performance embedded applications. Each processing element (PE) on such platforms can be tuned to match the computational demands of the tasks executing on it, creating a heterogeneous multiprocessor system. Extensible processor cores, where the base instruction-set architecture can be augmented with application-specific...

Integrating formal verification and high-level processor pipeline synthesis

Eriko Nurvitadhi, James C. Hoe, Timothy Kam, Shih-Lien L. Lu

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 22 - 29

When a processor implementation is synthesized from a specification using an automatic framework, the implementation should still be verified against its specification to ensure that the framework introduced no errors. This paper presents our effort in integrating fully automated formal verification with a high-level processor pipeline synthesis framework. As an integral part of the pipeline...

USHA: Unified software and hardware architecture for video decoding

Adarsha Rao, S. K. Nandy, Hristo Nikolov, Ed F. Deprettere

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 30 - 37

Video decoders used in emerging applications need to be flexible enough to handle a large variety of video formats and to deliver scalable performance across wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture...

Modular high-throughput and low-latency sorting units for FPGAs in the Large Hadron Collider

Amin Farmahini-Farahani, Anthony Gregerson, Michael Schulte, Katherine Compton

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 38 - 45

This paper presents efficient techniques for designing high-throughput, low-latency sorting units for FPGA implementation. Our sorting units use modular design techniques that hierarchically construct large sorting units from smaller building blocks. They are optimized for situations in which only the M largest numbers from N inputs are needed; this situation commonly occurs in high-energy physics...

Memory-efficient volume ray tracing on GPU for radiotherapy

Bo Zhou, X. Sharon Hu, Danny Z. Chen

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 46 - 51

Ray tracing within a uniform grid volume is a fundamental process invoked frequently by many radiation dose calculation methods in radiotherapy. Recent advances in graphics processing units (GPUs) have made real-time dose calculation a reachable goal. However, the performance of known GPU methods for volume ray tracing is bounded by memory throughput, which leads to inefficient usage...

System integration of Elliptic Curve Cryptography on an OMAP platform

Sergey Morozov, Christian Tergino, Patrick Schaumont

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 52 - 57

Elliptic Curve Cryptography (ECC) is popular for digital signatures and other public-key crypto-applications in embedded contexts. However, ECC is computationally intensive, and in particular the performance of the underlying modular arithmetic remains a concern. We investigate the design space of ECC on TI's OMAP 3530 platform, with a focus on using OMAP's DSP core to accelerate ECC computations...

ISIS: An accelerator for Sphinx speech recognition

Anthony Chun, Jenny X. Chang, Zhen Fang, Ravishankar Iyer, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 58 - 61

The ability to interact naturally with devices is becoming increasingly important. Speech recognition is one well-known solution for easy, hands-free user-device interaction. However, speech recognition has significant computation and memory bandwidth requirements, making it challenging to deliver high-performance, real-time recognition at ultra-low power on handheld devices. In this paper, we present...

Dynamically reconfigurable architecture for a driver assistant system

Naim Harb, Smail Niar, Mazen A. R. Saghir, Yassin El Hillali, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 62 - 65

Application-specific programmable processors are increasingly being replaced by FPGAs, which offer high levels of logic density, rich sets of embedded hardware blocks, and a high degree of customizability and reconfigurability. New FPGA features such as Dynamic Partial Reconfiguration (DPR) can be leveraged to reduce resource utilization and power consumption while still providing high levels of performance...

FPGA based parallel architecture implementation of Stacked Error Diffusion algorithm

Rishvanth Kora Venugopal, J. Robert Heath, Daniel L. Lau

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 66 - 69

Digital halftoning is a crucial technique used in digital printers to convert a continuous-tone image into a pattern of black and white dots. Halftoning is needed because printers have a limited set of inks and cannot reproduce all the color intensities in a continuous-tone image. Error diffusion is a halftoning algorithm that iteratively quantizes pixels in a neighborhood-dependent fashion. This...

3D recursive Gaussian IIR on GPU and FPGAs — A case for accelerating bandwidth-bounded applications

Jason Cong, Muhuan Huang, Yi Zou

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 70 - 73

GPU devices typically have higher off-chip bandwidth than FPGA-based systems, so GPUs should typically perform better for bandwidth-bounded, massively parallel applications. In this paper, we present our implementations of a 3D recursive Gaussian IIR on multi-core CPU, many-core GPU and multi-FPGA platforms. Our baseline implementation on the CPU features the smallest arithmetic computation (2 MADDs...

A fast CUDA implementation of agrep algorithm for approximate nucleotide sequence matching

Hongjian Li, Bing Ni, Man-Hon Wong, Kwong-Sak Leung

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 74 - 77

The availability of huge amounts of nucleotide sequences catalyzes the development of fast algorithms for approximate DNA and RNA string matching. However, most existing online algorithms can only handle small scale problems. When querying large genomes, their performance becomes unacceptable. Offline algorithms such as Bowtie and BWA require building indexes, and their memory requirement is high...

Frameworks for GPU Accelerators: A comprehensive evaluation using 2D/3D image registration

Richard Membarth, Frank Hannig, Jurgen Teich, Mario Korner, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 78 - 81

In the last decade, there has been a dramatic growth in research and development of massively parallel many-core architectures like graphics hardware, both in academia and industry. This has also changed the way programs are written in order to leverage the processing power of a multitude of cores on the same hardware. In the beginning, programmers had to use special graphics programming interfaces to...

A massively parallel implementation of QC-LDPC decoder on GPU

Guohui Wang, Michael Wu, Yang Sun, Joseph R. Cavallaro

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 82 - 85

The graphics processing unit (GPU) can provide a low-cost and flexible software-based multi-core architecture for high-performance computing. However, it is still very challenging to efficiently map real-world applications to the GPU and fully utilize its computational power. As a case study, we present a GPU-based implementation of a real-world digital signal processing (DSP) application:...

ARTE: An Application-specific Run-Time management framework for multi-core systems

Giovanni Mariani, Gianluca Palermo, Cristina Silvano, Vittorio Zaccaria

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 86 - 93

Programmable multi-core and many-core platforms exponentially increase the challenge of task mapping and scheduling, provided that enough task-parallelism exists for each application. This problem worsens when dealing with small ecosystems such as embedded systems-on-chip. In fact, in this case, the assumption of exploiting a traditional operating system is out of context given the memory available...

A hardware acceleration technique for gradient descent and conjugate gradient

David Kesler, Biplab Deka, Rakesh Kumar

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 94 - 101

Application Robustification, a promising approach for reducing processor power, converts applications into numerical optimization problems and solves them using gradient descent and conjugate gradient algorithms [1]. The improvement in robustness, however, comes at the expense of performance when compared to the baseline non-iterative versions of these applications. To mitigate the performance loss...

A multi-threaded coarse-grained array processor for wireless baseband

Tom Vander Aa, Martin Palkovic, Matthias Hartmann, Praveen Raghavan, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 102 - 107

The throughput of wireless communication standards is ever increasing, and the computation requirements of systems implementing those standards increase even more. On battery-operated devices, a low-power implementation is as crucial as high performance. Achieving both is only possible by exploiting parallelism at all levels. The ADRES processor is an embedded coarse-grained reconfigurable baseband processor...

Hardware/software co-designed accelerator for vector graphics applications

Shuo-Hung Chen, Hsiao-Mei Lin, Hsin-Wen Wei, Yi-Cheng Chen, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 108 - 114

This paper proposes a new hardware accelerator to speed up the performance of vector graphics applications on complex embedded systems. The resulting hardware accelerator is synthesized on a field-programmable gate array (FPGA) and integrated with software components. The paper also introduces a hardware/software co-verification environment which provides in-system at-speed functional verification...

