Current-generation GPUs can accelerate high-performance, compute-intensive applications by exploiting massive thread-level parallelism. The high performance, however, comes at the cost of increased power consumption. Recently, commercial GPGPU architectures have introduced support for concurrent kernel execution to better utilize the computational/memory resources and thereby improve overall throughput...
Effective parallel programming for GPUs requires careful attention to several factors, including ensuring coalesced access of data from global memory. There is a need for tools that can provide feedback to users about statements in a GPU kernel where non-coalesced data access occurs, and assistance in fixing the problem. In this paper, we address both these needs. We develop a two-stage framework...
Although graphics processing units (GPUs) rely on thread-level parallelism to hide long off-chip memory access latency, judicious utilization of on-chip memory resources, including register files, shared memory, and data caches, is critical to application performance. However, explicitly managing GPU on-chip memory resources is a non-trivial task for application developers. More importantly, as on-chip...
We investigate parallelizing flow- and context-sensitive static analysis for JavaScript. Previous attempts to parallelize such analyses for other languages typically start with the traditional framework of sequential dataflow analysis, and then propose methods to parallelize the existing sequential algorithms within this framework. However, we show that this approach is non-optimal and propose a new...
This paper presents MemorySanitizer, a dynamic tool that detects uses of uninitialized memory in C and C++. The tool is based on compile-time instrumentation and relies on bit-precise shadow memory at run time. A shadow-propagation technique is used to avoid false-positive reports when uninitialized memory is merely copied. MemorySanitizer finds bugs at a modest cost of 2.5× in execution time and 2× in memory...
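The shadow-propagation idea can be sketched with a toy model (an assumption for illustration, not MemorySanitizer's actual implementation): each data byte has a shadow byte marking whether it is initialized, copies propagate shadow alongside data without reporting, and only a *use* of poisoned data (such as a branch on it) triggers a report.

```c
#include <string.h>

/* Toy shadow-memory model: shadow[i] == 1 means mem[i] is uninitialized. */
#define MEM_SIZE 64
static unsigned char mem[MEM_SIZE];
static unsigned char shadow[MEM_SIZE];

void poison(int off, int len) { memset(shadow + off, 1, len); }

void store_init(int off, unsigned char v) { mem[off] = v; shadow[off] = 0; }

/* Copying propagates both data and shadow; it never reports, which is
 * how false positives on copies of uninitialized memory are avoided. */
void shadow_memcpy(int dst, int src, int len) {
    memmove(mem + dst, mem + src, len);
    memmove(shadow + dst, shadow + src, len);
}

/* A "use" (e.g. a conditional branch) checks shadow: any poisoned byte
 * would produce a use-of-uninitialized-memory report. */
int use_reports(int off, int len) {
    for (int i = 0; i < len; i++)
        if (shadow[off + i]) return 1;
    return 0;
}
```

In this model, copying a poisoned region is silent, but branching on the copy still reports, because the poison traveled with the data.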
Locks have been widely used as an effective synchronization mechanism among processes and threads. However, we observe that a large number of false inter-thread dependencies (i.e., unnecessary lock contentions) exist during program execution on multicore processors, incurring significant performance overhead. This paper presents a performance debugging framework, PerfPlay, to facilitate...
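A false inter-thread dependency of the kind described can be sketched as follows (a hypothetical example, not from the paper): two threads that update disjoint counters but share one coarse lock contend unnecessarily; giving each counter its own lock removes the dependency while preserving the result.

```c
#include <pthread.h>

/* Two threads update disjoint counters. Guarding both with a single
 * coarse lock would serialize them for no reason (a false inter-thread
 * dependency); per-counter locks make the contention vanish. */
static long counters[2];
static pthread_mutex_t fine[2] = { PTHREAD_MUTEX_INITIALIZER,
                                   PTHREAD_MUTEX_INITIALIZER };

static void *worker(void *arg) {
    int id = *(int *)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&fine[id]);  /* never contended: each thread */
        counters[id]++;                 /* owns its own counter and lock */
        pthread_mutex_unlock(&fine[id]);
    }
    return 0;
}

long run_disjoint_updates(void) {
    pthread_t t[2];
    int ids[2] = { 0, 1 };
    counters[0] = counters[1] = 0;
    for (int i = 0; i < 2; i++) pthread_create(&t[i], 0, worker, &ids[i]);
    for (int i = 0; i < 2; i++) pthread_join(t[i], 0);
    return counters[0] + counters[1];
}
```

A framework like PerfPlay aims to identify lock acquisitions of the coarse-lock flavor, where the protected accesses never actually conflict.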
Dynamic binary translation serves as a core technology that enables a wide range of important tools such as profiling, bug detection, program analysis, and security. Many of the target applications often include large amounts of dynamically generated code, which poses a special performance challenge in maintaining consistency between the source application and the translated application. This paper...
Computer security has become a central focus in the information age. Though enormous effort has been expended on ensuring secure computation, software exploitation remains a serious threat. The software attack surface provides many avenues for hijacking; however, most exploits ultimately rely on the successful execution of a control-flow attack. This pervasive diversion of control flow is made possible...
To fully exploit the power of emerging multicore architectures, managing shared resources (i.e., caches) across applications and over time is critical. However, to our knowledge, most prior efforts view this problem from the OS/hardware side, and do not consider whether applications themselves can also participate in this process of managing shared resources. In this paper, we show how an application...
Interpreters have been used in many contexts. They provide portability and ease of development at the expense of performance. The literature of the past decade covers analysis of why interpreters are slow, and many software techniques to improve them. A large proportion of these works focuses on the dispatch loop, and in particular on the implementation of the switch statement: typically an indirect...
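The dispatch loop that this line of work focuses on can be sketched with a toy stack machine (opcodes invented for illustration): each iteration reads the next opcode and selects its handler via a switch statement, which typically compiles to a single, hard-to-predict indirect branch.

```c
/* Minimal switch-based dispatch loop for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

int interpret(const int *code) {
    int stack[64], sp = 0;
    int pc = 0;
    for (;;) {
        switch (code[pc++]) {           /* dispatch: one indirect jump */
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

Because every bytecode handler returns to this one branch, the hardware branch predictor sees a near-random target stream, which is why so much of the literature targets dispatch (e.g. via threaded code that replicates the jump at the end of each handler).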
Deeply embedded systems often have the tightest constraints on energy consumption, requiring that they consume tiny amounts of current and run on batteries for years. However, they typically execute code directly from flash, instead of the more energy-efficient RAM. We implement a novel compiler optimization that exploits the relative efficiency of RAM by statically moving carefully selected basic...
Approximate computing is a very promising design paradigm for crossing the CPU power wall, primarily driven by the potential to sacrifice output quality for significant gains in performance, energy, and fault tolerance. Unfortunately, existing solutions have focused primarily either on new programming models or on new hardware designs, leaving significant room between these two ends for software-based...