2017 46th International Conference on Parallel Processing (ICPP)

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

Message from the Program Co-Chairs

2017 46th International Conference on Parallel Processing (ICPP) > xiii - xiv

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

Program Committee

2017 46th International Conference on Parallel Processing (ICPP) > xvii - xx

Provides a listing of current committee members and society officers.

chapter

Organizing Committee

2017 46th International Conference on Parallel Processing (ICPP) > xv - xvi

Provides a listing of current committee members and society officers.

chapter

Reviewers

2017 46th International Conference on Parallel Processing (ICPP) > xxi

The conference offers a note of thanks and lists its reviewers.

chapter

Preparing HPC Applications for the Exascale Era: A Decoupling Strategy

Ivy Bo Peng, Roberto Gioiosa, Gokcen Kestor, Erwin Laure, et al.

2017 46th International Conference on Parallel Processing (ICPP) > 1 - 10

Production-quality parallel applications are often a mixture of diverse operations, such as computation- and communication-intensive, regular and irregular, tightly coupled and loosely linked operations. In conventional construction of parallel applications, each process performs all the operations, which can be inefficient and seriously limit scalability, especially at large scale. We propose...

chapter

An Efficient, Distributed Stochastic Gradient Descent Algorithm for Deep-Learning Applications

Guojing Cong, Onkar Bhardwaj, Minwei Feng

2017 46th International Conference on Parallel Processing (ICPP) > 11 - 20

Parallel and distributed processing is employed to accelerate training for many deep-learning applications with large models and inputs. As it reduces synchronization and communication overhead by tolerating stale gradient updates, asynchronous stochastic gradient descent (ASGD), derived from stochastic gradient descent (SGD), is widely used. Recent theoretical analyses show ASGD converges with linear...
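As a toy illustration of what "tolerating stale gradient updates" means, the minimal sketch below applies each gradient to the current parameter even though the gradient was computed from a parameter value several steps old. It is not the algorithm proposed in the paper: the function name `stale_sgd`, the fixed `delay`, and the quadratic objective are assumptions made purely for illustration, and in real ASGD the staleness varies between workers.

```python
from collections import deque


def stale_sgd(grad, w0, lr, steps, delay):
    """Gradient descent where every applied gradient is `delay` steps stale.

    grad  -- function returning the gradient at a parameter value
    delay -- hypothetical fixed staleness; 0 gives ordinary gradient descent
    """
    w = w0
    # Sliding window of the last delay+1 parameter values; the leftmost
    # entry is the value the "slow worker" used to compute its gradient.
    history = deque([w0] * (delay + 1), maxlen=delay + 1)
    for _ in range(steps):
        stale_w = history[0]
        w = w - lr * grad(stale_w)  # update the current w with a stale gradient
        history.append(w)
    return w


def quadratic_grad(w):
    # Gradient of the assumed objective f(w) = 0.5 * (w - 3)^2.
    return w - 3.0


if __name__ == "__main__":
    print(stale_sgd(quadratic_grad, 0.0, 0.1, 200, delay=0))
    print(stale_sgd(quadratic_grad, 0.0, 0.1, 200, delay=2))
```

With a small step size, both runs approach the minimizer at 3; the stale run simply takes a slightly different path to get there.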

chapter

Large-Scale Parallelization of Smoothed Particle Hydrodynamics Method on Heterogeneous Cluster

Yingrui Wang, Leisheng Li, Rong Tian

2017 46th International Conference on Parallel Processing (ICPP) > 21 - 30

This paper implements a Smoothed Particle Hydrodynamics simulation code and distributes it on a heterogeneous cluster. The theoretical analysis shows that treating the GPU as an equivalent peer of the CPU, rather than as an assistant or a substitute, is the most efficient way to use a CPU+GPU compute node. However, this raises complex challenges of heterogeneous cooperation. Our strategies of hybrid-level...

chapter

Boosting the Efficiency of HPCG and Graph500 with Near-Data Processing

Erik Vermij, Leandro Fiorin, Christoph Hagleitner, Koen Bertels

2017 46th International Conference on Parallel Processing (ICPP) > 31 - 40

HPCG and Graph500 can be regarded as the two most relevant benchmarks for high-performance computing systems. Existing supercomputer designs, however, tend to focus on floating-point peak performance, a metric less relevant for these two benchmarks, leaving resources underutilized and yielding little performance improvement for these benchmarks over time. In this work, we analyze the implementation...

chapter

GCN: GPU-Based Cube CNN Framework for Hyperspectral Image Classification

Han Dong, Tao Li, Jiabing Leng, Lingyan Kong, et al.

2017 46th International Conference on Parallel Processing (ICPP) > 41 - 49

Hyperspectral image classification has proved significant in the remote sensing field. Traditional classification methods have met bottlenecks due to the lack of remote sensing background knowledge or high dimensionality. Deep-learning-based methods, such as deep convolutional neural networks (CNNs), can effectively extract high-level features from raw data. But the training of deep CNN is rather...

chapter

Nearly Balanced Work Partitioning for Heterogeneous Algorithms

Mallipeddi Hardhik, Dip Sankar Banerjee, Kiran Raj Ramamoorthy, Kishore Kothapalli, et al.

2017 46th International Conference on Parallel Processing (ICPP) > 50 - 59

The architectural trend towards heterogeneity has pushed heterogeneous computing to the fore of parallel computing research. Heterogeneous algorithms, often carefully handcrafted, have been designed for several important problems from parallel computing such as sorting, graph algorithms, matrix computations, and the like. A majority of these algorithms follow a work partitioning approach where the...

chapter

GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations

Adrian Castello, Sangmin Seo, Rafael Mayo, Pavan Balaji, et al.

2017 46th International Conference on Parallel Processing (ICPP) > 60 - 69

OpenMP is the de facto standard application programming interface (API) for on-node parallelism. The most popular OpenMP runtimes rely on POSIX threads (pthreads) implementations that offer excellent performance for coarse-grained parallelism and match the current hardware perfectly. However, a recent trend in runtimes/applications points in the direction of leveraging massive on-node parallelism...

chapter

Locality-Aware Dynamic Task Graph Scheduling

Jordyn Maglalang, Sriram Krishnamoorthy, Kunal Agrawal

2017 46th International Conference on Parallel Processing (ICPP) > 70 - 80

Dynamic task graph schedulers automatically balance work across processor cores by scheduling tasks among available threads while preserving dependences. In this paper, we design NABBITC, a provably efficient dynamic task graph scheduler that accounts for data locality on NUMA systems. NABBITC allows users to assign a color to each task representing the location (e.g., a processor core) that has the...

chapter

Practical Experience with Transactional Lock Elision

Tingzhe Zhou, Pante A Zardoshti, Michael Spear

2017 46th International Conference on Parallel Processing (ICPP) > 81 - 90

Transactional Memory (TM) promises both to provide a scalable mechanism for synchronization in concurrent programs, and to offer ease-of-use benefits to programmers. The most straightforward use of TM in real-world programs is in the form of Transactional Lock Elision (TLE). In TLE, critical sections are attempted as transactions, with a fall-back to a lock if conflicts manifest. Thus TLE expects...
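The try-then-fall-back control flow described in this abstract can be sketched as follows. This is only an illustration of the pattern, not the authors' implementation: Python has no hardware transactional memory, so the transaction is represented by a caller-supplied stub. The names `elide`, `attempt_transaction`, `TransactionAborted`, and `max_attempts` are all hypothetical.

```python
import threading


class TransactionAborted(Exception):
    """Raised by the hypothetical transaction stub to signal a conflict."""


def elide(lock, critical_section, attempt_transaction, max_attempts=3):
    """Try the critical section as a transaction, else fall back to the lock.

    attempt_transaction -- hypothetical stand-in for a TM runtime; it either
                           runs the section and returns its result, or raises
                           TransactionAborted.
    Returns (result, path), where path is "transaction" or "lock".
    """
    for _ in range(max_attempts):
        if lock.locked():
            # A thread on the fall-back path holds the lock, so a transaction
            # must not run concurrently; count this as a failed attempt.
            continue
        try:
            return attempt_transaction(critical_section), "transaction"
        except TransactionAborted:
            pass  # conflict: retry until the attempt budget is used up
    # Fall-back path: acquire the original lock and run non-speculatively.
    with lock:
        return critical_section(), "lock"


def always_abort(section):
    # Stub that models a transaction which conflicts every time.
    raise TransactionAborted()


def always_commit(section):
    # Stub that models a transaction which commits on the first try.
    return section()
```

Passing `always_commit` makes `elide` return on the transactional path, while `always_abort` exhausts the attempts and takes the lock.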

chapter

Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning

Hartwig Anzt, Jack Dongarra, Goran Flegar, Enrique S. Quintana-Orti

2017 46th International Conference on Parallel Processing (ICPP) > 91 - 100

We present a set of new batched CUDA kernels for the LU factorization of a large collection of independent problems of different sizes, and the subsequent triangular solves. All kernels heavily exploit the registers of the graphics processing unit (GPU) in order to deliver high performance for small problems. The development of these kernels is motivated by the need for tackling this embarrassingly parallel...

[Front cover]

[Title page i]

[Title page iii]

[Copyright notice]

Table of contents

Message from the General Co-Chairs

Message from the Program Co-Chairs

Program Committee

Organizing Committee

Reviewers

Preparing HPC Applications for the Exascale Era: A Decoupling Strategy

An Efficient, Distributed Stochastic Gradient Descent Algorithm for Deep-Learning Applications

Large-Scale Parallelization of Smoothed Particle Hydrodynamics Method on Heterogeneous Cluster

Boosting the Efficiency of HPCG and Graph500 with Near-Data Processing

GCN: GPU-Based Cube CNN Framework for Hyperspectral Image Classification

Nearly Balanced Work Partitioning for Heterogeneous Algorithms

GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations

Locality-Aware Dynamic Task Graph Scheduling

Practical Experience with Transactional Lock Elision

Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning
