Items from 1 to 20 out of 37 results

chapter

Parallel triangle counting and k-truss identification using graph-centric methods

Chad Voegele, Yi-Shan Lu, Sreepathi Pai, Keshav Pingali

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

2017 IEEE High Performance Extreme Computing Conference (HPEC)

We describe CPU and GPU implementations of parallel triangle-counting and k-truss identification in the Galois and IrGL systems. Both systems are based on a graph-centric abstraction called the operator formulation of algorithms. Depending on the input graph, our implementations are two to three orders of magnitude faster than the reference implementations provided by the IEEE HPEC static graph challenge.

chapter

OpenACC Cache Directive: Opportunities and Optimizations

Ahmad Lashgar, Amirali Baniasadi

2016 Third Workshop on Accelerator Programming Using Directives (WACCPD) > 46 - 56

2016 Third Workshop on Accelerator Programming Using Directives (WACCPD)

OpenACC's programming model presents a simple interface to programmers, offering a trade-off between performance and development effort. OpenACC relies on compiler technologies to generate efficient code and optimize for performance. Among the difficult-to-implement directives is the cache directive. The cache directive allows the programmer to utilize the accelerator's hardware- or software-managed...

chapter

Exploiting recent SIMD architectural advances for irregular applications

Linchuan Chen, Peng Jiang, Gagan Agrawal

2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) > 47 - 58

2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)

A broad class of applications involve indirect or data-dependent memory accesses and are referred to as irregular applications. Recent developments in SIMD architectures — specifically, the emergence of wider SIMD lanes, combination of SIMD parallelism with many-core MIMD parallelism, and more flexible programming APIs — are providing new opportunities as well as challenges for this class of applications...

chapter

Taguchi method or Compromise Programming as Robust Design optimization tool: The case of a flexible manufacturing system

Wa-Muzemba Anselm Tshibangu

2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO) > 2 > 485 - 492

2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO)

Competitive advantage of a firm is usually reflected through its superiority in production resources and performance outcomes. In order to achieve high performance (e.g., productivity) and significantly improve product quality, major US industries have promoted and implemented Robust Design (RD) techniques during the last decade. RD is a cost-effective procedure for determining the optimal settings...

chapter

A Branch-and-Bound Algorithm for Discrete Receive Beamforming with Improved Bounds

Johannes Israel, Andreas Fischer, John Martinovic

2015 IEEE International Conference on Ubiquitous Wireless Broadband (ICUWB) > 1 - 5

2015 IEEE International Conference on Ubiquitous Wireless Broadband (ICUWB)

We discuss the SINR-maximization problem for analog receive beamforming with finite resolution phase shifters and amplifiers. This discrete optimization problem can be globally solved by means of branch-and-bound. The performance of the algorithm depends on the quality of bounds for the subproblems which are determined within the branch-and-bound procedure. These subproblems can be generated by a...

chapter

Optimal array signaling for key establishment in static multipath channels

Rashid Mehmood, Jon W. Wallace, Michael A. Jensen

2015 9th European Conference on Antennas and Propagation (EuCAP) > 1 - 2

2015 9th European Conference on Antennas and Propagation (EuCAP)

This paper explores optimal array beamforming for secure communication in static multipath propagation environments. The problem is cast as a convex optimization of the average secure key rate achieved in the presence of a passive eavesdropper, and the optimization is performed using semidefinite programming. While representative results are presented for a uniform linear array, the optimization procedure...

chapter

Junction Optimization for Embedded 40nm FN/FN Flash Memory

Alessandro Baiano, Michiel van Duuren, Erik van der Vegt, Bob Schippers, more

2015 IEEE International Memory Workshop (IMW) > 1 - 4

2015 IEEE International Memory Workshop (IMW)

2-transistor (2T) cell technology used for embedded non-volatile memory (eNVM) has been scaled down to 40nm node. To enable aggressive cell scaling, the array architecture is modified compared to previous generations and the channel length of cell is drastically reduced requiring steep cell junctions, which give rise to new disturb phenomena. This paper describes how to safeguard the drain disturb...

chapter

Specializing Compiler Optimizations through Programmable Composition for Dense Matrix Computations

Qing Yi, Qian Wang, Huimin Cui

2014 47th Annual IEEE/ACM International Symposium on Microarchitecture > 596 - 608

2014 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

General-purpose compilers aim to extract the best average performance for all possible user applications. Due to the lack of specializations for different types of computations, compiler-attained performance often lags behind that of manually optimized libraries. In this paper, we demonstrate a new approach, programmable composition, to enable the specialization of compiler optimizations without...

chapter

Semi-automatic Tool to Ease the Creation and Optimization of GPU Programs

Jacob Jepsen

2014 43rd International Conference on Parallel Processing Workshops > 196 - 205

2014 43rd International Conference on Parallel Processing Workshops (ICPPW)

We present a tool that reduces the development time of GPU-executable code. We implement a catalogue of common optimizations specific to the GPU architecture. Through the tool, the programmer can semi-automatically transform a computationally-intensive code section into GPU-executable form and apply optimizations thereto. Based on experiments, the code generated by the tool can be 3-256X faster than...

chapter

Realizing Efficient Execution of Dataflow Actors on Manycores

Essayas Gebrewahid, Mingkun Yang, Gustav Cedersjo, Zain Ul Abdin, more

2014 12th IEEE International Conference on Embedded and Ubiquitous Computing > 321 - 328

2014 12th IEEE International Conference on Embedded and Ubiquitous Computing (EUC)

Embedded DSP computing is currently shifting towards manycore architectures in order to cope with the ever growing computational demands. Actor based dataflow languages are being considered as a programming model. In this paper we present a code generator for CAL, one such dataflow language. We propose to use a compilation tool with two intermediate representations. We start from a machine model of...

chapter

Array synthesis problems via convex relaxation

Benjamin Fuchs

2014 IEEE Antennas and Propagation Society International Symposium (APSURSI) > 1361 - 1362

2014 IEEE International Symposium on Antennas and Propagation & USNC/URSI National Radio Science Meeting

A general procedure, based on the SemiDefinite Relaxation (SDR) technique, is presented to solve efficiently a wide range of difficult (because non-convex) array synthesis problems. This powerful approximation technique is easy to implement and the so-approximated problem can then be efficiently solved using off-the-shelf numerical routines. Examples of shaped beam and reconfigurable array synthesis...

chapter

Locality-aware power optimization and measurement methodology for PGAS workloads on SMP clusters

David K. Newsom, Sardar F. Azari, Ahmad Anbar, Tarek El-Ghazawi

2013 International Green Computing Conference Proceedings > 1 - 10

2013 International Green Computing Conference (IGCC)

Reducing energy consumption without affecting computational performance is a significant research driver in computer engineering. The Partitioned Global Address Space (PGAS) programming model provides a global address space for ease-of-use while providing locality-awareness for efficient execution. For symmetric multiprocessor (SMP) clusters, PGAS locality-awareness offers opportunities for intelligent...

chapter

Automated Rapid Prototyping of Regular Grid-Based Numerical Applications Using Generalized Elemental Subroutines

Yingchong Situ, Ye Wang, Zhiyuan Li

2013 IEEE 27th International Symposium on Parallel and Distributed Processing > 284 - 294

2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Computational scientists and engineers commonly rely on established software libraries to achieve high performance and reliability in their numerical applications. Unfortunately, this approach does not work well if the desired functionality is absent in existing libraries or if the integration is difficult. In such scenarios, one is often forced to explore alternative algorithms and in-house implementations...

chapter

A Transparent Collective I/O Implementation

Yongen Yu, Jingjin Wu, Zhiling Lan, Douglas H. Rudd, more

2013 IEEE 27th International Symposium on Parallel and Distributed Processing > 297 - 307

2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

I/O performance is vital for most HPC applications, especially those that generate a vast amount of data with the growth of scale. Many studies have shown that scientific applications tend to issue small and noncontiguous accesses in an interleaving fashion, causing different processes to access overlapping regions. In such a scenario, collective I/O is a widely used optimization technique. However,...

chapter

Program disturbs and process optimization in a 65 nm Flash FPGA

James Yingbo Jia, Pavan Singaraju, Habtom Micael, Patty Liu, more

2012 IEEE International Integrated Reliability Workshop Final Report > 117 - 118

2012 IEEE International Integrated Reliability Workshop (IIRW)

We present studies of an extrinsic program disturb mechanism in a Field Programmable Gate Array (FPGA) fabricated with a 65 nm embedded-Flash process. It is concluded that multiple positive charges are involved during disturb to explain the observed extrinsic behavior. Its failure rate was improved with tunnel oxidation process tuning and stronger pre-oxidation cleans.

chapter

A Compiler-Based Tool for Array Analysis in HPC Applications

Ahmad Qawasmeh, Barbara Chapman, Amrita Banerjee

2012 41st International Conference on Parallel Processing Workshops > 454 - 463

2012 41st International Conference on Parallel Processing Workshops (ICPPW)

Array region analysis plays a significant role in various optimizations at compile time. Displaying array access information efficiently in HPC applications has been a vital challenge for scientists and developers for the past few years. The Dragon array region analysis tool is a powerful and interactive tool that was built on top of the OpenUH compiler, an open source C/C++/Fortran compiler, that supports...

chapter

Can traditional programming bridge the Ninja performance gap for parallel computing applications?

Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, more

2012 39th Annual International Symposium on Computer Architecture (ISCA) > 440 - 451

2012 ACM/IEEE 39th International Symposium on Computer Architecture (ISCA)

Current processor trends of integrating more cores with wider SIMD units, along with a deeper and complex memory hierarchy, have made it increasingly more challenging to extract performance from applications. It is believed by some that traditional approaches to programming do not apply to these modern processors and hence radical new languages must be discovered. In this paper, we question this thinking...

chapter

A Polyhedral Modeling Based Source-to-Source Code Optimization Framework for GPGPU

Chenxi Wang, Kang Kang, Maohua Zhu, Yangdong Deng

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 1964 - 1970

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

In this paper, we propose a source-to-source code optimization framework for general purpose computing on graphics processing units (GPGPU). Our framework is based on a re-formulation of the polyhedral loop transformation theory under the context of GPGPU. We prove that the number of actual memory transactions can be used as a performance metric to guide the code optimization process. In addition,...

article

Phase Only Antenna Pattern Notching Via a Semidefinite Programming Relaxation

Peter J. Kajenski

IEEE Transactions on Antennas and Propagation > 2012 > 60 > 5 > 2562 - 2565

A phase-only method of generating notches in the beam pattern of a phased array antenna is described. It is shown how the problem can be formulated as a semidefinite programming problem, and solved with readily available solvers. Numerical results simulating the performance of a 32 element uniform linear array are presented.

chapter

PACOGEN: Automatic Generation of Pairwise Test Configurations from Feature Models

Aymeric Hervieu, Benoit Baudry, Arnaud Gotlieb

2011 IEEE 22nd International Symposium on Software Reliability Engineering > 120 - 129

2011 IEEE 22nd International Symposium on Software Reliability Engineering (ISSRE)

Feature models are commonly used to specify variability in software product lines. Several tools support feature models for variability management at different steps in the development process. However, tool support for test configuration generation is currently limited. This test generation task consists in systematically selecting a set of configurations that represent a relevant sample of the variability...

Keywords:
ARRAYS
OPTIMIZATION
PROGRAMMING

Publication type

book (35)
article (2)

Keywords

PARALLEL PROCESSING (8)
INDEXES (5)
INSTRUCTION SETS (5)
LIBRARIES (5)
ALGORITHM DESIGN AND ANALYSIS (4)
DATA MINING (4)
HARDWARE (4)
KERNEL (4)
RUNTIME (4)
ANTENNAS (3)
COMPUTATIONAL MODELING (3)
EVOLUTIONARY COMPUTATION (3)
GRAPHICS PROCESSING UNIT (3)
GRAPHICS PROCESSING UNITS (3)
OPTIMISATION (3)
ACCELERATION (2)
ALGORITHMS (2)
ANTENNA ARRAYS (2)
ANTENNA RADIATION PATTERNS (2)
ARRAY SIGNAL PROCESSING (2)
COMPUTE UNIFIED DEVICE ARCHITECTURE (2)
COMPUTER ARCHITECTURE (2)
CORRELATION (2)
CUDA (2)
EDGE DETECTION (2)
EVOLUTIONARY OPTIMIZATION (2)
MICROPROCESSORS (2)
MULTIPROCESSING SYSTEMS (2)
NONLINEAR PROGRAMMING (2)
OPTIMISING COMPILERS (2)
PARALLEL PROGRAMMING (2)
PARTICLE SWARM OPTIMISATION (2)
PARTICLE SWARM OPTIMIZATION (2)
PROGRAM COMPILERS (2)
ROBUSTNESS (2)
SIGNAL TO NOISE RATIO (2)
SIMD (2)
VECTORS (2)
ABSTRACT PROGRAMMING (1)
ABSTRACT SOURCE PROGRAMS (1)
ADAPTIVE ARRAY (1)
ADAPTIVE RADAR (1)
ANALYSIS TOOL (1)
APPROXIMATION METHODS (1)
ARBB (1)
ARBITRARY ARRAY IMPERFECTIONS (1)
ARRAY REGION ANALYSIS (1)
ARTIFICIAL INTELLIGENCE (1)
ASSOCIATE HASH FUNCTIONS (1)
AUTOMATIC PROGRAMMING (1)
AUTOTUNING (1)
AVAILABILITY (1)
BANDWIDTH (1)
BEAMFORMING (1)
BENCHMARK TEST FUNCTIONS (1)
BENCHMARK TESTING (1)
BIOLOGICAL CELLS (1)
BIOLOGICAL SYSTEM MODELING (1)
BRUTE FORCE ATTACK (1)
C++ LANGUAGE (1)
CACHE MEMORY (1)
CAL (1)
CANNY EDGE DETECTION (1)
CCELL (1)
CELL BE ARCHITECTURE (1)
CELL PROGRAMMING (1)
CHALCOGENIDE (1)
CIRCUIT RELIABILITY (1)
CIRCULAR ARRAYS (1)
CODE GENERATION (1)
CODE OPTIMIZATIONS (1)
CODE TRANSFORMATIONS (1)
COLLECTIVE I/O (1)
COMPILATION (1)
COMPILATION FRAMEWORK (1)
COMPILER ANALYSIS AND OPTIMIZATION (1)
COMPILER OPTIMIZATIONS (1)
COMPILER-BASED TOOL (1)
COMPROMISE PROGRAMMING (1)
COMPUTATIONAL COMPLEXITY (1)
COMPUTER GRAPHICS (1)
COMPUTER SCIENCE (1)
COMPUTER VISION (1)
COMPUTERS AND INFORMATION PROCESSING (1)
COMPUTEUNIFIEDDEVICE ARCHITECTURE (1)
CONSTRAINTS ON MAGNITUDE RESPONSE (1)
CONVEX FUNCTIONS (1)
CONVEX OPTIMIZATION (1)
CROSSOVER OPERATOR (1)
CUDA ARCHITECTURE (1)
CYCLING ENDURANCE (1)
DATA LOCALITY-AWARENESS (1)
DATA MODELS (1)
DATAFLOW LANGUAGES (1)
DEBUG APPLICATIONS (1)
DECISION MAKING (1)
DECISION THEORY (1)

INFONA - science communication portal
