Search results

chapter

Hybrid.poly: An Interactive Large-Scale In-memory Analytical Polystore

Maksim Podkorytov, Dylan Soderman, Michael Gubanov

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 43 - 50

Anecdotal evidence suggests that the variety of Big data is one of the most challenging problems in Computer Science research today [Stonebraker, 2012], [Ou et al., 2017], [Guo et al., 2016], [Bai et al., 2016]. First, Big data comes at us from a myriad of data sources, hence its shape and flavor differ. Second, hundreds of data management systems which work with Big data support different APIs and...

chapter

Compression experiments on term-document index

Murat Cihan Sorkun, Can Ozbey

2017 International Conference on Computer Science and Engineering (UBMK) > 435 - 439

The increase in the size of the data used in natural language processing activities brings with it time and space constraints. Thus, it is important to both store and access data efficiently. This study includes experiments for storing the term-document index, which will be used in a natural language processing project, effectively in memory. For this purpose, the indexed data is compressed using...

chapter

Body bias optimization for variable pipelined CGRA

Takuya Kojima, Naoki Ando, Hayate Okuhara, Ng. Anh Vu Doan, et al.

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Variable Pipeline Cool Mega Array (VPCMA) is a low-power Coarse Grained Reconfigurable Architecture (CGRA) based on the concept of CMA (Cool Mega Array). It implements a pipeline structure that can be configured depending on performance requirements, and uses silicon on thin buried oxide (SOTB) technology, which allows its body bias voltage to be controlled to balance performance and leakage power. In this...

chapter

Performance Modeling for Optimal Data Placement on GPU with Heterogeneous Memory Systems

Yingchao Huang, Dong Li

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 166 - 177

A heterogeneous memory system (HMS) consists of multiple memory components with different properties. GPU is a representative architecture with HMS. It is challenging to decide optimal placement of data objects on HMS because of the large exploration space and complicated memory hierarchy on HMS. In this paper, we introduce performance modeling techniques to predict performance of various data placements...

chapter

Parallel triangle counting and k-truss identification using graph-centric methods

Chad Voegele, Yi-Shan Lu, Sreepathi Pai, Keshav Pingali

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

We describe CPU and GPU implementations of parallel triangle-counting and k-truss identification in the Galois and IrGL systems. Both systems are based on a graph-centric abstraction called the operator formulation of algorithms. Depending on the input graph, our implementations are two to three orders of magnitude faster than the reference implementations provided by the IEEE HPEC static graph challenge.

chapter

Genetic algorithm optimization applied to the project of MIMO systems

Israel A. C. Leal, Marcelo S. Alencar, Waslon Terllizzie A. Lopes

2017 25th International Conference on Software, Telecommunications and Computer Networks (SoftCOM) > 1 - 5

This paper presents a technique to increase the data throughput in Multiple Input Multiple Output (MIMO) systems. The technique uses a meta-heuristic to optimize the data throughput and to choose the best solution. The optimization is based on Genetic Algorithms (GA), with the objective of finding the best antenna configuration to achieve the highest data throughput from the variation of the distance...

chapter

Parameter identification of photovoltaic cell based on improved recursive least square method

Yan Xu, Weijia Jin, Xiaorong Zhu

2017 20th International Conference on Electrical Machines and Systems (ICEMS) > 1 - 5

The photovoltaic (PV) array is one of the main components of the PV system, and the accuracy of the PV array model is directly related to the validity of the simulation results. The parameters of the PV array may change with the operating conditions. Therefore, it is important to identify the parameters of the PV array model according to the measured data. In this paper, the conventional four-parameter...

chapter

A mail based recommender system

G. Kaushik Ram, N. Sai Kiran, S. Sudha

2017 IEEE International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials (ICSTM) > 75 - 81

The objective of the work is to provide a recommender system through optimization. The users place orders to suppliers through email, requesting preferences based on cost and delivery date. Optimization algorithms are formulated to provide an optimal mix of the products to the users based on the cost and speed of delivery. The algorithm takes the availability and cost of the products with the suppliers...

chapter

Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms for Multi/Many-Core Architectures

Fabio Baruffa, Luigi Iapichino, Nicolay J. Hammer, Vasileios Karakasis

2017 International Conference on High Performance Computing & Simulation (HPCS) > 381 - 388

We describe a strategy for code modernisation of Gadget, a widely used community code for computational astrophysics. The focus of this work is on node-level performance optimisation, targeting current multi/many-core Intel® architectures. We identify and isolate a sample code kernel, which is representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm. The code modifications include...

chapter

Efficient Data Structures for a Hybrid Parallel and Vectorized Particle-in-Cell Code

Yann Barsamian, Sever A. Hirstoaga, Eric Violard

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1168 - 1177

The contribution of the present work relies on an innovative and judicious combination of several optimization techniques for achieving high performance when using automatic vectorization and hybrid MPI/OpenMP parallelism in a Particle-in-Cell (PIC) code. The domain of application is plasma physics: the code simulates 2d2v Vlasov-Poisson systems on Cartesian grids with periodic boundary conditions...

chapter

Use of Synthetic Benchmarks for Machine-Learning-Based Performance Auto-Tuning

Tianyi David Han, Tarek S. Abdelrahman

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1350 - 1361

We explore the use of synthetic benchmarks for the training phase of machine-learning-based automatic performance tuning. We focus on the problem of predicting if the use of local memory on a GPU is beneficial for caching a single target array in a GPU kernel. We show that the use of only 13 real benchmarks leads to poor prediction accuracy (about 58%) of the 13 leave-one-out models trained using...

chapter

Portable high-performance software design using templated meta-programming for EM calculations

Jamie Infantolino, James Ross, David Richie

2017 International Applied Computational Electromagnetics Society Symposium - Italy (ACES) > 1 - 2

The Finite Difference Time Domain (FDTD) Method is used for full-wave electromagnetic (EM) simulations. FDTD is computationally intensive with performance depending critically on architecture-specific optimizations that have become more challenging given the rapidly changing architectures in modern high-performance computing platforms. We examine a templated meta-programming technique to implement...

chapter

Online failure detection in large massive MIMO linear arrays

Daniele Finchera, Marco Donald Migliore, Mario Lucido, Fulvio Schettino, et al.

2017 International Applied Computational Electromagnetics Society Symposium - Italy (ACES) > 1 - 2

In this paper we discuss the feasibility of an online failure detection algorithm for large linear arrays for massive MIMO applications. By means of a numerical optimization of the position of a few near-field probes, a good failure detection performance has been verified.

chapter

Redundancy elimination revisited

Keith Cooper, Jason Eckhardt, Ken Kennedy

2008 International Conference on Parallel Architectures and Compilation Techniques (PACT) > 12 - 21

This work proposes and evaluates improvements to previously known algorithms for redundancy elimination.

chapter

LIFT: A functional data-parallel IR for high-performance GPU code generation

Michel Steuwer, Toomas Remmelg, Christophe Dubach

2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) > 74 - 85

Parallel patterns (e.g., map, reduce) have gained traction as an abstraction for targeting parallel accelerators and are a promising answer to the performance portability problem. However, compiling high-level programs into efficient low-level parallel code is challenging. Current approaches start from a high-level parallel IR and proceed to emit GPU code directly in one big step. Fixed strategies...

chapter

A Split Cache Hierarchy for Enabling Data-Oriented Optimizations

Andreas Sembrant, Erik Hagersten, David Black-Schaffer

2017 IEEE International Symposium on High Performance Computer Architecture (HPCA) > 133 - 144

Today's caches tightly couple data with metadata (Address Tags) at the cache line granularity. The co-location of data and its identifying metadata means that they require multiple approaches to locate data (associative way searches and level-by-level searches), evict data (coherent writeback buffers and associative level-by-level searches) and keep data coherent (directory indirections and associative...

chapter

Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs

Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, et al.

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

Convolutional neural networks (CNNs) have been widely applied in many deep learning applications. In recent years, the FPGA implementation of CNNs has attracted much attention because of its high performance and energy efficiency. However, existing implementations have difficulty fully leveraging the computation power of the latest FPGAs. In this paper we implement CNN on an FPGA using a systolic...

chapter

Optimal design of photovoltaic power system for a residential load

Hardeep Singh, Daljeet Kaur, P. S. Cheema

2017 International Conference on Inventive Systems and Control (ICISC) > 1 - 4

In most developing countries, the system designed should be optimized for the proper utilization of solar energy. This paper presents the design and analysis of a photovoltaic (PV) system to supply electricity. A stand-alone optimization simulation model is developed using the HOMER software. The radiation data for Baru Sahib, H.P. is collected for every month of the year. The simulation model is used...

chapter

To use or not to use: CPUs' cache optimization techniques on GPGPUs

D.R.V.L.B. Thambawita, Roshan G. Ragel, Dhammike Elkaduwe

2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS) > 1 - 6

The General Purpose Graphics Processing Unit (GPGPU) is widely used to achieve high performance or high throughput in parallel programming. This capability of GPGPUs is well known and is mostly used for scientific computing, which requires more processing power than ordinary personal computers provide. Therefore, most programmers, researchers and industry users apply this new concept in their work...

chapter

Level-Synchronous BFS Algorithm Implemented in Java Using PCJ Library

Magdalena Ryczkowska, Marek Nowicki, Piotr Bala

2016 International Conference on Computational Science and Computational Intelligence (CSCI) > 596 - 601

Graph processing is used in many fields of science such as sociology, risk prediction or biology. Although analysis of graphs is important, it also poses numerous challenges, especially for large graphs which have to be processed on multicore systems. In this paper, we present a PGAS (Partitioned Global Address Space) version of the level-synchronous BFS (Breadth First Search) algorithm and its implementation...

INFONA - science communication portal
