Search results

APPROX-NoC: A data approximation framework for Network-on-Chip architectures

Rahul Boyapati, Jiayi Huang, Pritam Majumder, Ki Hwan Yum, et al.

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 666 - 677

The trend of unsustainable power consumption and large memory bandwidth demands in massively parallel multicore systems, with the advent of the big data era, has brought about alternate computation paradigms utilizing heterogeneity, specialization, processor-in-memory and approximation. Approximate Computing is being touted as a viable solution for high performance computation by relaxing...

The mondrian data engine

Mario Drumond, Alexandros Daglis, Nooshin Mirzadeh, Dmitrii Ustiugov, et al.

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 639 - 651

The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. As the performance density of traditional CPU-centric architectures stagnates, advancing compute capabilities necessitates novel architectural approaches. Near-memory processing (NMP) architectures are reemerging as promising candidates...

HeteroOS — OS design for heterogeneous memory management in datacenter

Sudarsun Kannan, Ada Gavrilovska, Vishal Gupta, Karsten Schwan

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 521 - 534

Heterogeneous memory management combined with server virtualization in datacenters is expected to increase the software and OS management complexity. State-of-the-art solutions rely exclusively on the hypervisor (VMM) for expensive page hotness tracking and migrations, limiting the benefits from heterogeneity. To address this, we design HeteroOS, a novel application-transparent OS-level solution for...

Fast Binary Descriptor Search for Keypoint Matching by Norm Ordering

Masahiko Sugimura, Takayuki Baba, Ryuta Tanaka

2017 IEEE International Symposium on Multimedia (ISM) > 561 - 566

Keypoint matching between images is an important technique for computer vision applications such as image retrieval. Although binary feature descriptors such as BRIEF enable fast measurement of distance, exhaustive search is still time-consuming. Hashing methods such as Locality Sensitive Hashing (LSH), while being effective to accelerate searching, result in large memory consumption and thus are...

Rebooting the Data Access Hierarchy of Computing Systems

Wen-mei W. Hwu, Izzat El Hajj, Simon Garcia de Gonzalo, Carl Pearson, et al.

2017 IEEE International Conference on Rebooting Computing (ICRC) > 1 - 4

We have been experiencing two very important movements in computing. On the one hand, a tremendous amount of resources has been invested into innovative applications such as first-principle-based methods, deep learning and cognitive computing. On the other hand, the industry has been taking a technological path where application performance and energy efficiency vary by more than two orders of magnitude...

Design of Safety PLC Execution Unit Based on Redundancy Structure of Heterogeneous Dual-Processor

Yue Ma, Mingshi Li, Zhenyu Yin, Mengjia Lian

2017 10th International Conference on Intelligent Computation Technology and Automation (ICICTA) > 364 - 368

A traditional PLC system has no perfect mechanism to ensure the safety of control in the event of a system failure. A safety PLC based on a heterogeneous dual-processor redundant structure can meet the requirements of industrial control for equipment and personnel and improve the reliability of industrial control. In this paper, according to the operation requirements of safety PLC based on heterogeneous...

Cross-layer refresh mitigation for efficient and reliable DRAM systems: A comparative study

Xiaoan Ding, Xi Liang, Yanjing Li

2017 IEEE International Test Conference (ITC) > 1 - 10

DRAM is a crucial component in computing systems, and is expected to be even more important as data-intensive applications become more prominent. A key challenge in advancing DRAM technology is the growing cost of refresh operations, which can impose a large impact on the energy efficiency of DRAM modules. Existing refresh mitigation techniques all require hardware modifications, which may be undesirable...

Side-channels beyond the cloud edge: New isolation threats and solutions

Mohammad-Mahdi Bazm, Marc Lacoste, Mario Sudholt, Jean-Marc Menaud

2017 1st Cyber Security in Networking Conference (CSNet) > 1 - 8

Fog and edge computing leverage resources of end users and edge devices rather than centralized clouds. Isolation is a core security challenge for such paradigms: just like traditional clouds, fog and edge infrastructures are based on virtualization to share physical resources among several self-contained execution environments like virtual machines and containers. Yet, isolation may be threatened...

Evaluating irregular memory access on OpenCL FPGA platforms: A case study with XSBench

Yingyi Luo, Xianshan Wen, Kazutomo Yoshii, Seda Ogrenci-Memik, et al.

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

FPGAs are becoming an attractive choice as a heterogeneous computing unit for scientific computing because FPGA vendors are adding floating-point-optimized architectures to their product lines. Additionally, high-level synthesis (HLS) tools such as Altera OpenCL SDK are emerging, which could potentially break the FPGA programming wall and provide a streamlined flow for domain experts in scientific...

One size does not fit all: Implementation trade-offs for iterative stencil computations on FPGAs

Gael Deest, Tomofumi Yuki, Sanjay Rajopadhye, Steven Derrien

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 8

Iterative stencils are kernels in various application domains, such as numerical simulations and medical imaging, that merit FPGA acceleration. The best architecture depends on many factors such as the target platform, off-chip memory bandwidth, problem size, and performance requirements. We generate a family of FPGA stencil accelerators targeting emerging System on Chip platforms (e.g., Xilinx Zynq...

Mapping of P4 match action tables to FPGA

Michal Kekely, Jan Korenek

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 2

Current networks are changing very fast. Network administrators need more flexible and powerful tools to support new protocols or services quickly. The P4 language provides a new level of abstraction for flexible packet processing. Therefore, we have designed a new architecture for memory-efficient mapping of P4 match/action tables to FPGA. The architecture is based on the DCFL algorithm and...

Analyzing Hybrid Transactional Memory Performance Using Intel SDE

Mohammad A. Qayum, Abdel-Hameed A. Badawy, Jeanine Cook

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 627 - 628

Due to the rapidly increasing use of big data, machines are stressed to provide more computing power at higher energy efficiency while maintaining simpler and more scalable computing paradigms. Transactional Memory (TM) is one such technique that can be used for synchronization instead of conventional locks used in critical sections since it has simpler paradigms, is scalable and has better energy...

A Novel Hybrid Transactional Memory Based on Abort Prediction and Adaptive Retry Policy

Young-Sung Shin, Yeon-Woo Jang, Moon-Hwan Kang, Jae-Woo Chang

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 613 - 614

This paper proposes a novel hybrid transactional memory scheme based on both abort prediction and an adaptive retry policy. First, the proposed scheme can predict not only conflicts between transactions running concurrently, but also the capacity and other aborts of transactions by collecting the information of previously executed transactions. Second, the proposed scheme can provide an adaptive retry...

OMBM: Optimized Memory Bandwidth Management for Ensuring QoS and High Server Utilization

Hanul Sung, Jeesoo Min, Sujin Ha, Hyeonsang Eom

2017 IEEE 2nd International Workshops on Foundations and Applications of Self* Systems (FAS*W) > 269 - 276

Latency-critical workloads such as web search engines, social networks and finance market applications are sensitive to tail latencies for meeting Service Level Objectives (SLOs). Since unexpected tail latencies are caused by sharing hardware resources with other co-executing workloads, a service provider executes the latency-critical workload alone. Thus, the data center for the latency-critical...

ConVGPU: GPU Management Middleware in Container Based Virtualized Environment

Daeyoun Kang, Tae Joon Jun, Dohyeun Kim, Jaewook Kim, et al.

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 301 - 309

Nowadays, the Graphics Processing Unit (GPU) is essential for general-purpose high-performance computing because of its dominant performance in parallel computing compared to that of the CPU. There have been many successful trials of GPU use in virtualized environments. In particular, NVIDIA Docker offers a most practical way to bring the GPU into the container-based virtualized environment. However, most...

Evaluating the Viability of Using Compression to Mitigate Silent Corruption of Read-Mostly Application Data

Scott Levy, Kurt B. Ferreira, Patrick G. Bridges

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 603 - 607

Aggregating millions of hardware components to construct an exascale computing platform will pose significant resilience challenges. In addition to slowdowns associated with detected errors, silent errors are likely to further degrade application performance. Moreover, silent data corruption (SDC) has the potential to undermine the integrity of the results produced by important scientific applications...

Utility-Based Hybrid Memory Management

Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, et al.

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 152 - 165

While the memory footprints of cloud and HPC applications continue to increase, fundamental issues with DRAM scaling are likely to prevent traditional main memory systems, composed of monolithic DRAM, from greatly growing in capacity. Hybrid memory systems can mitigate the scaling limitations of monolithic DRAM by pairing together multiple memory technologies (e.g., different types of DRAM, or DRAM...

Application-Based Fault Tolerance Techniques for Fully Protecting Sparse Matrix Solvers

Grzegorz Pawelczak, Simon McIntosh-Smith, James Price, Matt Martineau

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 733 - 740

The continuous growth of high-performance computing (HPC) systems has led to Fault Tolerance (FT) being identified as one of the major challenges for exascale computing, due to the expected decrease in Mean Time Between Failures (MTBF). One source of faults is soft errors, which can cause bit corruptions to the data held in memory. Current solutions for protection against these errors include hardware...

SELF: A High Performance and Bandwidth Efficient Approach to Exploiting Die-Stacked DRAM as Part of Memory

Yuhua Guo, Qing Liu, Weijun Xiao, Ping Huang, et al.

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) > 187 - 197

Die-stacked DRAM (a.k.a., on-chip DRAM) provides much higher bandwidth and lower latency than off-chip DRAM. It is a promising technology to break the "memory wall". Die-stacked DRAM can be used either as a cache (i.e., DRAM cache) or as a part of memory (PoM). A DRAM cache design would suffer from more page faults than a PoM design as the DRAM cache cannot contribute towards capacity of...

Sparse matrix assembly on the GPU through multiplication patterns

Rhaleb Zayer, Markus Steinberger, Hans-Peter Seidel

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 8

The numerical treatment of variational problems gives rise to large sparse matrices, which are typically assembled by coalescing elementary contributions. As the explicit matrix form is required by numerical solvers, the assembly step can be a potential bottleneck, especially in implicit and time dependent settings where considerable updates are needed. On standard HPC platforms, this process can...
