The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. As the performance density of traditional CPU-centric architectures stagnates, advancing compute capabilities necessitates novel architectural approaches. Near-memory processing (NMP) architectures are reemerging as promising candidates...
This paper investigates compression for DRAM caches. As the capacity of DRAM cache is typically large, prior techniques on cache compression, which solely focus on improving cache capacity, provide only a marginal benefit. We show that more performance benefit can be obtained if the compression of the DRAM cache is tailored to provide higher bandwidth. If a DRAM cache can provide two compressed lines...
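The bandwidth argument above can be illustrated with a minimal sketch (purely hypothetical, not the paper's actual design): if two compressed lines fit within one 64-byte burst, a single DRAM access returns both lines, doubling effective bandwidth. The `pack_lines` helper and the use of `zlib` as the compressor are illustrative assumptions.

```python
import zlib

LINE_SIZE = 64  # bytes per cache line and per DRAM burst


def pack_lines(line_a: bytes, line_b: bytes):
    """Try to pack two compressed 64B lines into one 64B burst.

    Returns (burst_payload, split_offset) if both compressed lines
    fit in a single burst, else None (fall back to one line/access).
    """
    ca, cb = zlib.compress(line_a), zlib.compress(line_b)
    if len(ca) + len(cb) <= LINE_SIZE:
        return ca + cb, len(ca)
    return None


# Highly compressible neighbouring lines pack into one access.
a = bytes(64)      # all-zero line
b = b"\x01" * 64   # uniform line
packed = pack_lines(a, b)
```

When `pack_lines` returns `None`, a real design would simply fetch each line with its own access, so compressibility only ever helps.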
Caches are traditionally organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, since working sets settle at the smallest (i.e., fastest and most energy-efficient) level they fit in. However, rigid hierarchies also add overheads, because each level adds latency and energy even...
Heterogeneous memory management combined with server virtualization in datacenters is expected to increase the software and OS management complexity. State-of-the-art solutions rely exclusively on the hypervisor (VMM) for expensive page hotness tracking and migrations, limiting the benefits from heterogeneity. To address this, we design HeteroOS, a novel application-transparent OS-level solution for...
In this paper, an advanced joint jammer combining deceptive jamming and blanket jamming is proposed for countering linear frequency modulated (LFM) radar. Multiple leading and lagging false targets are produced by a convolution operation, and the blanket jamming is obtained by modulating the intercepted hostile radar signal according to pseudo-random sequences. The jamming algorithm is implemented on...
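The convolution-based false-target mechanism the abstract describes can be sketched as follows (a toy model, not the paper's implementation): convolving the intercepted pulse with a sparse impulse train yields one delayed, scaled copy — i.e., one false target — per impulse. The pulse parameters and delay/gain values are illustrative.

```python
import math


def lfm_pulse(n=64, bandwidth=0.25):
    """Toy linear-frequency-modulated pulse (real part only)."""
    return [math.cos(math.pi * bandwidth * t * t / n) for t in range(n)]


def false_targets(signal, delays, gains):
    """Convolve the intercepted pulse with a sparse impulse train.

    Each (delay, gain) impulse contributes one shifted, scaled copy
    of the pulse, which a matched filter sees as a false target.
    """
    out = [0.0] * (len(signal) + max(delays))
    for d, g in zip(delays, gains):
        for i, s in enumerate(signal):
            out[i + d] += g * s
    return out


# One true-range copy plus two lagging false targets.
jam = false_targets(lfm_pulse(), delays=[0, 20, 45], gains=[1.0, 0.8, 0.6])
```

Negative delays (leading false targets) would be handled the same way after shifting the time origin.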
Modern multi-core systems employ shared memory architecture, entailing problems related to the main memory such as row-buffer conflicts, time-varying hot-spots across memory channels, and superfluous switches between reads and writes originating from different cores. There have been proposals to solve these problems by partitioning main memory across banks and/or channels such that a DRAM bank is...
The VMware ESXi hypervisor attracts a wide range of customers and is deployed in domains ranging from desktop computing to server computing. While software systems are increasingly moving towards consolidation, hardware has already transitioned into multi-socket Non-Uniform Memory Access (NUMA)-based systems. The marriage of increasing consolidation and multi-socket based systems warrants low-overhead,...
As servers are equipped with more memory modules, each with larger capacity, main-memory systems are now the second-highest energy-consuming component in big-memory servers, and their energy consumption even becomes comparable to that of processors in some servers. It is therefore critical for big-memory servers and their main-memory systems to offer high energy efficiency. Prior work exploited mobile LPDDR...
Three-dimensional (3D)-stacking technology, which enables the integration of DRAM and logic dies, offers high bandwidth and low energy consumption. This technology also empowers new memory designs for executing tasks not traditionally associated with memories. A practical 3D-stacked memory is Hybrid Memory Cube (HMC), which provides significant access bandwidth and low power consumption in a small...
The reactive model of Software Defined Networking (SDN) invokes the controller to dynamically determine the behavior of a new flow without any pre-installed knowledge in the data plane. However, the reactive events raised by this flexible model consume substantial amounts of two bottleneck resources: the fast memory in switches and the bandwidth between the controller and switches. To address this problem, we propose SoftRing...
FPGAs are being incorporated into contemporary datacenters in order to improve computational capacity and reduce power consumption and processing latency. Efficiently integrating FPGAs in datacenters is, however, quite challenging. Ideally, smaller tasks could share a device and the cloud management layer would be able to partially reconfigure the device to allocate its free resources to incoming tasks. Moreover,...
Fault tolerance is one of the major design goals for HPC. The emergence of non-volatile memories (NVM) provides a way to build fault-tolerant HPC systems. Data in NVM-based main memory are not lost when the system crashes because of the non-volatile nature of NVM. However, because of volatile caches, data must be logged and explicitly flushed from caches into NVM to ensure consistency and correctness...
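The log-then-flush ordering the abstract alludes to can be sketched as a simple undo log (an illustrative model, not the paper's scheme). Here `flush` is a placeholder for a cache-line writeback and fence (e.g. `clwb` + `sfence` on real persistent-memory hardware); the class and method names are hypothetical.

```python
class UndoLogNVM:
    """Toy undo-log transaction on a dict standing in for NVM."""

    def __init__(self):
        self.data = {}   # "NVM" home locations
        self.log = []    # persistent undo log

    def flush(self, region):
        # Placeholder: on real hardware this would be clwb + sfence
        # to force dirty cache lines back to the NVM media.
        pass

    def tx_update(self, key, new_value):
        old = self.data.get(key)
        self.log.append((key, old))  # 1. write the undo record
        self.flush(self.log)         # 2. persist log BEFORE data
        self.data[key] = new_value   # 3. update in place
        self.flush(self.data)        # 4. persist the data
        self.log.pop()               # 5. commit: retire log entry


nvm = UndoLogNVM()
nvm.tx_update("x", 42)
```

The ordering in steps 2 and 4 is the whole point: if a crash hits between them, recovery replays the persisted undo record to roll the update back.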
The Network Functions Virtualization (NFV) paradigm offers network operators benefits in terms of cost efficiency, vendor independence, as well as flexibility and scalability. However, in order to profit most from these features, new challenges in the area of management and orchestration of the virtual network functions (VNFs) need to be addressed. In particular, this work deals with the VNF chain...
The increase in memory capacity lags substantially behind the increase in computing power in today's supercomputers. To alleviate the effect of this gap, diverse options such as NVM (non-volatile memory; less expensive but slow) and HBM (high-bandwidth memory; fast but expensive) are being explored. In this paper, we present a common approach using parallel runtime techniques for utilizing...
A comparison of the most mature and promising emerging memory technologies with respect to mainstream NAND and DRAM, and the challenges for their introduction to the market for high-density applications.
Some modern high-level synthesis (HLS) tools [1] permit the synthesis of multi-threaded software into parallel hardware, where concurrent software threads are realized as concurrently operating hardware units. A common performance bottleneck in any parallel implementation (whether it be hardware or software) is memory bandwidth — parallel threads demand concurrent access to memory resulting in contention...
The architecture of the Microsoft Catapult II cloud places the accelerator (FPGA) as a bump-in-the-wire on the way to the network and thus promises a dramatic reduction in latency as layers of hardware and software are avoided. We demonstrate this capability with an implementation of the 3D FFT. Next we examine phased application elasticity, i.e., the use of a reduced set of nodes for some phases...
Die-stacked DRAM (a.k.a., on-chip DRAM) provides much higher bandwidth and lower latency than off-chip DRAM. It is a promising technology to break the "memory wall". Die-stacked DRAM can be used either as a cache (i.e., DRAM cache) or as a part of memory (PoM). A DRAM cache design would suffer from more page faults than a PoM design as the DRAM cache cannot contribute towards capacity of...
Recently, architectures with scratchpad memory are gaining popularity. These architectures consist of low bandwidth, large capacity DRAM and high bandwidth, user addressable small capacity scratchpad. Existing algorithms must be redesigned to take advantage of the high bandwidth while overcoming the constraint on capacity of scratchpad. In this paper, we propose an optimized edge-centric graph processing...
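The edge-centric pattern the abstract describes can be illustrated with a toy sweep (an assumed formulation, not the paper's algorithm): the large vertex array stays in slow DRAM while edges stream through a scratchpad-sized block, turning random edge accesses into sequential, bandwidth-friendly scans. The block size and the single-source shortest-path update rule are illustrative choices.

```python
SCRATCHPAD_EDGES = 4  # pretend only 4 edges fit in the fast scratchpad


def edge_centric_sssp(num_vertices, edges, source):
    """Bellman-Ford-style relaxation over a streamed edge list."""
    dist = [float("inf")] * num_vertices  # vertex array lives in DRAM
    dist[source] = 0
    changed = True
    while changed:
        changed = False
        # Stream the edge list one scratchpad-sized block at a time.
        for i in range(0, len(edges), SCRATCHPAD_EDGES):
            for u, v, w in edges[i:i + SCRATCHPAD_EDGES]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    changed = True
    return dist


d = edge_centric_sssp(4, [(0, 1, 1), (1, 2, 2), (0, 3, 5), (2, 3, 1)], 0)
```

The sequential block scans are what let such designs saturate the scratchpad's high bandwidth despite its small capacity.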
The paper proposes a solution to the problem of load balancing and efficient use of distributed-system resources. The proposed method is based on calculating the load on the central processor, memory, and bandwidth of fractal information streams of different service classes, for each server and for the distributed system as a whole. The method allows calculating the imbalance of all...
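The per-server load and imbalance calculation can be sketched roughly as follows (an assumed formulation for illustration only; the weights and the max-over-mean imbalance metric are not taken from the paper):

```python
# Assumed weighting of the three resources named in the abstract.
WEIGHTS = {"cpu": 0.5, "mem": 0.3, "bw": 0.2}


def server_load(metrics):
    """Scalar load: weighted mix of CPU, memory and bandwidth use."""
    return sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)


def imbalance(servers):
    """Illustrative imbalance metric: max server load over mean load.

    1.0 means perfectly balanced; larger values mean one server
    carries proportionally more than its fair share.
    """
    loads = [server_load(m) for m in servers]
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


ratio = imbalance([
    {"cpu": 0.9, "mem": 0.6, "bw": 0.4},
    {"cpu": 0.2, "mem": 0.3, "bw": 0.1},
])
```

A balancer would migrate work away from the heaviest server whenever this ratio exceeds a chosen threshold.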