PHP is the dominant server-side scripting language used to implement dynamic web content. Just-in-time compilation, as implemented in Facebook's state-of-the-art HipHopVM, helps mitigate the poor performance of PHP, but substantial overheads remain, especially for realistic, large-scale PHP applications. This paper analyzes such applications and shows that there is little opportunity for conventional...
In today's high-performance computing (HPC) environments, analyzing and predicting the performance of multiple-processor systems (cluster cores) on critical workloads remains a challenge. This is a result of the many key metrics that influence a system's behavior. Bursty arrivals in HPC systems demand either a shared-memory parallel architecture or a pipelined dataflow architecture. At present, a processor model...
Due to technology scaling, which means smaller transistors, lower voltages and more aggressive clock frequencies, VLSI devices are becoming more susceptible to soft errors. Especially for devices deployed in safety- and mission-critical applications, dependability and reliability are becoming increasingly important constraints during the development of systems built on or around them. Other phenomena...
Semiconductor design houses are increasingly dependent on third-party vendors to procure intellectual property (IP) and meet time-to-market constraints. However, these third-party IPs cannot be trusted, as hardware Trojans can be maliciously inserted into them by untrusted vendors. While different approaches have been proposed to detect Trojans in third-party IPs, their limitations have not...
The HPC interconnect is a crucial component of any HPC machine, and its performance is one of the contributing factors to the overall performance of an HPC system. The most popular interface for connecting a Network Interface Card (NIC) to the CPU is PCI Express (PCIe). With denser core counts in compute servers and increasingly mature fabric interconnect speeds, there is a need to maximize the packet data movement...
As the memory and storage hierarchy gets deeper and more complex, it is important to have new benchmarks and evaluation tools that allow us to explore the emerging middleware solutions that use this hierarchy. Skel is a tool aimed at automating and refining the process of studying HPC I/O performance. It works by generating application I/O kernels/benchmarks as determined by a domain-specific model....
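As a rough illustration of the generate-from-model idea, the sketch below emits a runnable I/O benchmark from a small dictionary describing an application's write pattern; the model schema and all names are hypothetical, not Skel's actual format.

    import textwrap

    model = {
        "app": "demo_app",        # output file prefix (hypothetical)
        "block_bytes": 1 << 20,   # bytes written per step
        "steps": 10,              # number of output timesteps
    }

    def generate_kernel(m):
        """Emit Python source for an I/O benchmark matching the model."""
        return textwrap.dedent(f"""\
            import os, time
            data = os.urandom({m['block_bytes']})
            start = time.perf_counter()
            for step in range({m['steps']}):
                with open(f"{m['app']}_step{{step}}.out", "wb") as out:
                    out.write(data)
                    out.flush()
                    os.fsync(out.fileno())
            elapsed = time.perf_counter() - start
            mb = {m['steps'] * m['block_bytes']} / 1e6
            print(f"wrote {{mb:.1f}} MB in {{elapsed:.3f}} s")
        """)

    # The generated source can be written to a file or exec()'d directly.
    print(generate_kernel(model))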
Power-aware scheduling has become a critical research thrust for deploying exascale High Performance Computing (HPC) systems with a limited power budget. Time-varying pricing of electricity with respect to market demand, together with dynamic HPC workloads, can lead to unpredictable operational costs, which complicates scheduling decisions further. For an oversubscribed HPC system, value-based scheduling...
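To make the value-based idea concrete, here is a minimal greedy sketch, assuming jobs carry a value and a power draw and that electricity has a single current price; it is not the paper's algorithm, only an illustration of value-driven job selection under a power budget.

    # Greedily favor jobs with the highest value per unit of energy cost.
    def schedule(jobs, power_budget_kw, price_per_kwh):
        """jobs: list of dicts with 'name', 'value', 'power_kw', 'hours'."""
        def value_density(job):
            energy_cost = job["power_kw"] * job["hours"] * price_per_kwh
            return job["value"] / energy_cost if energy_cost > 0 else float("inf")

        selected, used_kw = [], 0.0
        for job in sorted(jobs, key=value_density, reverse=True):
            if used_kw + job["power_kw"] <= power_budget_kw:
                selected.append(job["name"])
                used_kw += job["power_kw"]
        return selected

    jobs = [
        {"name": "sim_A", "value": 100, "power_kw": 40, "hours": 2},
        {"name": "sim_B", "value": 60,  "power_kw": 10, "hours": 1},
        {"name": "sim_C", "value": 90,  "power_kw": 30, "hours": 4},
    ]
    print(schedule(jobs, power_budget_kw=60, price_per_kwh=0.12))
    # -> ['sim_B', 'sim_A']: sim_C has the worst value per energy dollar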
Lightweight block ciphers are an important topic of research in the context of the Internet of Things (IoT). Current cryptographic contests and standardization efforts seek to benchmark lightweight ciphers in both hardware and software. Although there have been several benchmarking studies of both hardware and software implementations of lightweight ciphers, direct comparison of hardware and software...
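For the software side of such a comparison, a minimal benchmarking harness might time a reference implementation of a lightweight cipher. The sketch below uses Speck64/128 (round function and key schedule per Beaulieu et al.); the timing loop and block count are illustrative.

    import time

    MASK = 0xFFFFFFFF   # Speck64/128: 32-bit words, 27 rounds, 128-bit key
    ROUNDS = 27

    def ror(x, r): return ((x >> r) | (x << (32 - r))) & MASK
    def rol(x, r): return ((x << r) | (x >> (32 - r))) & MASK

    def expand_key(k0, l0, l1, l2):
        """Derive the 27 round keys from the four 32-bit key words."""
        ks, l = [k0], [l0, l1, l2]
        for i in range(ROUNDS - 1):
            l.append(((ks[i] + ror(l[i], 8)) & MASK) ^ i)
            ks.append(rol(ks[i], 3) ^ l[-1])
        return ks

    def encrypt(x, y, ks):
        """Encrypt one 64-bit block given as two 32-bit words."""
        for k in ks:
            x = ((ror(x, 8) + y) & MASK) ^ k
            y = rol(y, 3) ^ x
        return x, y

    ks = expand_key(0x03020100, 0x0B0A0908, 0x13121110, 0x1B1A1918)
    blocks = 10_000
    start = time.perf_counter()
    for i in range(blocks):
        encrypt(i & MASK, 0x7475432D, ks)
    elapsed = time.perf_counter() - start
    print(f"{blocks / elapsed:,.0f} blocks/s ({8 * blocks / elapsed / 1e6:.2f} MB/s)")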
Reinforcement Learning (RL) is an area of machine learning in which an agent interacts with an environment by making sequential decisions. The agent receives rewards from the environment and seeks an optimal policy that maximises the cumulative reward. Trust Region Policy Optimisation (TRPO) is a recent policy optimisation algorithm that achieves superior results on various RL benchmarks, but is computationally...
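For context, TRPO's update is commonly written as a constrained surrogate optimisation (Schulman et al., 2015); enforcing the KL trust region below, typically via conjugate gradient on the Fisher information matrix, is the computationally demanding part:

    \max_{\theta} \;\; \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
        \left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}
        \, A^{\pi_{\theta_{\text{old}}}}(s, a) \right]
    \quad \text{s.t.} \quad
    \mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
        \left[ D_{\mathrm{KL}}\!\big( \pi_{\theta_{\text{old}}}(\cdot \mid s)
        \,\big\|\, \pi_{\theta}(\cdot \mid s) \big) \right] \le \delta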
Hardware design is an essential part of research in high performance computing. Initial efforts in hardware research consist of analyzing design ideas in a software simulator. This allows chip designers to minimize the amount of manufacturing, which would be too costly, and to avoid doing FPGA designs, which are even more time-consuming. Simulating a hardware design involves running many tests that try...
This paper explores the use of hardware sandboxes, conceptually similar to software sandboxes, for the secure integration of non-trusted IPs in system-on-chip (SoC) designs. The goal of the hardware sandbox is to allow only permissible interactions between the IP and the rest of the system. The hardware sandbox design achieves this by exposing the IP interface to isolated virtual resources and checking...
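As a software analogy of the checking step, the sketch below wraps a non-trusted component behind an allow-list of permissible operations; the transaction format and all names are illustrative, not the paper's design.

    # Allow-list of (operation, permitted address range) pairs -- illustrative.
    ALLOWED = [
        ("read",  range(0x4000, 0x5000)),
        ("write", range(0x4000, 0x4800)),
    ]

    def sandboxed_access(op, addr, forward):
        """Forward a transaction from the non-trusted IP only if permitted."""
        for allowed_op, addr_range in ALLOWED:
            if op == allowed_op and addr in addr_range:
                return forward(op, addr)
        raise PermissionError(f"blocked illegal {op} at {addr:#06x}")

    # Stand-in for the system side; a hardware sandbox would instead route
    # the transaction to isolated virtual resources.
    print(sandboxed_access("read", 0x4100, lambda op, a: f"{op}@{a:#06x} ok"))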
Knowledge of power consumption at a subsystem level can facilitate adaptive energy-saving techniques such as power gating, runtime task mapping and dynamic voltage and/or frequency scaling. While we have the ability to attribute power to an arbitrary hardware system's modules in real time, the selection of the particular signals to monitor for the purpose of power estimation within any given module...
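One common formulation of the signal-selection problem fits a linear model from monitored signal activity to measured power; the sketch below shows such a fit with made-up toggle counts, as a conceptual illustration only.

    import numpy as np

    # Rows: observation windows; columns: toggle counts of monitored signals.
    toggles = np.array([
        [120,  40, 300],
        [ 80,  35, 250],
        [200,  90, 410],
        [150,  60, 330],
    ], dtype=float)
    measured_mw = np.array([52.0, 41.5, 88.0, 64.0])   # invented measurements

    # Least-squares fit: power ~ toggles @ weights + baseline.
    X = np.hstack([toggles, np.ones((len(toggles), 1))])
    weights, *_ = np.linalg.lstsq(X, measured_mw, rcond=None)

    # Estimate power for a new window from its signal activity alone.
    new_window = np.array([170, 70, 360, 1.0])
    print(f"estimated power: {new_window @ weights:.1f} mW")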
Field-Programmable Gate Arrays (FPGAs) have been gaining considerable momentum in mainstream high-performance systems in recent years due to their flexibility and low power consumption. Still, FPGAs remain largely unavailable to software programmers due to programming and debugging difficulties inherent to standard Hardware Description Languages. The performance that hardware-oblivious software...
The performance of commodity video-gaming embedded devices (consoles, graphics cards, tablets, etc.) has been advancing at a rapid pace owing to strong consumer demand and stiff market competition. Gaming devices are currently amongst the most powerful and cost-effective computational technologies available in quantity. In this article, we evaluate a sample of current generation video-gaming devices...
General-purpose workloads running on modern graphics processing units (GPGPUs) rely on hardware-based barriers to synchronize warps within a thread block (TB). However, imbalance may exist before reaching a barrier if a GPGPU workload contains irregular memory accesses, i.e., some warps may be critical while others may not. Ideally, cache space should be reserved for the critical warps. Unfortunately,...
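As a toy illustration of the reservation idea, the sketch below marks the warps furthest from the barrier as critical and gives them a larger share of cache ways; the policy and all numbers are invented for illustration, not the paper's mechanism.

    def partition_ways(progress, total_ways=16, critical_share=0.75):
        """progress: dict warp_id -> instructions completed before the barrier."""
        slowest = min(progress.values())
        critical = [w for w, p in progress.items() if p == slowest]
        others = [w for w in progress if w not in critical]
        # Reserve most ways for critical warps, split the rest evenly.
        crit_ways = max(1, int(total_ways * critical_share)) // max(1, len(critical))
        rest_ways = (total_ways - crit_ways * len(critical)) // max(1, len(others))
        return {w: (crit_ways if w in critical else rest_ways) for w in progress}

    # Warps 1 and 2 lag behind the barrier, so they get 6 ways each.
    print(partition_ways({0: 90, 1: 40, 2: 40, 3: 85}))
    # -> {0: 2, 1: 6, 2: 6, 3: 2}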
Early design-space evaluation of computer systems is usually performed using performance models such as detailed simulators, RTL-based models, etc. Unfortunately, it is very challenging (often impossible) to run many emerging applications on detailed performance models owing to their complex application software stacks, significantly long run times, system dependencies and the limited speed/potential...
With the NVIDIA Tegra Jetson X1 and Pascal P100 GPUs, NVIDIA introduced hardware-based computation on FP16 numbers, also called half-precision arithmetic. In this talk, we will introduce the steps required to build a viable benchmark for this new arithmetic format. This will include the connections to established IEEE floating-point standards and existing HPC benchmarks. The discussion will focus on performance...
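A first step in any such benchmark is characterising the format itself; the snippet below inspects IEEE 754 binary16 via NumPy and shows the rounding behaviour an FP16 benchmark has to account for.

    import numpy as np

    info = np.finfo(np.float16)
    print(f"bits: {info.bits}, max: {info.max}, eps: {info.eps}")
    # bits: 16, max: 65504.0, eps: 0.000977

    # Rounding effects appear quickly: adding half an eps to 1.0 is lost.
    one = np.float16(1.0)
    print(one + np.float16(info.eps) / 2 == one)   # True: the update vanishes

    # Accumulating in FP16 loses accuracy; a common remedy (also used on
    # FP16-capable GPUs) is accumulating in FP32.
    xs = np.full(10_000, 0.1, dtype=np.float16)
    print(xs.sum(dtype=np.float16), xs.sum(dtype=np.float32))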
The work-queue is an effective approach for mapping irregular-parallel workloads to GPGPUs. It can improve the utilization of SIMD units by processing only useful work that is dynamically generated during execution. As current GPGPUs lack the necessary support for work-queues, software-based work-queue implementations often suffer from memory contention and load-balancing issues. We present a novel...
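The pattern is easiest to see on the CPU: the sketch below drains a queue whose items generate new work as they are processed (a hypothetical frontier-expansion workload); on a GPGPU the loop would run in parallel across warps, with enqueue/dequeue built on atomic counters.

    from collections import deque

    graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}

    def process_with_work_queue(start):
        queue, seen = deque([start]), {start}
        processed = []
        while queue:
            node = queue.popleft()       # dequeue one unit of useful work
            processed.append(node)
            for nbr in graph[node]:      # processing an item generates new work
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        return processed

    print(process_with_work_queue(0))  # [0, 1, 2, 3, 4, 5]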
Big data has exacerbated the so-called "memory wall" problem. Studying the memory characteristics of big data applications has become an important issue in the high-end computing community. In this paper, we propose a trace-based method built on the trace files generated by simulators, which captures memory access information at different levels of the memory hierarchy and aggregates that information to get memory...
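In the same spirit, a trace aggregator can be sketched in a few lines, assuming a hypothetical "<level> <R|W> <hex-address>" trace format; the records below are invented.

    from collections import Counter, defaultdict
    import io

    trace = io.StringIO("""\
    L1 R 0x7f001000
    L1 W 0x7f001040
    L2 R 0x7f001000
    DRAM R 0x7f002000
    L1 R 0x7f001080
    """)

    # Aggregate read/write counts per memory-hierarchy level.
    accesses = defaultdict(Counter)
    for line in trace:
        level, op, addr = line.split()
        accesses[level][op] += 1

    for level, ops in accesses.items():
        total = sum(ops.values())
        print(f"{level:5s} total={total:3d} reads={ops['R']} writes={ops['W']}")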
Non-volatile memory (NVM) is expected to enrich next-generation computer systems. However, designers have difficulty exploring new software and hardware design ideas based on NVM due to the limitations of current simulation-based evaluation, e.g., slow runtime. To resolve this problem, we present an open, reliable, and versatile hardware platform for NVM emulation. We built a Zynq FPGA-based...
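The core emulation idea, independent of the FPGA implementation, is to add configurable latency to an ordinary memory; the sketch below models that in software, with made-up delay values standing in for NVM timing parameters.

    import time

    class EmulatedNVM:
        def __init__(self, size, read_ns=300, write_ns=1000):
            self.mem = bytearray(size)       # DRAM standing in for NVM
            self.read_s = read_ns * 1e-9
            self.write_s = write_ns * 1e-9

        def _stall(self, seconds):
            end = time.perf_counter() + seconds
            while time.perf_counter() < end:   # busy-wait for sub-microsecond delays
                pass

        def read(self, addr, n):
            self._stall(self.read_s)
            return bytes(self.mem[addr:addr + n])

        def write(self, addr, data):
            self._stall(self.write_s)          # writes slower than reads, as in NVM
            self.mem[addr:addr + len(data)] = data

    nvm = EmulatedNVM(4096)
    nvm.write(0, b"hello")
    print(nvm.read(0, 5))  # b'hello'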