Distributed cyber-physical systems cover a wide range of applications such as automotive, avionics, and industrial automation. These applications require a global notion of time to fulfill their timing requirements. Multi-processor systems-on-chip (MPSoCs) are an attractive implementation option since they offer several benefits such as parallelism and power efficiency. However, MPSoCs have a Globally...
This paper describes and analyses a novel method to improve the parallel performance of solving sparse triangular systems (spTRSV). The main objective of this study is to reduce both the total idle time of processors and the overall execution time. The developed solution is also suitable for sparse and banded structures. To evaluate and validate our contribution, a series of experiments has been...
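The snippet does not detail the paper's method, but the standard way to expose parallelism in spTRSV is level-set scheduling: rows whose dependencies are all in earlier levels can be solved concurrently. A minimal sketch under that assumption (all names hypothetical, not the paper's algorithm):

```python
def level_sets(deps, n):
    """Group the rows of a lower-triangular system into levels: every row
    in a level depends only on rows from earlier levels, so all rows within
    one level can be solved in parallel."""
    level = [0] * n
    for i in range(n):                # dependencies deps[i] all have j < i
        for j in deps[i]:
            level[i] = max(level[i], level[j] + 1)
    buckets = {}
    for i, lv in enumerate(level):
        buckets.setdefault(lv, []).append(i)
    return [buckets[lv] for lv in sorted(buckets)]

def sptrsv(off_diag, diag, b):
    """Solve L x = b for lower-triangular L, given as strictly-lower entries
    off_diag[i] = [(j, a_ij), ...] with j < i, plus the diagonal diag."""
    n = len(b)
    deps = [[j for j, _ in off_diag[i]] for i in range(n)]
    x = [0.0] * n
    for lvl in level_sets(deps, n):
        for i in lvl:                 # independent rows: could be a parallel loop
            x[i] = (b[i] - sum(a * x[j] for j, a in off_diag[i])) / diag[i]
    return x
```

The idle time the abstract targets shows up here as imbalance between levels; reducing it means keeping every processor busy across level boundaries.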
This paper presents an open-source task scheduling simulator, called MCRTsim, for real-time systems with uniprocessors, multiprocessors, and multi-core processors. It contains a task set generator, a set of real-time schedulers and synchronization protocols, and a comprehensive set of tools, including a visualized execution tracer, a schedulability analyzer, and measurement and statistics modules. Therefore,...
Work partitioning is a key challenge with applications in many scientific and technological fields. The problem is well studied, with a rich literature on both distributed and parallel computing architectures. In this paper we address the work partitioning problem for parallel and distributed agent-based simulations, which aims at (i) balancing the overall load distribution, (ii) minimizing,...
The increase in speed and capacity of FPGAs has outpaced the development of effective design tools to fully utilize them, and routing of nets remains one of the most time-consuming stages of the FPGA design flow. While existing works have proposed methods of accelerating routing through parallelization, they are limited by the memory architecture of the system that they target. In this paper, we...
Multicore architectures can provide high, predictable performance through parallel processing. Unfortunately, makespan estimates for parallel applications are overly pessimistic, either due to load-imbalance issues plaguing static scheduling methods or due to timing anomalies plaguing dynamic scheduling methods. This paper contributes an anomaly-free dynamic scheduling method, called Lazy, which...
This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs). Previously available benchmarks for multiprocessors have focused on high-performance computing applications and used a limited number of synchronization methods. PARSEC includes emerging applications in recognition, mining and...
Shared-memory languages and systems provide strong guarantees only for well-synchronized (data-race-free) programs. Prior work introduces support for memory consistency based on region serializability of executing code regions, but all approaches incur serious limitations such as adding high run-time overhead or relying on complex custom hardware. This paper explores the potential for leveraging widely...
In the real world, many problems on massive graphs can be mapped to an underlying critical problem of discovering top-k subgraphs. For massive graphs, subgraph queries may have an enormous number of matches, so it is inefficient to compute all matches when only the top-k matches are desired. Meanwhile, parallel algorithms are essential for the scalability of massive-graph computing. In this paper, we address...
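The reason computing all matches is wasteful can be seen in the generic top-k pattern: a bounded min-heap retains only the k best results seen so far, so any match that cannot beat the current k-th best is discarded immediately. A sketch of that pattern (a generic illustration, not the paper's algorithm):

```python
import heapq

def top_k(matches, k, score):
    """Keep only the k highest-scoring items from a stream of matches.
    The min-heap root is the current k-th best score: anything weaker
    is dropped without further work."""
    heap = []  # (score, item) pairs, smallest score at the root
    for item in matches:
        s = score(item)
        if len(heap) < k:
            heapq.heappush(heap, (s, item))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, item))  # evict the weakest
    return sorted(heap, reverse=True)
```

In a parallel setting, each worker can maintain its own size-k heap over its partition and the k-way results are merged at the end, which keeps inter-worker synchronization minimal.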
Speedup in high-performance computing is usually bounded by two main laws, namely Amdahl's and Gustafson's. However, the speedup can sometimes reach far beyond this linear limit, a phenomenon known as superlinear speedup, in which the speedup is greater than the number of processors used. Although superlinear speedup is not a new concept and many authors have already...
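For reference, the two bounds the snippet names can be written down explicitly, with $f$ the parallelizable fraction of the work and $p$ the number of processors:

```latex
% Amdahl's law: fixed problem size, speedup capped by the serial fraction
S_{\mathrm{Amdahl}}(p) = \frac{1}{(1 - f) + f/p} \;\le\; \frac{1}{1 - f}

% Gustafson's law: problem size scales with p, speedup grows linearly
S_{\mathrm{Gustafson}}(p) = (1 - f) + f\,p
```

Superlinear speedup, $S(p) > p$, does not contradict either formula's model of computation; it typically arises from effects outside the model, most commonly that the per-processor working set shrinks with $p$ and begins to fit in faster levels of the memory hierarchy.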
PWCS (Probabilistic Write/Copy-Select) is a new lock-free synchronization mechanism with wait-free characteristics, proposed by Nicholas Mc Guire at the 13th Real-Time Linux Workshop, which exploits the inherent randomness of modern computer systems. It aims at addressing the multi-reader/single-writer problem in Linux. Based on the original label-based PWCS, we propose a hash-based...
Routing of nets is one of the most time-consuming steps in the FPGA design flow. While existing works have described ways of accelerating the process through parallelization, they are not scalable. In this paper, we propose ParaFRo, a two-phase hybrid parallel FPGA router using fine-grained synchronization and partitioning. The first phase of the router aims to exploit the maximum parallelism available...
Multi-core processors are now commonplace in the form of dual-core and quad-core processors. To take advantage of multiple cores, parallel programs must be written. Existing legacy applications are sequential and, when run on multi-core machines, utilize only one core. Such applications must be either rewritten or parallelized to make efficient use of multiple cores. Manual parallelization requires huge efforts...
This paper presents a novel relaxed synchronization strategy for generic numerical algorithms executed in distributed and parallel computing systems. Large problems are efficiently solved if they can be parallelized. However, as the number of processing elements increases, the communication, necessary to synchronize intermediate computation across processing elements, increases and soon becomes a...
We present a three-step binding algorithm for applications in the form of directed acyclic graphs (DAGs) of tasks with deadlines, that need to be bound to a shared memory multiprocessor platform. The aim of the algorithm is to obtain a good binding that results in low makespans of the schedules of the DAGs. It first clusters tasks assuming unlimited resources using a deadline-aware shared memory extension...
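The binding quality criterion above is makespan; a standard lower bound on a task DAG's makespan (with unlimited resources) is its critical-path length, which is what deadline-aware clustering implicitly works against. A minimal sketch of that bound (names hypothetical, not the paper's three-step algorithm):

```python
def critical_path(duration, preds):
    """Critical-path length of a task DAG: the makespan lower bound with
    unlimited processors. duration maps task -> execution time; preds maps
    task -> list of predecessor tasks."""
    finish = {}  # memoized earliest finish time per task

    def ef(t):
        if t not in finish:
            # a task may start only once all of its predecessors have finished
            finish[t] = duration[t] + max((ef(p) for p in preds[t]), default=0.0)
        return finish[t]

    return max(ef(t) for t in duration)
```

Any binding to a finite shared-memory platform can only lengthen the schedule beyond this bound, so a binding algorithm is judged by how close its resulting makespans stay to it.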
FPGAs have grown considerably in recent years. It is now possible to implement several soft-core processors in a single FPGA, which enables considerable parallelism for the developer. Unfortunately, most application code is still available only in sequential form. Thus, in this contribution we present a tool that enables the automated transformation of an application into a streaming pipeline using...
The quest for more performance frequently finds interesting answers in unconventional computing. IPNoSys is a parallel processing platform for packet-based applications. Its hardware architecture is based on a network-on-chip (NoC) structure, and its applications are executed while packets are routed through the NoC. This paper presents a new architecture for the IPNoSys programming model. IPNoSys...
Utilizing Hardware Accelerators (ACCs) is a promising solution to improve the performance and power efficiency of Chip Multi-Processors (CMPs). However, new challenges arise with the trend of shifting from few ACCs (sparse ACC coverage) to many ACCs (dense ACC coverage) on a chip. The primary challenges are a lack of clear semantics in ACC communication as well as a processor-centric view for...
A CPU module is composed of networks connecting many multiprocessors, with parallel processing carried out among these processors. The most important element of a VLSI multiprocessor system is its interconnection network, and the key challenge is to carry out communication among many nodes reliably while ensuring scalability. In this paper, we introduce a parallel architecture...