Dataflow models of computation were acknowledged early on as an attractive methodology for describing parallel algorithms; hence they have become highly relevant for programming in the current multicore processor era. While several frameworks provide tools to create dataflow descriptions of algorithms, generating parallel code for programmable processors is still sub-optimal due to the scheduling...
Localization of wireless networks has remained an active research topic for more than a decade, as it finds more and more applications in numerous scenarios, including environment surveillance, asset tracking, and healthcare monitoring. In this paper, we consider the maximum likelihood localization of a network where some of the nodes are GPS-capable while the others attempt to achieve self localization...
The ForkJoin framework is a widely used parallel programming framework upon which both core concurrency libraries and real-world applications are built. Beneath its simple and user-friendly APIs, ForkJoin is a sophisticated managed parallel runtime unfamiliar to many application programmers: the framework core is a work-stealing scheduler, handles fine-grained tasks, and sustains the pressure from automatic...
We propose a novel approach for visualizing reverse-engineered Unified Modeling Language (UML) diagrams (class, object, and sequence) to improve Object-Oriented Program (OOP) comprehension on a web-based programming environment, JaguarCode. It aims to help students better understand static structure and dynamic behavior of Java programs and object-oriented programming concepts. This paper presents...
While GPUs are becoming common in HPC systems, the CPU is still responsible for managing both GPU-side and CPU-side compute, communication, and synchronization operations. For instance, if a result from a GPU-side computation is to be transferred to a remote destination, then the CPU must synchronize on GPU compute completion before issuing a communication operation. Both CPU cycles and energy are consumed...
Transactional Memory (TM) promises both to provide a scalable mechanism for synchronization in concurrent programs, and to offer ease-of-use benefits to programmers. The most straightforward use of TM in real-world programs is in the form of Transactional Lock Elision (TLE). In TLE, critical sections are attempted as transactions, with a fall-back to a lock if conflicts manifest. Thus TLE expects...
The emergence of new types of high performance hardware also drives the need for new programming models. The Open Community Runtime (OCR) proposal uses a task-based programming model to target some of these architectures. In OCR, the whole program from start to end needs to be expressed using tasks and synchronized using task-to-task dependences, significantly limiting the applicability and usefulness...
The bidirectional model transformation (BX) comprises a forward transformation get and a backward transformation put. Given that get may be an information-loss transformation, the behavior of put may be uncertain. An uncertain put produces many valid outputs that fit different application scenarios. This paper proposes an approach to variability management in BX to enable put to generate an output...
Shared memory and message passing are traditional parallel programming models used in multiprocessor system-on-chip environments. The underlying models are traditionally meant for static scenarios where all communicating entities and their intercommunication patterns are known a priori by the software engineer. System design following such programming models has become complex due to dynamic behavior...
In this paper, we present a novel and simple circuit for accurately programming memristors in both an incremental and a decremental fashion. One of the main constituting blocks of the circuit is an inverting voltage amplifier block within which the memristor forms a gain stage with a reference resistor. Memristor resistance modulation is achieved by means of auto-tuning operational amplifier's gain...
Clusters equipped with accelerators such as graphics processing unit (GPU) and Many Integrated Core (MIC) are widely used. For such clusters, programmers write programs for their applications by combining MPI with one of the available accelerator programming models. In particular, OpenACC enables programmers to develop their applications easily, but with lower productivity owing to complex MPI programming...
In this paper, we provide a comparison of language features and runtime systems of commonly used threading parallel programming models for high performance computing, including OpenMP, Intel Cilk Plus, Intel TBB, OpenACC, Nvidia CUDA, OpenCL, C++11 and PThreads. We then report our performance comparison of OpenMP, Cilk Plus and C++11 for data and task parallelism on CPU using benchmarks. The results show...
The tasking model of OpenMP 4.0 supports both nesting and the definition of dependences between sibling tasks. A natural way to parallelize many codes with tasks is to first taskify the high-level functions and then to further refine these tasks with additional subtasks. However, this top-down approach has some drawbacks since combining nesting with dependencies usually requires additional measures...
Previous techniques for concurrency testing have mainly focused on exploring the interleaving space of manually written test code to expose faulty interleavings of shared memory accesses. These techniques assume the availability of failure-inducing tests. In this paper, we present AutoConTest, a coverage-driven approach to generate effective concurrent test code that achieves high interleaving coverage...
Software interfaces today generally fall at either end of a spectrum. On one end are programmable systems, which allow expert users (i.e. programmers) to write software artifacts that describe complex abstractions, but programs are disconnected from their eventual output. On the other end are domain-specific graphical user interfaces (GUIs), which allow end users (i.e. non-programmers) to easily create...
In self-adaptive systems, an adaptation strategy can apply to several implementations of a target system. Reusing this strategy requires models of the target system that are independent of its implementation. In particular, configuration files must be transformed into abstract configurations, but correctly synchronizing these two representations is not trivial. We propose an approach that uses putback-based...
A framework to integrate different artificial intelligence and machine learning algorithms is combined with an execution framework to create a powerful cloud computing system development platform. By providing an execution framework and control software that is native to cloud architectures and supports interactivity and time synchronization, the true utility of cloud computing and "big data...
MapReduce is an important programming model for processing in distributed environments. Compared to other distributed programming models, MapReduce reduces communication overheads between computers and improves fault tolerance. However, the MapReduce model does not allow for automatic synchronization between jobs. A large number of data analytics algorithms use a recursive divide-and-conquer approach,...
Graph mining is widely used in fields like social network analysis. Synchronous vertex-centric frameworks strike a better balance between performance and ease of use, so they are widely used in practice. However, traditional architectures of this type, like the vertex-based push architecture and GAS, are encumbered by high communication costs. In this paper we propose a new replica-based push...
Nearest-neighbor communication is one of the most important communication patterns appearing in many scientific applications. In this paper, we discuss the results of applying UPC++, a library-based partitioned global address space (PGAS) programming extension to C++, to an adaptive mesh framework (BoxLib), and a full scientific application GTC-P, whose communications are dominated by the nearest-neighbor...