In this paper we introduce a novel, dense, system-on-chip many-core Lenovo NeXtScale System® server based on the Cavium THUNDERX® ARMv8 processor that was designed for performance, energy efficiency and programmability. The THUNDERX processor was designed to scale up to 96 cores in a cache-coherent, shared-memory architecture. Furthermore, this hardware system has a power interface board (PIB) that measures...
With the widespread adoption of virtualization, intrusion detection systems (IDSes) are increasingly being deployed in virtualized environments. When securing an environment, IT security officers are often faced with the question of how accurate deployed IDSes are at detecting attacks. To this end, metrics for assessing the attack detection accuracy of IDSes have been developed. However, these metrics...
Because the SIFT (Scale Invariant Feature Transform) algorithm is computationally intensive and highly parallelizable, we propose an optimized FPGA implementation to accelerate it in hardware. In this method, we first simplify the image filtering and Gaussian pyramid generation by selecting appropriate parameters and hardware structure,...
A common approach for improving application performance is to process its working set from memory. For datasets that do not fit into DRAM of a single machine this leads to a design of scale-out applications, where the application dataset is partitioned and processed by a cluster of machines. Performance of distributed memory applications, implemented using MPI (Message Passing Interface), inherently...
This paper examines a generalized version of Preemptive RANSAC for visual motion estimation. The approach described employs the BRUMA function for dealing with varying block sizes and the percentages of hypotheses to be removed during the hypotheses rejection phase. The generation of a flexible number of hypotheses is also performed in order to balance the preemption scheme. Experiments were performed...
Distributed cyber-physical systems (CPS) are increasingly provided with an accurate and precise common sense of time using a variety of well established time distribution methods such as GNSS, IEEE 1588 and others. Less attention has been paid to the effective use of time in CPS and techniques and components to support such use. This paper reviews these topics and discusses critical components that...
Many modern applications (such as multimedia processing, machine learning, and big-data analytics) exhibit an inherent tradeoff between performance and the accuracy of the produced results. These applications allow us to investigate new, more aggressive program optimizations. We present a novel approximate optimization framework based on accuracy-aware program transformations. These transformations...
Deep neural networks (DNNs) have recently proved their effectiveness in complex data analyses such as object/speech recognition. As their applications are being expanded to mobile devices, their energy efficiencies are becoming critical. In this paper, we propose a novel concept called big/LITTLE DNN (BL-DNN) which significantly reduces energy consumption required for DNN execution at a negligible...
The inherent nondeterminism present in reduction operations on an exascale system, coupled with the nonassociativity of floating-point arithmetic, makes achieving reproducible results difficult or impossible. Work investigating the irreproducibility phenomenon has generally proceeded along one of two veins: (1) development of algorithms that produce reproducible numerical results irrespective of nondeterminism...
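The nonassociativity mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `reduce_in_order` and the input values are hypothetical, chosen only to show that summation order changes a floating-point result, and that an exactly rounded sum (here Python's `math.fsum`) does not depend on the order.

```python
import math

def reduce_in_order(values, order):
    """Sum `values` left to right in the given index order (hypothetical helper)."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Illustrative inputs: two large terms that cancel, plus two small terms.
values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 before the cancellation.
sum_a = reduce_in_order(values, [0, 1, 2, 3])
# Order B: the large terms cancel first, so both small terms survive.
sum_b = reduce_in_order(values, [0, 2, 1, 3])

# math.fsum tracks exact partial sums, so its result is order-independent.
sum_exact = math.fsum(values)

print(sum_a, sum_b, sum_exact)
```

In a parallel reduction the order of partial sums can vary from run to run, which is how the same input can produce different results.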
PHIL simulations introduce various time delays. Although the delay can be reduced, some always remains. As a consequence, when the device under test is part of a low-impedance power system, such as a microgrid or a marine or aero power system, the simulation becomes challenging because the time delay degrades the accuracy of the results. Therefore, in...
Due to the increase in network attacks, anomaly detection has gained importance. In this paper, we present and investigate the idea of institutions cooperating for performing anomaly detection, i.e. institutions jointly analyzing their network traffic, in order to identify malicious attacks, using classification-based machine learning techniques. We compare the results of such a collaborative analysis...
As process technology scales, electronic devices become more susceptible to transient faults induced by radiation. Symptom-based detection techniques provide promising low-cost and effective solutions, but can hardly catch faults that produce silent data corruptions (SDCs). Identifying and understanding instructions that cause SDCs is crucial to the development of program-level detectors. This paper...
This paper presents a hardware constraints analysis of the Gabor filtering operation for its hardware implementation in a real-time Facial Expression Recognition System (FERS). The Gabor filter is the most common feature extractor employed in such systems. Feature extraction using the Gabor filter is efficient and has better discrimination capability. In this work, we have employed software-based...
While sub/near-threshold design offers minimal power and energy consumption, such an approach strongly degrades circuit performance and robustness against PVT (process/voltage/temperature) variations, leading to severe speed penalties and large silicon areas. Inexact and approximate circuit design can address these issues by trading calculation accuracy for better silicon area, circuit speed...
This paper presents an algorithm and a digital hardware design, inspired by biological spiking neural networks, to perform unsupervised, online spike-clustering with high accuracy and low power consumption in the context of deep-brain sensing and stimulation systems. The proposed hardware contains 1220 digital neurons and 4.86k latch-based synapses, and achieves an average sorting accuracy of 91% whereas...
Software evaluation of elementary functions usually requires three steps: a range reduction, a polynomial evaluation, and a reconstruction step. These evaluation schemes are designed to give the best performance for a given accuracy, which requires a fine control of errors. One of the main issues is to minimize the number of sources of error and/or their influence on the final result. The work presented...
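The three steps named in the abstract above can be sketched for one concrete case. This is an illustration under stated assumptions, not the scheme from the paper: the function `exp_sketch`, the choice of the exponential, and the degree-12 Taylor polynomial are all hypothetical (production libraries typically use minimax polynomials and a more careful range reduction).

```python
import math

LN2 = math.log(2.0)

# Taylor coefficients 1/n! for n = 0..12 (assumed polynomial; illustrative only).
COEFFS = [1.0 / math.factorial(n) for n in range(13)]

def exp_sketch(x):
    """Approximate e**x in three steps (hypothetical helper)."""
    # Step 1: range reduction. Write x = k*ln(2) + r with |r| <= ln(2)/2.
    k = round(x / LN2)
    r = x - k * LN2
    # Step 2: polynomial evaluation of e**r using Horner's rule.
    p = 0.0
    for c in reversed(COEFFS):
        p = p * r + c
    # Step 3: reconstruction. e**x = 2**k * e**r, applied as an exponent shift.
    return math.ldexp(p, k)

print(exp_sketch(1.0), math.exp(1.0))
```

Each step contributes its own error (the rounding of `k * LN2`, the polynomial truncation, and the rounding in Horner's rule), which is why such schemes need the fine control of errors the abstract refers to.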
We present a wearable system that uses ambient electromagnetic interference (EMI) as a signature to identify electronic devices and support proxemic interaction. We designed a low-cost tool, called EMI Spy, and a software environment for rapid deployment and evaluation of ambient EMI-based interactive infrastructure. EMI Spy captures electromagnetic interference and delivers the signal to a user's...
The atan2 function computes the polar angle arctan(y/x) of a point given by its cartesian coordinates. It is widely used in digital signal processing to recover the phase of a signal. This article studies for this context the implementation of atan2 with fixed-point inputs and outputs. It compares the prevalent CORDIC shift-and-add algorithm to two multiplier-based techniques. The first one computes...
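The CORDIC shift-and-add approach named in the abstract above can be sketched in vectoring mode as follows. This is a floating-point simulation under stated assumptions, not the fixed-point design studied in the article: the function name `cordic_atan2` and the iteration count are hypothetical. In hardware, each multiplication by `2**-i` would be a shift and the `atan(2**-i)` values would come from a small precomputed table.

```python
import math

def cordic_atan2(y, x, iterations=32):
    """Approximate atan2(y, x) with vectoring-mode CORDIC (hypothetical helper)."""
    if x == 0 and y == 0:
        return 0.0
    angle = 0.0
    # Pre-rotate by a quarter turn so the vector lies in the right half-plane.
    if x < 0:
        if y >= 0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2
    # Drive y toward zero with micro-rotations of atan(2**-i), accumulating
    # the total rotation angle as we go.
    for i in range(iterations):
        step = 2.0 ** -i
        if y > 0:
            x, y = x + y * step, y - x * step
            angle += math.atan(step)
        else:
            x, y = x - y * step, y + x * step
            angle -= math.atan(step)
    return angle

print(cordic_atan2(1.0, 1.0), math.atan2(1.0, 1.0))
```

The residual angle error after `n` iterations is bounded by roughly `atan(2**-(n-1))`, so the iteration count sets the output precision.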
Inexact and approximate circuit design is a promising approach to improve performance and energy efficiency in technology-scaled and low-power digital systems. Such a strategy is suitable for error-tolerant applications involving perceptive or statistical outputs. This paper reviews two established techniques applicable to arithmetic units: circuit pruning and carry speculation. A critical comparative...
In recent years, Phasor Measurement Unit (PMU) technology has been rapidly evolving towards potential deployment in power distribution systems (DSs) as well. In general, this specific field of applications requires PMUs whose accuracy levels are beyond those required by the IEEE Std. C37.118. Additionally, there is the need to define the architecture of an associated calibration system capable of assessing...