We present an energy model for a graphics processing unit (GPU) that is based on the amount and type of work performed in various parts of the unit. By designing and running directed tests on a GPU, we measure the energy consumed when performing different arithmetic and memory operations, allowing us to accurately predict the energy that any arbitrary mix of operations will take. With some knowledge...
We propose a novel hardware design for decoding compressed floating-point textures in a graphics processing unit (GPU). Our decoder is based on the NXR texture format, which provides lossy, fixed-rate 6:1 compression for floating-point textures. Our design exploits the constraints of the compressed pixel blocks to produce the correct output using only fixed-point arithmetic. This results in significantly...
This paper presents approaches to accelerating pixel-level image fusion using graphics hardware. As new sensors and improved sensing technology make it possible to improve visibility by maximizing the information collected, not only the development of new fusion algorithms but also the speed of the fusion process is becoming increasingly important. Though specialized fusion boards for real...
Despite advances in graphics hardware, graphics memory is still a scarce resource for typical applications. In addition, for most raster-based applications, the available bandwidth is an important limiting factor for increasing system performance. Texture compression addresses both of these problems. We introduce a new technique for compression of textures synthesized from samples. The compressed...
Because of their demand for bandwidth and the limits on device size, mobile GPUs face more challenges than their desktop counterparts when seeking realistic visual quality. To alleviate the severely restricted situation mobile GPUs are in, we propose a new universal buffer compression method that handles both color and depth data with the same hardware unit. While the previous state-of-the-arts...
This case study describes the influence of TFT LCD graphics on automotive radiated emissions testing and its impact on the FM band receiver. The underlying root cause of the EMI issue is determined using a novel technique that decodes the display's graphics into the transmitted RGB data and predicts the data's impact on radiated emissions. The countermeasure implemented to resolve the issue is equally...
A parallel unified processor for graphics and vision is developed. It achieves 371.9 GOPS/W in full operation through a 6-way VLIW datapath, reconfigurable processing elements for graphics and vision mode, and a pixel arranger for data-level parallelism. The pose-estimation engine achieves 0.89 μW/fps for marker-based augmented reality.
The Coons patch is an important element in 3D modeling and simulation. To construct Coons patches within the general GPU framework, a decomposition method for traditional Coons patches was developed. The results show that the runtime of the GPU-based algorithm increases slowly as the number of interpolation points increases.
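For orientation, the standard bilinearly blended Coons patch can be sketched as follows. This is the textbook formulation, not the decomposition method from the abstract, which is not described here. The function name `coons_point` and the boundary-curve callables `c0`, `c1`, `d0`, `d1` are hypothetical names chosen for illustration.

```python
def coons_point(c0, c1, d0, d1, u, v):
    """Evaluate a bilinearly blended Coons patch at (u, v) in [0, 1]^2.

    c0(u), c1(u): boundary curves at v = 0 and v = 1.
    d0(v), d1(v): boundary curves at u = 0 and u = 1.
    Each callable returns a tuple of coordinates; the four curves are
    assumed to meet at the patch corners.
    """
    # Corner points, read from the u-direction boundary curves.
    p00, p10 = c0(0.0), c0(1.0)
    p01, p11 = c1(0.0), c1(1.0)

    # Ruled surface between the two u-direction curves.
    ruled_u = [(1 - v) * a + v * b for a, b in zip(c0(u), c1(u))]
    # Ruled surface between the two v-direction curves.
    ruled_v = [(1 - u) * a + u * b for a, b in zip(d0(v), d1(v))]
    # Bilinear interpolation of the corners, subtracted so that the
    # corner contribution is not counted twice.
    bilinear = [
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    ]
    return tuple(x + y - z for x, y, z in zip(ruled_u, ruled_v, bilinear))


# Example boundaries: a unit square whose v = 0 edge bulges upward in z.
def c0(u): return (u, 0.0, u * (1.0 - u))
def c1(u): return (u, 1.0, 0.0)
def d0(v): return (0.0, v, 0.0)
def d1(v): return (1.0, v, 0.0)

point = coons_point(c0, c1, d0, d1, 0.5, 0.5)
```

Every surface point depends only on the boundary curves and its own (u, v) pair, so the evaluation is independent per point, which is what makes the patch a natural fit for data-parallel hardware.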
We introduce the general pinhole camera (GPC), defined by a center of projection (i.e., the pinhole), an image plane, and a set of sampling locations in the image plane. We demonstrate the advantages of the GPC in the contexts of remote visualization, focus-plus-context visualization, and extreme antialiasing, which benefit from the GPC sampling flexibility. For remote visualization, we describe a...
High assurance MILS and MLS systems require strict limitation of the interactions between different security compartments based on a security policy. Virtualization can be used to provide a high degree of separation in such systems. Even with perfect isolation, however, the I/O devices are shared between different security compartments. Among the I/O controllers, the graphics subsystem is the largest...
A new fast line drawing algorithm that differs from the traditional Bresenham algorithm is presented in this paper. A line is treated as an aggregation of several line segments, and the Y coordinate differences of candidate pixel points in every step of the traditional algorithm are replaced by the length errors of each segment in the new algorithm. Each operation and judgment can generate a line...
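As a point of comparison, the traditional Bresenham algorithm that this abstract refers to can be sketched as below. This is the well-known integer-only baseline, not the segment-based algorithm the paper proposes; the function name `bresenham` is chosen for illustration.

```python
def bresenham(x0, y0, x1, y1):
    """Return the pixels on the line from (x0, y0) to (x1, y1).

    Classic integer-only Bresenham, valid for all octants. One pixel is
    produced per loop iteration, driven by a single error term.
    """
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction in x
    sy = 1 if y0 < y1 else -1   # step direction in y
    err = dx + dy               # combined error term
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error favors a step in x
            err += dy
            x0 += sx
        if e2 <= dx:            # error favors a step in y
            err += dx
            y0 += sy
    return points


line = bresenham(0, 0, 5, 2)
```

The baseline makes one decision per pixel, which is the per-step cost that segment-based approaches aim to reduce.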
In this paper, we propose a novel architecture for a general graphics shader processor without dedicated hardware. Mobile devices now require a high-performance graphics processor together with small size and low power consumption. The proposed shader processor is a GP-GPU (General-Purpose computing on Graphics Processing Units) that executes the whole OpenGL ES 2.0 graphics pipeline by using shader instructions...
Motivated by the challenging questions of today's sinologists, we are developing an automated system for processing ancient Chinese inscriptions (sutras). As these inscriptions are not accessible due to location or damage, our input data are noisy images of paper showing the texture of stones together with the inscriptions transferred by charcoal or pencil. Due to the vast amount and large sizes of...
Due to the great processing power available on today's Graphics Processing Units (GPU), we studied the suitability of mapping statistical testing algorithms to this specialized hardware. Out of the testing algorithms proposed by the National Institute of Standards and Technology (NIST), only some were suitable for implementation on GPU, due to the computational format and restrictions of the hardware...
We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface to communicate data stored on the graphics cards of a distributed-memory parallel computer. These libraries can help applications that perform general purpose computations on these networked GPU clusters. We explore how to efficiently support both point-to-point and collective...
The increasing demand for application-specific processing on portable devices is driving the design of highly efficient hardware. Many applications are streamlined, and the delays of all streamline stages should ideally be equalized. Unfortunately, loads at each stage of streamlined processing may vary depending on input data, making load monitoring and balancing very desirable hardware features. The aim...
As a free application programming interface (API) for hardware-accelerated two-dimensional vector and raster graphics, OpenVG is becoming the standard for hardware development. This paper first proposes several optimization methods for OpenVG implementation, such as loop unrolling, operation transformation, function inlining, vectorization and address assignment, based on the hardware architecture...
The rasterization stage in a graphics processing unit (GPU), which consists of triangle setup, rasterization, and parameter interpolation with plane equations, requires a large number of operations and is usually the performance bottleneck. For real-time applications, a universal rasterizer (UR) with edge equations and a tile-scan triangle traversal algorithm are proposed for low-cost graphics rendering...
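To illustrate the general idea of edge equations combined with tile-by-tile traversal, here is a minimal software sketch. It is a generic illustration under simplifying assumptions (pixel centers sampled at +0.5, no top-left fill rule, no early tile rejection), not the universal rasterizer from the abstract. The names `edge`, `rasterize`, and the `tile` parameter are hypothetical.

```python
import math


def edge(a, b, p):
    """Edge function: positive when p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, tile=4):
    """Return the pixels whose centers lie inside triangle (v0, v1, v2).

    The bounding box is walked tile by tile; inside each tile every pixel
    center is tested against the three edge equations.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return []               # degenerate triangle covers nothing
    if area < 0:
        v1, v2 = v2, v1         # enforce a consistent winding order

    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    xmin, xmax = math.floor(min(xs)), math.ceil(max(xs))
    ymin, ymax = math.floor(min(ys)), math.ceil(max(ys))

    covered = []
    for ty in range(ymin, ymax, tile):
        for tx in range(xmin, xmax, tile):
            for y in range(ty, min(ty + tile, ymax)):
                for x in range(tx, min(tx + tile, xmax)):
                    p = (x + 0.5, y + 0.5)  # sample at the pixel center
                    if (edge(v0, v1, p) >= 0
                            and edge(v1, v2, p) >= 0
                            and edge(v2, v0, p) >= 0):
                        covered.append((x, y))
    return covered


pixels = rasterize((0, 0), (4, 0), (0, 4))
```

Because each edge function is linear in the pixel position, hardware implementations typically update it incrementally with additions while stepping across a tile rather than re-evaluating the multiplications as this sketch does.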
In this paper, we propose a low-complexity subdivision algorithm to approximate Phong shading. It combines a subdivision scheme using forward differences with a recovery scheme to prevent rasterization anomalies. Dual space subdivision, triangle filtering and variable sharing schemes are also proposed to reduce the computation. Compared with the conventional recursive subdivision shading algorithm,...
In this paper, a real-time 3D video synthesis method suitable for implementation on commodity graphics hardware is presented. The system consists of pre-calibrated binocular stereo cameras and an NVIDIA GeForce 8 Series graphics card. Recently, most research has focused on improving the quality of depth maps, which is usually time-consuming and unsuitable for real-time reconstruction. In our method,...