As the technology scales toward deeper submicron, system-on-chip designs have migrated from fairly simple single processor and memory designs to relatively complicated systems with higher communication requirements. Network-on-chip architectures emerged as promising solutions for future system-on-chip communication architecture designs. However, the switching and routing algorithm design of network-on-chip...
The parallelism of hardware and the dynamic reconfigurability of FPGAs enable multiple hardware tasks to run concurrently, and also time-share resources by being swapped in and out of the device during runtime. More than ever before, these capabilities are being employed in systems with high-reliability requirements. To improve reliability, a method often used is circuit relocation. However, the static...
CPU-GPU heterogeneous systems are emerging as architectures of choice for high-performance energy-efficient computing. Designing on-chip interconnects for such systems is challenging: CPUs typically benefit greatly from optimizations that reduce latency, but rarely saturate bandwidth or queueing resources. In contrast, GPUs generate intense traffic that produces local congestion, harming...
The POWER9™ Processor in 14 nm SOI FinFET technology makes use of 7 different families of arrays. This paper gives an overview on advantages of different implementations, focusing on two key innovations introduced with this processor generation: Fast and low-latency write assist schemes for single-voltage performance arrays, as well as a new methodology, the synthesized soft arrays, to enable significant...
A system-level simulator is proposed to determine the ability of synchronous and asynchronous NoCs to alleviate the effect of process variation. The newly developed framework reports the variation in throughput and in the individual delay components. System-level simulation shows similarities with circuit-level simulation in terms of behavior and performance variation trend when moving from one technology...
High-performance chip design is always a hot topic in the integrated circuit (IC) field. Clock design plays a critical role in improving chip performance and affecting power consumption. The regular clock layout has always been the ideal way to improve the timing of results. In this paper, we propose a symmetrical clock tree synthesis algorithm for top-level design, including tree architecture planning,...
Wide wire sizes are often used in clock trees to improve timing characteristics and reduce electromigration effects. Recent research suggests the attractiveness of wide wires is affected by the forbidden pitch issues in the lithography of sub-20nm technologies. Parallel wiring is a recently proposed technique to get around these lithography issues in the routing stage of the ASIC design flow. Routing...
We propose a new sensor MAC protocol, called Bird-MAC, which is highly energy efficient in the applications where sensors periodically report monitoring status with a very low rate, as in structural health monitoring and static environmental monitoring. Two key design ideas of Bird-MAC are: (a) no need of early-wake-up of transmitters and (b) taking the right balance between synchronization and coordination...
A clock generation system for a 1GS/s 8-bit subranging time-interleaved analog-to-digital converter (ADC) is introduced. General timing considerations for time-interleaved ADCs are reviewed prior to describing the design methodology for a prototype ADC. This hybrid ADC architecture contains four time-interleaved combined sample-and-hold and capacitive digital-to-analog converter (SHDAC) circuits as...
In this paper, we present a timing model of flit transactions for primitive components in irregular interconnect fabrics. Based on the model, we then develop a fast timing simulator for on-chip interconnects with a 100% cycle accuracy when validated with an industrial RTL implementation. With our timing simulator, we can design and optimize architectures of interconnects by adjusting topologies, FIFOs,...
To fully support the partial reconfiguration capabilities of FPGAs, this paper introduces the tool and API BitMan for generating and manipulating configuration bitstreams. BitMan supports recent Xilinx FPGAs that can be used by the ISE and Vivado tool suites of the FPGA vendor Xilinx, including latest Virtex-6, 7 Series, UltraScale and UltraScale+ series FPGAs. The functionality includes high-level...
Communications systems make heavy use of FPGAs; their programmability allows system designers to keep up with emerging protocols and their high-speed transceivers enable high bandwidth designs. While FPGAs are extensively used for packet parsing, inspection and classification, they have seen less use as the switch fabric between network ports. However, recent work has proposed embedding a network-on-chip...
Recently, the energy expenditure of the Internet has increased dramatically, making the energy consumption of routers an urgent problem in related research areas. In fact, considerable device surplus and redundancy are introduced during network planning to cope with rarely occurring traffic peaks and device failures, wasting energy most of the time. In this work, an energy-aware architecture is proposed for routers, which...
A Field Programmable Gate Array (FPGA) family was designed to match a programmable fabric die built in 14nm process technology with 28Gb/s transceiver dice. The 2.5D packaging (Fig. 3.3.1) uses embedded interconnect bridges (EMIB) [1]. 20nm transceivers were reused, enabling a transceiver roadmap independent of FPGA fabric. Fig. 3.3.2 shows a 560mm² fabric die and six transceiver dice. The programmable...
This paper discusses an all-digital clock skew measurement architecture using a sigma-delta technique with subsampling. The skew between two remote nodes is first detected and amplified using subsampling. Since the information signal is a slowly varying clock skew, subsampling the input clock signals actually oversamples the input information in this case. This concept is utilized by following the...
Utilization of multi-bit flip-flops (MBFFs) in a synchronous design has become a significant methodology for clock power reduction. In this paper, given a synchronous system with a set of 1-bit flip-flops in a placement plane, the timing constraints of the associated signals on the flip-flops and the available MBFFs in a cell library, firstly, based on the timing constraints of the signals on...
This paper introduces a scalable hardware design to accelerate the maze algorithm for VLSI routing on cellular automata (CA). The time complexities of wave propagation and back-tracing on CA are both O(n), while label clearing takes constant time. The innately high parallelism of CA largely reduces the runtime of wave propagation and label clearing. The RTL implementation for this design has been developed...
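The three phases named in this abstract (wave propagation, back-tracing, label clearing) can be illustrated with a minimal software sketch of the classic maze (Lee) routing algorithm. This is a sequential Python illustration, not the paper's cellular-automata hardware design; the function name `lee_route`, the 0/1 grid encoding, and the 4-neighbour connectivity are assumptions made for the example.

```python
from collections import deque

# 4-neighbour moves on the routing grid (assumed connectivity).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def lee_route(grid, src, dst):
    """Return a shortest path from src to dst as a list of (row, col), or None.

    grid is a list of rows; 0 marks a free cell and 1 a blocked cell
    (hypothetical encoding chosen for this sketch).
    """
    rows, cols = len(grid), len(grid[0])

    # Phase 1: wave propagation. Each reached cell is labelled with its
    # distance from the source, expanding one wavefront at a time (BFS).
    label = [[None] * cols for _ in range(rows)]
    label[src[0]][src[1]] = 0
    frontier = deque([src])
    while frontier:
        r, c = frontier.popleft()
        if (r, c) == dst:
            break
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and label[nr][nc] is None):
                label[nr][nc] = label[r][c] + 1
                frontier.append((nr, nc))

    if label[dst[0]][dst[1]] is None:
        return None  # the wave never reached the target

    # Phase 2: back-tracing. Walk from the target to the source by always
    # stepping to a neighbour whose label is exactly one smaller.
    path = [dst]
    r, c = dst
    while (r, c) != src:
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and label[nr][nc] == label[r][c] - 1):
                r, c = nr, nc
                break
        path.append((r, c))
    path.reverse()

    # Phase 3: label clearing. Here the label array is simply discarded
    # when the function returns; in hardware every cell would reset itself.
    return path
```

In this sequential sketch the wave visits cells one at a time, so propagation costs time proportional to the number of reachable cells. A cellular-automata implementation would instead update all cells of a wavefront in the same step, which is the parallelism the abstract refers to.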
Skew optimization is an important stage of the physical design. Previous studies suggested various skew optimization algorithms [1–7]. However, many of them have only focused on the zero-skew optimization [1–3], and several recent studies focus on a useful-skew optimization [5–7]. In this paper, we propose a novel skew optimization method for useful-skew implementation. Our proposed method generates...
Network-on-Chip (NoC) has emerged as one of the main communication structures suitable for the interconnection of processing and other IP cores of a system-on-chip (SoC). An NoC typically utilizes virtual channels (VCs) to improve wormhole routing among the SoC cores by enabling multiple data packets to share a communication channel and to avoid deadlocks. Dynamically allocated multi-queues based...
This paper presents an FPGA architecture capable of implementing relative timing based asynchronous designs. Modifications are made to a traditional synchronous FPGA architecture to make it asynchronous capable, while retaining its capability as a fully functional synchronous FPGA. Such a design permits multi-frequency implementations. A test FPGA fabric is developed and evaluated with the implementation...