DesignCon 2020 Schedule Builder
Welcome to the DesignCon 2020 agenda and presentation download site. Here you can view and download conference and/or Chiphead Theater presentations before, during, and after the event. If you’re looking for a presentation from a specific session that you’re unable to find here, it is likely because the presenter has not provided permission for external use or has not yet shared their presentation with us. Please check back after the event for a more complete catalogue of available presentations.
This power integrity hands-on boot camp combines simulation and measurement. See how the gigabit SI world of IoT, automotive, and cloud server products, with its demand for lower power and multiple power rails, is driving new paradigms for flat impedance rather than just a maximum target Z. Start with understanding how to define a PDN impedance mask for a given load transient and package/die model. Learn how to build a behavioral model of a dc-dc converter so that both large switching transients and small-signal load steps can be simulated when delivering power to a high-speed digital load. Step through optimizing the decoupling for an FPGA DDR4 example and then run the full PI eco-system simulation to look at the signal integrity when driven by a switching-regulator power delivery network.
Note that this boot camp will break from 11:50 AM to 1:30 PM for the keynote and lunch.
Each section will include access to hands-on simulation using one of 40 supplied laptops or access to a cloud-based version using an attendee's personal laptop. Measurement demos will also be included for the VRM modeling sections. The boot camp will be broken down into the following five sections:
Evolutionary changes in power integrity that are driving reduced margins and the need to find the worst-case power delivery network noise ripple. Understand the basics of flat impedance design to avoid the risk of rogue voltage waves while at the same time reducing component count. This is critical for high-current power rails, low-noise SERDES power rails, and high-dynamic single-ended switching for DDR memory.
Power Integrity Z Mask and the regions of operation – Learn how to put together a realistic Z Mask for PDN design leveraging the information from the load transient and the package/die model.
Simulating the Power Integrity Eco-system requires more than just an R-L model for the VRM. Learn how simple control loop state space models of dc-dc converters can be built from a few measurements to enable simulations of both small signal and large signal power rail transients. The resulting model also works for optimizing PSRR and exploring the loading of multiphase designs.
Optimizing decoupling capacitors for flat impedance design requires the right models to enable simulation-to-measurement correlation. Understand the difference between capacitor models with parasitics for use in SPICE-like simulations and models without PCB mounting parasitics for use with EM models. Use the correct models to step through the process of estimating the required decoupling capacitance for flat impedance and then running a simulation optimizer to select the best combination of commercially available values.
Utilize all the skills learned to design a flat impedance network for an FPGA and DDR4 design example. Analyze an existing design to find the worst case ripple load.
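As a rough illustration of the first boot camp step, a flat target impedance can be estimated from the allowed ripple and the worst-case load transient, then compared against a decoupling network. This is a minimal sketch with hypothetical rail and capacitor values, not figures from the boot camp materials:

```python
# Estimate a flat PDN target impedance and check a simple decoupling
# network against it. All rail and capacitor values are hypothetical.
import math

def target_impedance(v_rail, ripple_frac, i_transient):
    """Classic flat-impedance target: allowed ripple voltage / transient current."""
    return v_rail * ripple_frac / i_transient

def cap_impedance(c, esr, esl, f):
    """Complex impedance of a capacitor model with ESR and mounted ESL parasitics."""
    w = 2 * math.pi * f
    return complex(esr, w * esl - 1 / (w * c))

# Hypothetical 1.0 V rail, 3% ripple budget, 10 A transient
z_target = target_impedance(v_rail=1.0, ripple_frac=0.03, i_transient=10.0)
print(f"Z_target = {z_target*1000:.0f} mOhm")  # 3 mOhm

# Parallel combination of a few decoupling capacitors evaluated at 1 MHz
caps = [(100e-6, 5e-3, 1e-9), (10e-6, 3e-3, 0.8e-9), (1e-6, 10e-3, 0.6e-9)]
f = 1e6
z_parallel = abs(1 / sum(1 / cap_impedance(c, esr, esl, f) for c, esr, esl in caps))
print(f"|Z_pdn(1 MHz)| ~ {z_parallel*1000:.2f} mOhm")
```

Sweeping `f` over the band of interest and checking `z_parallel` against `z_target` at every point is the essence of checking a design against a flat impedance mask.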
Data rates for communication and computing continue to increase tremendously due to ever-increasing demand. Accordingly, high-speed I/O technologies must constantly advance to meet evolving application requirements in terms of functionality, performance, power, and cost. From the architecture point of view, technology advancement occurs on four key fronts for multiple-Gbps I/O links: signaling and modulation, clocking, signal conditioning/equalization, and forward error correction (FEC).
The stringent requirements on clocking, clock and data recovery (CDR), equalization, jitter, noise, and BER (e.g., 10^-12, 10^-15) for non-FEC links impose unprecedented technical challenges and opportunities for designing and verifying high-speed link systems (e.g., TX, RX, ref clock, and channel). At 56-112 Gbps, PAM4 has emerged as the main-stream modulation and FEC usage has become a must, posing new design challenges and trade-offs, as well as test and validation challenges.
This tutorial gives an overview of high-speed I/O link technology trends; common and leading link architectures; modulation formats (e.g., NRZ, PAM-N); clocking, equalization, and FEC methods; and validation requirements and methods for link systems, TX, RX, channel (CH), and ref clock, with an emphasis on application examples from 10 Gbps to the latest 112 Gbps standards. Specifically, we will:
• Review the high-speed technology trends with data from technology and standard roadmaps (e.g., Ethernet, Fibre Channel, CEI/OIF), as well as the implications of UI shrinkage for jitter, and of maintaining or increasing SNDR for noise, as data rates increase.
• Review the high-speed I/O architecture advancements, modulation-schemes of NRZ and PAM-N, clocking (e.g., data driven versus common clock), CDR methods (PLL versus PI), equalization methods (Linear CTLE/FFE, DFE), FEC methods (RS, Turbo Code) optimization, tradeoff strategy (TX versus RX, digital versus analog, linear versus adaptive, latency vs performance), power optimization, and process advancements (28, 20, 14, 10, 7nm).
• Review the jitter, noise, SNDR, non-linearity/level mismatch, return-loss (RL), signal integrity, and linear system fundamentals; latest jitter and noise components and associated "family tree" hierarchy; latest view for jitter and noise "holography" in three dimensions of time, frequency, and statistical domains; jitter, noise, and BER (JNB) interrelationship.
• Review the simulation/modeling technologies used for high-speed I/Os. Both circuit/IC-based simulation methods/tools (e.g., SPICE/SPICE-like, IBIS) and behavioral or IBIS-AMI based statistical/full-wave/hybrid methods/tools will be reviewed and compared. Three emerging and critical topics will be covered: IBIS-AMI models for ADC-based RX to generate a PAM4 eye diagram where there are only voltage samples and no edge transitions; physical and FEC co-simulation meeting end-to-end BER goals; and model correlations with silicon measurements over PVT.
• Review advanced verification methods for test interface de-embedding, pre-emphasis/de-emphasis-induced jitter de-embedding. Emerging topics on how to test receiver equalization and clock recovery with on-die scope for characterization and diagnostic/debug, how to test PAM-N signaling, how to test a link with FEC, especially in the presence of burst error, will be discussed.
• Review the jitter, noise, signaling, SNDR, and BER test requirements and methods for leading high-speed standards at 10 to 56 Gbps per lane (e.g., 10 GbE, 40/100/200/400 GbE, OIF/CEI (11G, 28G, 56G), Fibre Channel (16G, 32G, 56G), and PCIe (8G, 16G)). For 56G and 112G PAM4, frequency-domain return loss and time-domain effective return loss (ERL), and their relationship, will be discussed.
• Forward-looking discussion for some of next-generation high-speed technologies and standards such as PCIe Gen5 (32G), PCIe Gen6 (64G-PAM4), CEI-112G XSR-PAM4 (for die-to-die, die-to-OE), VSR-PAM4, MR-PAM4, and LR-PAM4, and associated jitter, noise, SNDR, ERL, and signal integrity challenges; Fibre Channel 64G-PAM4, 128G-PAM4, Ethernet and OTN 400G using 50G and 100G per lane C2M PAM4, C2C PAM4, and Backplane/Copper Cable-PAM4 SERDES, and devices with integrated OE for optical backplane to 2 km reach at Nx100 G rates.
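The UI shrinkage the tutorial refers to is simple to quantify: the unit interval is the reciprocal of the symbol rate, so PAM4 (2 bits per symbol) doubles the UI relative to NRZ at the same bit rate, while each bit-rate doubling halves it. A quick sketch using rates from the standards named above:

```python
# Unit interval (UI) versus line rate; PAM4 carries 2 bits per symbol,
# so its symbol (baud) rate is half the bit rate.
def ui_ps(bit_rate_gbps, bits_per_symbol=1):
    baud = bit_rate_gbps / bits_per_symbol   # symbol rate in Gbaud
    return 1000.0 / baud                     # UI in picoseconds

for rate, bps, mod in [(10, 1, "NRZ"), (28, 1, "NRZ"),
                       (56, 2, "PAM4"), (112, 2, "PAM4")]:
    print(f"{rate:>4} Gbps {mod}: UI = {ui_ps(rate, bps):6.2f} ps")
```

At 112 Gbps PAM4 the UI is under 18 ps, which is why the jitter and SNDR implications discussed above become so severe.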
There are a few key challenges in the industry on how to accurately measure PCB and related interconnects at high frequencies:
• Most instruments, such as VNAs (Vector Network Analyzers) and TDR/TDT (Time-Domain Reflectometry/Time-Domain Transmissometry), can make good measurements at the end of a coaxial interface. However, test fixtures need to be inserted between an instrument's coaxial interface and the Device Under Test (DUT) (PCB, package, connector, cable, etc.). Various de-embedding approaches are already commercially available; however, the de-embedding algorithms are often proprietary, and verification of the accuracy of the de-embedded S-parameters is left to the user.
• A poorly designed test fixture can lead to inaccurate de-embedded S-parameters. An IEEE standard is needed to specify the electrical requirements of a properly designed test fixture to achieve high-quality de-embedding.
• The quality of measured DUT S-parameters can vary widely. There is no IEEE standard to check and validate the quality of S-parameters before they are distributed for use. This has created many complications for engineers who are utilizing measured S-parameters for high-speed interconnect analysis.
The standard development work started in 2015, and in 2018 we successfully hosted an IEEE P370 plug-fest at DesignCon. Since then, the standard draft has gone through a few rounds of improvements and revisions, and we are currently in the comment resolution stage of standard development at the time of this writing.
The standard is expected to be finalized by end of 2019, and we propose doing a 3-hour long tutorial to educate attendees on the standard. We plan to cover the following topics, with contributions from top industry experts from various areas:
• Introduction to the P370 Standard (15 min) – Jay Diepenbrock / Xiaoning Ye
- Introduction to de-embedding
  - Why de-embedding is critical but also hard to do accurately
  - De-embedding methods (generic)
  - 2x-Thru (in detail)
- P370 development overview
• Test Fixture Design Criteria (20 min) – Heidi Barnes
- Overview of fixture design criteria
- Explanation of the fixture design criteria – Fixture Electrical Requirements (FERs)
• De-Embedding Verification (20 min) – Eric Bogatin
- Algorithm verification:
  - Using simulated data
  - Using real hardware (Plug-and-Play Kit)
- De-embedding verification – consistency test
• S-Parameter Integrity (20 min) – Mikheil Tsiklauri
- Initial test of quality: passivity, causality, reciprocity
- Application-specific S-parameter integrity check
• IEEE P370 Briefcase (40 min)
- Fixture electrical requirements (Jason Ellison)
- De-embedding verification and S-parameter integrity (Se-Jung Moon)
• ATE PCB Test Fixture Example (30 min) – Jose Moreira
• High-Speed Connector Example (30 min) – Ching-Chao Huang
Join this panel of experts for a lively panel discussion on the state of testing practices for high-speed networking technologies. This year's panel will be evaluating the pros and cons of characterization efforts for 400GbE over PAM4. Chip experts will discuss design verification challenges while the test & measurement industry veterans will provide direction on testing implementation plans. Come prepared to engage in the discussion!
The USB Type-C connector has seen significant adoption with standards like USB, DP, and Thunderbolt. This session will look at preparing to test the next-generation variant of USB: USB4. USB4 will transmit and receive on all four lanes of the Type-C connector, with bonded rates of 40 Gbps in each direction for an 80 Gbps link. As these signals get sent through even longer passive cables, specialized transmitter and receiver techniques are required to preserve signal integrity. We will review the new equalization requirements, repeaters, multi-level signaling technologies, new measurement methodologies, and ultra-low-noise test instruments needed to properly characterize these very high-speed signals.
Emerging applications such as HPC (high-performance computing), AI, autonomous vehicles, and high-speed networking require extremely high memory bandwidth and low power. The HBM2 and GDDR6 memory interfaces are suitable candidates for this trend. GDDR6 offers a 2x increase in per-pin bandwidth over GDDR5 while maintaining compatibility with the established GDDR5 ecosystem. HBM2E extends the data rate from 2.4 Gbps to 3.2 Gbps, resulting in higher data transfer rates. In this paper, we present silicon-proven memory interface systems: GDDR6 reaching speeds over 16 Gb/s and beyond, and 3.2 Gbps HBM2E. To enable the over-16 Gbps GDDR6 platform, we utilize unique equalization capabilities that are challenging to model and simulate. Various measured results will also be shown as an enabling reference for other GDDR6 platform designers. For the 3.2 Gbps low-power HBM2E platform solution, an all-digital DLL for DQS-DQ centering, command bus training (CBT), auto DQS cleaning, and READ/WRITE training will be introduced.
Though standard DRAM speeds have continued to increase, the focus has been primarily on density, often at the expense of bandwidth. GDDR has taken a different path, focusing on high bandwidth. With GDDR6 [1,2] speeds reaching 16 Gb/s and beyond, it is critical to have designs that are well planned, simulated, and implemented. This paper will provide best practices and techniques for how modern high-bandwidth memory system designers should care for single-ended signal and power integrity, as well as utilize GDDR6 DRAM features that help reliably achieve these high-data-rate platforms. To achieve GDDR6 interface speeds beyond 16 Gbps/pin, various bandwidth improvement techniques and P/SI studies based on channel analysis will be discussed. To achieve over 16 Gbps interface speed, an ISI-minimizing tunable equalizer scheme that can support either de-emphasis or a Gm-boosting circuit feature is used on the driver side, and a newly proposed crosstalk cancellation technique will be discussed. An area-optimized per-bit offset calibration scheme is used to improve signal integrity, and channel analysis results considering various off-chip conditions, including a discrete package, will also be presented in the paper. The measured WCK clock duty cycle was within 47-53% at 16 GHz including process variation, and the peak-to-peak periodic jitter was less than 10 ps.
Another topic is the new High Bandwidth Memory (HBM2E) platform. HBM2E delivers a 3.2 Gbps data transfer speed per pin, which is 33% faster than the previous-generation HBM2, and has a density of 16 Gb per die, double the capacity of the previous generation. With these improvements, a single HBM2E package will offer 410 GB/s of data bandwidth and 16 GB of memory. To enable HBM2E memory, an all-digital DLL for DQS-DQ centering, command bus training (CBT), auto DQS cleaning, and READ/WRITE training will be introduced. Low-power features and known-good-die (KGD) test features, such as a post-silicon programmable pattern and a direct-access test/debugging scheme, will also be discussed in the main paper. These industry-leading GDDR6/HBM2E platforms will enable enhanced solutions for next-generation data center, artificial intelligence, machine learning, and graphics applications.
A major aspect to consider in 112 Gb channel design is the effect of PCB manufacturing variation on channel properties. Since 112 Gb signals have significant spectral content up to 56 GHz and noticeable spectral content up to 75-80 GHz, it is important to maintain acceptable channel characteristics at these frequencies to ensure proper operation. Unfortunately, controlling channel properties becomes more complex as frequencies increase, and small changes in geometry or material properties have a bigger impact on channel performance. Unlike in previous generations, design robustness plays a key role in successful 112 Gb channel design; some designs work well on paper but fail once manufactured. In this work we introduce some of the manufacturing aspects to consider while designing a 112 Gb channel and demonstrate their impact on channel performance by analyzing post-manufacturing data gathered from various fabrications of a case-study 112 Gb system.
Design robustness of high-speed channels played a secondary role in previous product generations. It was standard practice for SI engineers to aim for the nominal case in the channel design process and keep a healthy margin from the spec limits or other boundaries. Little to no attention was given to manufacturing tolerances and secondary channel effects such as mode conversion and dispersion. It was fairly assumed that the uncertainties in channel performance derived from the uncontrolled factors of the design process would be well contained, and that the variation in properties across the manufactured channel population would be well confined within the allocated margins for a given design space. Those assumptions held true as long as the analog frequency content of the transmitted signals was in a low enough band, below 20 GHz; consequently, the channel properties at higher frequencies were of little interest to SI engineers. However, as bitrates went up and reached the 25 Gb mark, the analog bandwidth of the signals went up as well, with noticeable signal content at 33 to 40 GHz. The increase in signal analog bandwidth meant that channel properties at the corresponding frequencies became equally important. 112 Gb systems in practice often produce signals with noticeable frequency content up to 80 GHz. Maintaining the channel properties at these higher frequencies becomes much more challenging. As a result, the design assumptions, strategy, and BKMs used for previous generations of products do not hold.
It has become apparent that at this higher frequency range the variation of channel electrical properties is much greater and becomes a dominant factor affecting system performance. Furthermore, channel parameters that once played a secondary role, such as mode conversion or losses due to conductor roughness, dominate the channel's behavior at high frequencies. Therefore, the manufacturing tolerances and other design aspects that were once neglected can no longer be ignored; otherwise, the result is uncontrolled channel variation and unpredictable electrical behavior for a manufactured channel of a 'similar' design, similar in the sense of legacy BKMs and design practices.
The transition to 112 Gb systems introduced new challenges for channel design and modeling, which were not apparent to SI engineers or industry experts at first glance. It is becoming obvious that new channel design practices need to be adopted, and that design robustness plays a leading role in ensuring proper operation of SerDes systems. Furthermore, manufacturing considerations cannot be ignored and should be taken into account at every stage of the design process.
In this work we demonstrate several case studies of robust design and optimization for some of the common PCB channel segments, such as various types of transmission lines, connector footprints, DC blocks, and via structures. The manufacturing considerations and implementation techniques are discussed for each case. We provide insight into the common pitfalls in the BKMs used today across the industry for the design and modeling of high-speed channels. The impact of several PCB manufacturing aspects on the variation of channel electrical properties, such as plating and etching of high-speed traces, isotropic weave spreading, conductor roughness, registration, solder mask thickness, copper balance, skip-layer referencing, via backdrilling, and connector misalignment, is discussed, and the tools to meet the 112 Gb channel challenges are provided, along with an introduction to robust channel design practices.
The identification of jitter and noise sources is critical to debug failure sources in the transmission of high-speed serial signals. With ever-increasing data rates accompanied by decreasing jitter budgets and noise margins, managing jitter and noise sources is of utmost relevance. Methods for decomposing jitter and noise have matured considerably over the past 20 years; however, they are mostly based on time interval error (TIE) measurements. For example, impairment models such as the dual-Dirac model use the TIE values for curve fitting. However, by transitioning from a signal- to a TIE-centric view, a significant portion of the information present in the input signal is lost.
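For context, the dual-Dirac model mentioned above extrapolates total jitter at a target BER from a deterministic (DJ) and a random (RJ) component. A minimal stdlib-only sketch, with illustrative DJ/RJ values that are not from the paper:

```python
# Dual-Dirac total-jitter extrapolation: TJ(BER) = DJ_dd + 2*Q(BER)*RJ_rms,
# where Q satisfies BER = 0.5*erfc(Q/sqrt(2)). Q is found here by bisection.
import math

def q_of_ber(ber, lo=0.0, hi=10.0):
    """Invert the monotone BER(Q) relation by bisection."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if 0.5 * math.erfc(mid / math.sqrt(2)) > ber:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def tj_dual_dirac(dj_dd_ps, rj_rms_ps, ber):
    return dj_dd_ps + 2 * q_of_ber(ber) * rj_rms_ps

q12 = q_of_ber(1e-12)
print(f"Q(1e-12) ~ {q12:.2f}")  # ~7.03
# Illustrative components: DJ_dd = 10 ps, RJ = 1 ps rms
print(f"TJ(1e-12) = {tj_dual_dirac(10.0, 1.0, 1e-12):.2f} ps")
```

Note that this extrapolation is entirely TIE-based, which is exactly the information loss the paper argues against: the underlying signal waveform never enters the model.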
In this paper, we first introduce a parametric signal model for serial pulse-amplitude modulated (PAM) transmission that includes jitter and noise contributions. The key to this model is a set of step responses, which characterizes the deterministic behavior of the transmission system, similar to the impulse response of traditional communication systems. Then we propose a joint jitter and noise analysis framework that takes into account all information present in the input signal. This framework relies on a joint estimation of model parameters, from which we readily obtain the commonly known jitter and noise components and thus the desired decomposition. Therefore, we provide a single mathematical base yielding the well-known jitter/noise analysis results for PAM signals and thus a consistent impairment analysis for high-speed serial transmission systems.
Additionally, we provide deep system insight through the introduction of new measurements, such as what-if signal reconstructions based on a subset of the underlying impairments. These reconstructions enable the visualization of eye diagrams for a selection of jitter/noise components, thus allowing informed decisions about the relevance of the selected components. Similarly, we determine selective BER extrapolations allowing for a fast calculation of (selective) peak-to-peak jitter and noise amplitudes at low error rates.
The proposed framework has the inherent property of being able to perform accurate measurements even for relatively short input signals. This is due to the significant increase in information extracted from the signal. Furthermore, our approach has no requirements regarding specific input symbol sequences, such as predefined compliance patterns. On the contrary, random or scrambled input data typically encountered in real-world scenarios is ideally suited to the framework.
To summarize, there are several benefits to our approach: First, we provide additional, previously unavailable, measurements. Second, we require shorter signal lengths to achieve the same accuracy as state-of-the-art methods. Finally, our method does not rely on any specific symbol sequences. As part of the paper, we show exemplary measurement results as well as comparative studies with state-of-the-art methods.
Conventional multi-connector cable and adapter solutions for head-mounted devices (HMD) are too cumbersome to meet Virtual Reality (VR) users' desires for portability and simple device setup. A new alternate mode to the USB Type-C specification is proposed which combines a high-bandwidth display (DP HBR3) and data (USB3.1 SSP) on a single USB Type-C connector to accommodate growing display resolution and bi-directional high-speed data transfer. USB3.1 SSP shares the center pins of the Type-C connector, which were designated for USB2 in the traditional alternate mode, to ensure system compatibility. However, the multiplexing scheme between different specifications poses a tremendous number of signal integrity challenges for the channel. The challenges become more severe when considering various channel topologies, which may include different types of active devices along the channel. In this paper, a comprehensive discussion of the new alternate mode link design, from system architecture, link budget, design challenges, and optimization to final implementation, is presented. First, an overview of the VR system is given to discuss the architectural requirements. Next, link budgeting and optimization are discussed. Various measurements and simulations of each critical block along the channel are presented to illustrate the trade-offs between different topology proposals. Finally, complete functional validation and compliance testing based on real product bring-up are presented.
DDR5 and LPDDR5 memory technologies will increase system data rates to 6400Mb/s and possibly higher. New design, debug, and validation challenges arise from higher data rates, DFE (decision feedback equalization), dynamic speed changes, power saving features, and lower voltage swings. Learn the latest test and measurement techniques to overcome DDR5 and LPDDR5 challenges.
The design solution space for high-speed serial links is becoming increasingly complex as data rates climb, channel topologies become more diverse, and tuning parameters for active components multiply. PCI Express 5.0, at 32 GT/s, and related protocols such as Compute Express Link (CXL) are particularly relevant examples of applications whose design solution space can be a daunting problem to tackle given the low-cost nature of their end-equipment. This paper is intended to help system designers navigate through these design challenges by providing a how-to guide for defining, executing, and analyzing system-level simulations including PCIe 5.0 Root Complex (RC), Retimer, and End Point (EP).
The subdivision of the extremely lossy channels between RC and EP with a Retimer is a common practice and can present opportunities for lowering system cost, while extending its channel reach. However, it complicates the overall system design and presents challenges in examining its solution space. The methodology presented in this analysis focuses on the selection of printed circuit board (PCB) material, placement of the Retimer device, and analysis of end-to-end channel performance in the context of the PCIe standard.
The approach is based on IBIS Algorithmic Modeling Interface (IBIS-AMI) simulations. IBIS-AMI's standardized interface offers interoperability between models provided by different integrated circuit (IC) vendors. More importantly, critical component-level impairments such as jitter, bandwidth, and adaptation consistency can be represented in IBIS-AMI models and reflected in the overall link performance—effects that a simple s-parameter analysis of the passive interconnect fails to capture. For the purposes of this paper, models of a worst-case PCIe transmitter and receiver are used for the RC and EP. A Retimer is used between the RC and EP to achieve reach extension, and its placement and performance are studied. IBIS-AMI simulations are performed to test the performance of the overall system. The methodology outlined here can be extended to any RC, Retimer, and EP device, and it can be performed with any IBIS-AMI simulator.
The proposed approach for simulating the solution space of an RC+Retimer+EP system in the context of PCIe Gen-5 is as follows:
1. Determine if a Retimer is required
2. Define a simulation space
3. Define evaluation criteria
4. Execute the simulation matrix and analyze the results
The goal is to reach a conclusion regarding the optimum configuration of the system in an efficient and timely manner.
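Step 4 can be organized as a brute-force sweep over the design variables with a pass/fail criterion applied to each simulated configuration. The sketch below is generic scaffolding only: the variable names, levels, pass threshold, and the eye-height function are placeholders standing in for real IBIS-AMI runs, not values from the paper:

```python
# Enumerate an RC + Retimer + EP simulation matrix and collect the
# passing configurations. evaluate_eye_height() is a toy stand-in for
# an IBIS-AMI channel simulation.
from itertools import product

pcb_materials  = ["mid-loss", "low-loss", "ultra-low-loss"]  # placeholder levels
retimer_pos_in = [4, 6, 8]            # retimer placement, inches from RC (hypothetical)
tx_presets     = ["P5", "P7", "P9"]   # example PCIe TX preset choices

def evaluate_eye_height(material, pos, preset):
    """Placeholder returning an eye height in mV; a real flow would run IBIS-AMI."""
    loss_per_in = {"mid-loss": 1.2, "low-loss": 1.0, "ultra-low-loss": 0.8}[material]
    preset_penalty = {"P5": 6, "P7": 3, "P9": 0}[preset]
    return 60.0 - loss_per_in * pos - preset_penalty

EYE_HEIGHT_MIN_MV = 50.0  # example evaluation criterion from step 3

passing = [(m, p, t)
           for m, p, t in product(pcb_materials, retimer_pos_in, tx_presets)
           if evaluate_eye_height(m, p, t) >= EYE_HEIGHT_MIN_MV]
print(f"{len(passing)} of {3 * 3 * 3} configurations pass")
```

The analysis of the results then amounts to picking, among the passing configurations, the one that minimizes cost (e.g., the cheapest material and the most convenient retimer placement).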
IBIS-AMI models have become a critical piece of modern high-speed SerDes designs. Where an IBIS-AMI model was once an afterthought created months or years after silicon arrived, customers now commonly request pre-layout models before the design is even sent to the foundry. To achieve this, the IBIS-AMI model creation process needs to be integrated into the high-speed SerDes model-based design flow in a way that is accessible to SerDes architects and signal integrity engineers alike.
We have proven out a new IBIS-AMI design flow by integrating MathWorks' SerDes Toolbox with detailed simulation characterization data of an Intel 56G PAM4 SerDes. The receiver is composed of an N-stage CTLE, an M-tap DFE, and a bang-bang CDR with a tunable bandwidth. Each CTLE stage is modeled with a rational transfer function (poles/zeros) and a memoryless nonlinearity that both vary across numerous corners and DC-gain/peaking-gain settings. The M-tap DFE models the discrete tap limits and includes the setting-error impairment. The transmitter includes an FFE with realistic circuit limitations of tap granularity and range.
The streamlined flow addresses the usual problems of managing large amounts of characterization data across corners and control settings by utilizing a uniform data format. The CTLE transfer-function responses are processed with a vector-fitting algorithm that uses a common set of poles for each family of curves, which maintains the continuity expected between related responses. Each family of poles and zeros is then accepted by the SerDes Toolbox CTLE Simulink block, and the memoryless-nonlinearity response is similarly accepted by the Simulink SaturatingAmplifier block.
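A pole-residue (rational) fit of the kind produced by vector fitting can be evaluated directly to recover a frequency response. This is a minimal sketch with a made-up single-pole model, not the Intel characterization data:

```python
# Evaluate a pole-residue rational model H(s) = d + sum_k r_k / (s - p_k),
# the form produced by vector fitting, at a set of frequencies.
import math

def rational_response(freqs_hz, poles, residues, d=0.0):
    out = []
    for f in freqs_hz:
        s = complex(0.0, 2 * math.pi * f)  # evaluate on the jw axis
        out.append(d + sum(r / (s - p) for r, p in zip(residues, poles)))
    return out

# Single real pole at -2*pi*1GHz with matching residue: unity DC gain and
# -3 dB at 1 GHz, a trivial stand-in for one fitted CTLE stage.
wp = 2 * math.pi * 1e9
freqs = [0.0, 1e9, 10e9]
H = rational_response(freqs, poles=[-wp], residues=[wp])
for f, h in zip(freqs, H):
    db = 20 * math.log10(abs(h)) if abs(h) > 0 else float("-inf")
    print(f"{f/1e9:>5.1f} GHz: {db:7.2f} dB")
```

A real fitted CTLE stage would carry several complex-conjugate pole pairs per curve, with the same pole set shared across the whole family of gain settings, as described above.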
The IBIS-AMI model provides both impulse based processing and sample-by-sample analysis to enable BER analysis based on both AMI_Init and AMI_GetWave simulations. The model includes a multiphase adaptation process for both the CTLE and DFE where future work will include co-optimization with the Tx FFE.
The resulting architectural abstraction model runs 250 to 1000 times faster than the detailed structural model with comparable accuracy, enabling rapid system-level channel analysis. This template flow can now be leveraged for future designs and IBIS-AMI model creation, enabling a model-based design flow readily available to both SerDes architects and signal integrity engineers.
In high-speed link timing budgets, power supply noise induced jitter (PSIJ) needs to be accurately analyzed because it can predominantly affect the timing closure of digital blocks or the timing budget of high-speed interfaces. The accurate evaluation of the impact of PSIJ is more critical than ever because the supply voltage is shrinking, the unit interval is getting smaller for both digital core and analog blocks, and the complexity of mixed-signal clock/data paths is increasing. Previously, several PSIJ analysis methodologies have been developed in the time and frequency domains. By using the DC delay sensitivity of circuit blocks, frequency-dependent jitter transfer functions are calculated and verified with SPICE simulation for each digital or mixed-analog block in isolation. However, the PSIJ of an entire clock/data path, from PLL output to output buffer in transceivers, has not been verified over single or multiple supply domains due to the complexity of the analysis required to cover several digital and analog blocks. In this paper, we analyze the PSIJ of several key circuit blocks with transistor-level SPICE simulations and compare the results with the empirical PSIJ methodologies. The accuracy of the PSIJ methodology for the overall system is verified against the combined results of multiple circuit blocks analyzed separately. The interaction that must be captured between multiple blocks under multiple power domains is studied to remove the limitations of the traditional PSIJ methodology. To accurately analyze the entire clock and data path, all the transistor-level post-layout netlists of the digital blocks are extracted, and accurate power supply network models of self-impedance and trans-impedance are used to connect the multiple circuit blocks and buffers. In this work, we use an FPGA DDR4 PHY system with programmable digital logic as well as clock paths that can be used to verify the PSIJ analysis methodology.
The transmitter path consists of clock trees, phase interpolators, multiplexers, pre-drivers, and output buffers. The system-level power delivery network (PDN) is modeled with chip power models and package and board PDN scattering parameters. Digital core noise, modeled as a single-tone sinusoidal waveform, is used to characterize the frequency-dependent jitter transfer function. By sweeping the noise frequency and measuring the induced clock/data jitter, the jitter transfer function can be obtained and compared with the empirical PSIJ methodology. By varying the noise amplitude, the nonlinear effect of delay sensitivity is investigated to verify that the small-signal linear assumption in the empirical PSIJ methodologies remains valid. With transistor-based circuit simulation, it is straightforward to consider PSIJ due to self-generated (local) and coupled (global) noise effects in multiple power domains. In summary, with SPICE simulation of the clock and data paths, the accuracy of the empirical PSIJ analysis methodologies is examined for complete systems in multiple power domains. Since the empirical approaches intrinsically make several assumptions in the underlying equations, the methodology needs to be carefully applied. The paper clearly defines the limits of the empirical PSIJ methodologies, their usage cases, and the additional parameters required to make the methodology hold for wide applications.
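The single-tone characterization described above amounts to injecting a sinusoidal supply tone and reading out the induced jitter at each frequency; to first order, PSIJ at a frequency is the DC delay sensitivity multiplied by the supply noise seen through the PDN impedance. A simplified sketch with hypothetical sensitivity, noise, and PDN values (not numbers from the paper):

```python
# First-order PSIJ estimate: jitter(f) ~ delay_sensitivity * |Z_pdn(f)| * I_noise.
# All values below are illustrative.
import math

delay_sensitivity_ps_per_mv = 0.05   # hypothetical DC delay sensitivity of the path
i_noise_ma = 50.0                    # single-tone noise current amplitude

def z_pdn_mohm(f_hz):
    """Toy PDN self-impedance: series R and L with a bulk decoupling C."""
    r, l, c = 1e-3, 10e-12, 100e-6   # 1 mOhm, 10 pH, 100 uF
    w = 2 * math.pi * f_hz
    return abs(complex(r, w * l - 1 / (w * c))) * 1e3  # magnitude in mOhm

for f in [1e5, 1e6, 1e7, 1e8]:       # swept noise-tone frequencies
    v_noise_mv = z_pdn_mohm(f) * i_noise_ma / 1000.0   # mOhm * mA = uV -> mV
    jitter_ps = delay_sensitivity_ps_per_mv * v_noise_mv
    print(f"{f:8.0e} Hz: |Z| = {z_pdn_mohm(f):6.2f} mOhm, PSIJ ~ {jitter_ps:.3f} ps")
```

The jitter-versus-frequency curve traced out by this sweep is the jitter transfer function; the paper's point is that the small-signal sensitivity assumption behind it must be checked against transistor-level simulation.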
For high-speed I/O applications, as speeds reach 16 Gbps, 32 Gbps, and higher, the insertion loss requirements on the PCB become more and more critical. The maximum allowed routing length becomes shorter, and at the same time better, more costly PCB materials are required. Specifically, for a PCI Express two-connector topology, the maximum allowed board routing length is reduced from 16 inches in PCI Express 3.0 to 10 inches in 5.0, while the PCB material is upgraded from middle loss to ultra-low loss [1, 2]. Designers are eager to explore enablers that extend the solution space and minimize overall cost.
In this paper, an internal transmission-line simulation tool is applied, and its results correlate well with real PCB loss measurements. The tool embeds key copper, core, and prepreg parameters for 83 common industry PCB materials ranging from middle loss to ultra-low loss. A plot of loss versus material is shown for a specific 3-mil-core stackup geometry. Among the results, 15 of 40 middle loss materials, 5 of 17 low loss materials, and 19 of 26 ultra-low loss materials fail to meet the defined middle loss, low loss, and ultra-low loss specifications, respectively. This shows that the loss number is a function not only of the PCB material but also of other parameters such as stackup and geometry.
Three stackup- and geometry-based enablers are analyzed for stripline: copper thickness, core and prepreg thickness, and differential routing trace width and spacing. For the PCI Express 5.0 application, with enabler 1, loss impact studies are performed on 0.5 oz and 1.0 oz copper for a given stackup and material; around 0.1 dB per inch (10%) improvement is observed on an ultra-low loss material at 16 GHz. With enablers 2 and 3, loss improvement opportunities of 0.09-0.2 dB per inch (9-20%) are seen for different combinations of core/prepreg thickness and routing pitch, corresponding to 1.0 to 2.6 inches of length extension among all xx ultra-low loss materials.
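These per-inch loss improvements translate into routing length through simple budget arithmetic. The 16 dB PCB budget and 1.0 dB/inch baseline below are hypothetical numbers chosen only to show how a ~10% loss improvement becomes extra inches:

```python
def max_route_inches(budget_db, loss_db_per_inch):
    """Longest routing that fits a given PCB insertion-loss budget."""
    return budget_db / loss_db_per_inch

# Hypothetical: a 16 dB PCB loss budget at 16 GHz, with an enabler that
# improves trace loss from 1.0 to 0.9 dB/inch (a ~10% improvement, as
# reported for thicker copper on an ultra-low-loss material).
baseline = max_route_inches(16.0, 1.0)   # 16.0 inches
improved = max_route_inches(16.0, 0.9)   # ~17.8 inches
extension = improved - baseline          # ~1.8 inches of extra reach
```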
A case of actual board routing is studied for the PCI Express 5.0 application. PCB insertion loss based on a 3-mil core and 11.4-mil pitch is used as the baseline. The loss difference between a 3-mil and a 5-mil core is found to be 0.21 dB per inch at 16 GHz. Pitch-varied routing is considered to balance the loss and routing space of 336 differential pairs: 14-mil pitch routing is applied to the long pairs, which are around 20% of all pairs, while 8-mil pitch routing is applied to the remaining 80%. As a result, 1.0 inch of routing extension is achieved by moving from a 3-mil to a 4-mil core, and another 1.8 inches by moving from a 4-mil to a 5-mil core, with board thickness growing from 100 to 104 or 108 mils. Pair-to-pair spacing is also studied for different pitch values, and we finally achieve a minimized pair-to-pair spacing after increasing the core thickness while maintaining equal or lower far-end crosstalk.
The techniques used in this paper can easily be applied to other loss-sensitive high-speed designs.
The continuous increase in the performance and speed of computer systems has led to more frequent field failures of memory systems. As a result, accurate measurement and modeling are becoming of utmost importance for memory vendors in order to guarantee the electrical reliability of their products. However, conventional on-chip measurement methodologies, such as probing on the decoupling capacitor or using a test interposer between the chip pads and measurement points, have significant limitations. In this paper, we propose a novel method of measuring the Signal Integrity and Power Integrity (SI/PI) performance of DRAM operation by probing directly on top of the DRAM package. With this proposed test package, we prove the effectiveness of monitoring on-chip operation through simulation and measurement and thereby effectively enhance the level of SI/PI modeling and the predictability of memory systems.
Technologies such as 800G Ethernet (IEEE 802.3ck, OIF CEI-112G) are driving I/O interconnect technologies such as PCI Express to 64 GT/s in the sixth generation of that standard. Concurrent computing needs, which are elevating coprocessor components to the same computer-architectural hierarchy as the CPU, are being addressed in the Compute Express Link (CXL) and Cache Coherent Interconnect for Accelerators (CCIX) consortiums, reaching speeds of up to 32 GT/s and looking to go even faster. With this increase in digital transmission speed, the increase in throughput is accompanied by significant signal integrity challenges related to transmitter signal quality, connector crosstalk, receiver jitter sensitivity, and overall channel insertion loss around the Nyquist frequency at which each of these standards operates. In this session, we will bring you the latest information on what Keysight is doing to help develop standards like PCI Express 6.0 and similar standards in the area of physical-layer testing, including transmitter, receiver, and channel testing. In addition, we will describe the latest approach to achieving compliance, which uses the same software tools you use for device characterization but with data provided by simulation instead of physical measurement. The simulation mimics a real hardware test bench and emits the same waveforms that the oscilloscope app expects when testing in the lab. This allows you to verify the pre-manufacture simulated design against the actual post-manufacture prototype.
1. The insertion loss and impedance may deviate from the design target if the modeling relies only on spec-sheet data. Such deviation is disastrous, especially for 112 Gbps high-speed applications. Therefore, the actual Dk, Df, and copper foil roughness need to be obtained, and a more accurate method to extract PCB material characteristics is proposed.
2. Based on the delta-L method proposed by Intel, several algorithms are optimized. First, the de-embedding algorithm is improved based on the AFR algorithm; it accounts for the impedance fluctuation of the de-embedded trace and the mode-conversion effect during differential de-embedding. Then the RLGC model of the trace is optimized, with the geometry-fitted resistance equation refined according to simulation. In addition, the SMA breakout is simulated and optimized layouts are proposed.
3. Considering the interaction of many factors, this paper innovatively uses a multiple-local-iterative optimization algorithm to find the optimal variables; in this algorithm, Euclidean distance is used as the measure of similarity to find the final extracted material parameters.
4. Besides Dk, Df, and roughness, skew is also a critical factor in PCB characteristic evaluation, especially in future 112 Gbps systems with roughly 18 ps per UI. To reduce test time and cost, a skew probability evaluation method is proposed that considers multiple prepreg laminations, PCB rotation, trace position, etc.
5. A tolerance design method for key parameters is proposed to reduce the diversity of PCB products. In addition, the significant factors are determined by ranking. Finally, the goal of improving product consistency is achieved by tolerance redistribution.
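The Euclidean-distance matching in item 3 can be illustrated with a toy grid search over candidate (Dk, Df) pairs; the loss model and its coefficients below are invented placeholders, not the paper's fitted RLGC equations:

```python
import math

def il_model(freqs_ghz, dk, df):
    """Toy insertion-loss model (dB/inch): a conductor-like term
    ~ sqrt(f*Dk) plus a dielectric term ~ f*sqrt(Dk)*Df.
    Coefficients are illustrative only."""
    return [0.05 * math.sqrt(f * dk) + 2.3 * f * math.sqrt(dk) * df
            for f in freqs_ghz]

def euclidean(a, b):
    """Euclidean distance between two loss curves (the similarity
    measure named in item 3)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def extract_params(freqs_ghz, measured, dk_grid, df_grid):
    """Pick the (Dk, Df) whose modeled loss curve is closest to the
    measurement in the Euclidean sense."""
    return min(((dk, df) for dk in dk_grid for df in df_grid),
               key=lambda p: euclidean(il_model(freqs_ghz, *p), measured))

freqs = [1, 4, 8, 12, 16, 28]
measured = il_model(freqs, 3.4, 0.004)            # stand-in "measurement"
dk_grid = [3.0 + 0.1 * i for i in range(11)]      # 3.0 .. 4.0
df_grid = [0.002 + 0.001 * i for i in range(7)]   # 0.002 .. 0.008
best = extract_params(freqs, measured, dk_grid, df_grid)
```

The paper's multiple-local-iterative search refines this idea; the grid search here only shows the distance-based selection step.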
This paper focuses on radiated emission failure scenarios for boxed high-speed systems containing PCIe interfaces. First, the source of unwanted radiation from a hypothetical PCIe board is investigated to understand the interconnect and signaling design parameters that contribute to radiated emissions. Parameters such as rise time and timing skew in differential pairs, and their impact on the resultant common-mode signal, are explained.
It is noted that even when the best signal integrity practices are applied in board layout design, compliance failures sometimes occur at certain frequencies in the over-the-air radiated emission scan. Fabric-over-foam gaskets, fingerstock, or surface-mount shields are inserted in the system box where possible, and other shielding methods such as reducing the aperture size of the air vents are commonly employed, but they are not always effective. In this case, and at this late stage of product development, the remaining option of redesigning the system is very costly and needs to be avoided. Thus, a different solution is needed: one that is effective at the compliance-failure frequency without requiring any changes to the system board or component design. A Frequency Selective Surface (FSS) absorber that can be inserted in the system shield box, as proposed in this paper, provides such a solution.
This custom-designed absorber is a 2-D array of resistive patch elements in a two-layer PCB structure that acts as an impedance transformer, matching the nearly 0-ohm impedance of the metal shield to the 377-ohm wave impedance at the frequency of the radiated emission failure. The design methodology for this FSS absorber, which can be categorized as a metamaterial structure, is explained in this paper. Full-wave simulation results of the standalone FSS absorber show its surface impedance matching the wave impedance at 8 GHz. For a more practical study, and to understand the effectiveness of this absorber when placed inside a high-speed system box, a sample system board emulating the emissions of a PCIe Gen 3 interface is designed. This mock-up test board comprises a 3x3 array of patch antennas to create different scenarios of radiation sources on a high-speed system board. Radiated emission from the encased high-speed system board is observed with and without the FSS absorber in the shield box.
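The impedance-transformer idea can be illustrated with the simplest relative of an FSS absorber, a Salisbury screen: a 377 ohm/sq resistive sheet a quarter wavelength in front of the shield. This first-order model ignores the patterned-patch behavior of the actual custom FSS and is only a sketch of the matching principle:

```python
import math

C0 = 299_792_458.0   # speed of light, m/s
Z0 = 377.0           # free-space wave impedance, ohms

def reflection_db(freq_hz, sheet_ohms_sq, spacing_m):
    """Reflection from a resistive sheet spaced d in front of a metal
    shield. The shorted air gap presents j*Z0*tan(beta*d) in parallel
    with the sheet resistance; when that combination equals 377 ohms,
    the incident wave is absorbed instead of reflected."""
    beta = 2.0 * math.pi * freq_hz / C0
    z_gap = 1j * Z0 * math.tan(beta * spacing_m)
    z_in = (sheet_ohms_sq * z_gap) / (sheet_ohms_sq + z_gap)
    gamma = (z_in - Z0) / (z_in + Z0)
    return 20.0 * math.log10(abs(gamma))

# Quarter-wave spacing targeting an 8 GHz emission failure (~9.4 mm):
d = C0 / (4.0 * 8e9)
at_8ghz = reflection_db(8e9, 377.0, d)   # deep absorption notch
at_4ghz = reflection_db(4e9, 377.0, d)   # off-frequency: little help
```

The FSS patch array in the paper achieves the same matching condition in a much thinner PCB structure than this air-spaced quarter-wave model.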
For experimental evaluations, an in-house radiation measurement test setup is designed to measure the radiation pattern in cylindrical coordinates. Emission reduction is calculated and compared with a system box that has commercial microwave absorber lining. It is demonstrated that the FSS absorber reduces the emission by ~6 dB while the microwave absorber can only drop the emission by ~3 dB.
In our previous DesignCon 2019 paper, we demonstrated that by using principal component analysis (PCA), we can reduce SerDes tuning from a complex high-dimensional problem (~30-40 variables) to a small set of vectors (~3-4 dimensions) that covers 85%+ of the solution space. Using a genetic algorithm, we also showed that the PCA tuning vectors need significantly fewer steps to reach the optimum. However, no genetic algorithm is available for most SerDes in practice, and no universal tuning method exists.
In this paper we propose using polynomial chaos expansion (PCE) surrogate models created in the PCA space to represent SerDes performance under a known channel condition. The entire range of channel operating conditions can be modelled by this family of PCE surrogate models. Since they are in PCA space, they are relatively simple and low order (~3-4 PCA vectors with 3rd-4th order models). The objective function of the surrogate models is the final bit error rate (BER). Once the surrogate functions are created, we can precompute the optimal settings for the best BER performance under each channel condition and store them in a look-up table.
In a production environment, the actual channel condition is not known but is assumed to be bounded within the measured channel condition range. A few random points in the PCA space are used to evaluate the actual SerDes BER under those settings. The same settings are fed into the family of PCE surrogate models to find which model yields the BER closest to the measured BER. The channel condition corresponding to that surrogate model can then be identified, and its optimal settings retrieved from the look-up table. Those settings are loaded into the SerDes to confirm the desired BER has been achieved. If the resulting BER meets the target, the auto-tuning is considered successful; if it needs further improvement, the settings are used as seed values for auto-tuning with the SerDes tuning algorithm.
In summary, our methodology of PCE surrogate models in PCA space reduces the complex high-dimensional problem of high-speed channel identification and tuning to straightforward equations and look-up-table evaluations. It can be done with measurements alone; no simulation is needed. Besides the tremendous speed-up of tuning time, it also has the advantage of always converging to the global optimum, whereas existing channel optimization algorithms require fixed seed settings and can converge to a local suboptimal minimum rather than the global minimum.
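The identify-then-look-up flow can be sketched with invented one-dimensional surrogates standing in for the real 3rd-4th order PCE models; every coefficient here is hypothetical:

```python
def surrogate_factory(channel_loss_db):
    """Hypothetical low-order surrogate: predicted log10(BER) versus a
    single PCA tuning coordinate x. The optimum shifts with channel
    loss; all coefficients are invented for illustration."""
    opt = 0.1 * channel_loss_db
    return lambda x: -12.0 + 1.5 * (x - opt) ** 2 + 0.1 * (x - opt) ** 3

channels = [10.0, 20.0, 30.0]                 # candidate loss conditions, dB
surrogates = {c: surrogate_factory(c) for c in channels}
lut = {c: 0.1 * c for c in channels}          # precomputed optimal settings

def identify_channel(measure_log_ber, probes=(0.0, 1.5, 3.0)):
    """Evaluate a few probe settings on the live link, then pick the
    surrogate whose predictions sit closest (least squares) to the
    measured log10(BER) values."""
    meas = [measure_log_ber(x) for x in probes]
    def misfit(c):
        return sum((surrogates[c](x) - m) ** 2 for x, m in zip(probes, meas))
    return min(channels, key=misfit)

# The live link happens to be the 20 dB channel (unknown to the tuner):
actual_link = surrogate_factory(20.0)
best_channel = identify_channel(actual_link)
best_setting = lut[best_channel]
```

In practice the probe evaluations come from hardware BER measurements, and the matched surrogate's precomputed settings become the seed for any residual fine tuning.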
As data rates increase, the role of FEC in high-speed links has evolved from "optional" to "mandatory". It is no longer practical to rely on the SerDes' own capability for MR/LR, or even VSR, PAM4 applications with a target BER no worse than 1e-15.
From an FEC perspective, there exist multi-part links where the middle devices, such as retimers and optical modules, may not be able to afford integrating FEC to correct and terminate errors within each section, for power and latency reasons. Thus, mixed error mechanisms (random and burst errors) from each section collectively impact the final-stage FEC. This scenario is much more complicated than a single-part link system.
For end-to-end FEC performance evaluation of such a system, applying the rule of thumb based on a raw-BER "criterion" (for example 1e-6) from experience with single-part links no longer works reliably, especially when the FEC system gain (SG) differs considerably between link sections, where system-level impairments such as insertion loss (ISI related) and crosstalk (noise related) appear in different proportions. Link loss (and length) budgeting for such a multi-part link system, usually with different SerDes in each link section, becomes more critical and challenging. To the authors' knowledge, no analysis method for this scenario exists in the industry.
In this paper, we introduce an end-to-end PAM4 RS-FEC performance simulation algorithmic model for multi-part link applications, where FEC is only applied at the final stage device of the link. An engineering method for analyzing the end-to-end FEC performance in a multi-part link system is discussed. Our approach and its uniqueness are highlighted below.
• The algorithmic model supports single-burst-error calculation for each link section, based on all possible PAM4 symbol error syndromes arising from the DFE mechanism. The probabilities of different numbers of FEC symbol errors induced by these single burst errors are then taken into account.
• The probabilities of all FEC symbol error patterns from each link section are combined to form the FEC symbol error patterns that the final-stage FEC must deal with. This approach is fully compatible with FEC codeword interleaving and symbol distribution processing across multi-lane systems.
The proposed method differs substantially from the commonly practiced methods seen in the industry, in which the pre-FEC BER is the criterion for each link section and the end-to-end performance is measured by the final-stage pre-FEC BER. The proposed method takes into account the various FEC symbol error pattern combinations that occur with certain probabilities in a multi-part link, for accurate end-to-end FEC evaluation. The method can be used for LR with retimers, as well as 200GAUI-n/400GAUI-n-like scenarios where interleaved FEC codewords are distributed over multiple lanes. The ultimate goal is to provide a more reliable method for end-to-end FEC performance evaluation in a multi-part system, and to help system designers budget the channel for optimal link performance.
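The combining step in the second bullet can be illustrated by convolving per-section FEC symbol-error distributions, under the idealizing assumption that sections err independently; the distributions and the RS correction capability t below are made up for illustration:

```python
def convolve(p, q):
    """Combine two independent per-codeword symbol-error-count
    distributions (index = number of errored RS symbols)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def uncorrectable_prob(section_dists, t):
    """P(total FEC symbol errors per codeword > t), i.e. the chance the
    final-stage RS decoder cannot correct the codeword."""
    total = [1.0]
    for d in section_dists:
        total = convolve(total, d)
    return sum(total[t + 1:])

# Hypothetical per-codeword symbol-error distributions for three link
# sections of a multi-part link (FEC only at the final stage):
host  = [0.90, 0.08, 0.02]   # mostly clean, some DFE burst errors
cable = [0.85, 0.10, 0.05]
line  = [0.95, 0.05]
p_uncorrectable = uncorrectable_prob([host, cable, line], t=3)
```

The paper's model additionally maps DFE burst-error syndromes onto these symbol-error counts; the convolution above only shows how the sections' contributions are merged for the final-stage decoder.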
Cloud-based workloads are driving continued evolution in enterprise and hyperscale data centers. Critical to the successful scaling of computing and networking devices are high-speed SerDes interfaces providing high-bandwidth communications between ASICs and SoCs. This presentation will cover the SerDes solutions behind interfaces such as PCIe 5, CXL, and CCIX that are driving new levels of performance.
Performing link simulations is an essential step in designing and enabling high-speed serial links. In a high-speed serial link, the signal is generated, transmitted, and shaped by the transmitter driver circuitry, channel components, and receiver equalizers. The information is then recovered by slicer circuitry with defined thresholds and timing. Likewise, noise sources, such as reference-clock phase noise, power supply noise, circuit thermal noise, channel crosstalk, timing noise (i.e., jitter), and quantization noise, go through the same route and transforming mechanisms but are usually injected at different locations. In serial links below 25 Gbps with the NRZ modulation scheme, jitter is extensively studied and modelled while other noise components are mostly neglected, because NRZ links are usually constrained by the timing budget. When the serial link speed exceeds 50 Gbps with the PAM4 modulation scheme and moves toward ADC-based SerDes designs, the links are constrained in both the timing and amplitude budgets. Accurately modeling and simulating noise components becomes critical in assessing link margins.
This paper follows the topics and subjects presented in our previous DesignCon papers. We start with an overview of the various types of noise in serial links, covering the origins of the noise sources, both intrinsic and external/system, and their characteristics. We then investigate noise generation methodologies and techniques, including frequency-domain, waveform, and statistical approaches. With the generated noise, we study the processing techniques that apply as the noise travels through the link. Design experiments using real-world channels and device/noise characteristics will be studied and presented. We will also compare the noise modeling feasibility and level of accuracy in commonly used link simulation paradigms, including the COM and IBIS-AMI platforms. The outline of this paper is as follows:
1. Noise sources and characteristics in high-speed serial links
2. Noise modeling and generation techniques
3. Noise transformation and processing within serial links
4. Case studies on 112Gbps PAM4 serial link performance with noise sources
5. Comparative studies with COM and IBIS-AMI simulation flows
Finally, we will conclude the study by examining the findings and results at the link level and performing trade-off analysis on the link design.
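A common frequency-domain technique of the kind surveyed in outline item 2 is to synthesize a noise waveform from a target power spectral density using random-phase sinusoids. The sketch below is generic, not the paper's specific method, and the flat supply-noise PSD is hypothetical:

```python
import math, random

def synthesize_noise(psd_v2_per_hz, df_hz, duration_s, fs_hz, seed=1):
    """Frequency-domain noise generation: one random-phase sinusoid per
    PSD bin with amplitude sqrt(2*PSD*df), summed into time-domain
    samples. psd_v2_per_hz[k] is the one-sided PSD at (k+1)*df_hz."""
    rng = random.Random(seed)
    tones = [(math.sqrt(2.0 * s * df_hz), (k + 1) * df_hz,
              rng.uniform(0.0, 2.0 * math.pi))
             for k, s in enumerate(psd_v2_per_hz)]
    n = int(duration_s * fs_hz)
    return [sum(a * math.cos(2.0 * math.pi * f * i / fs_hz + ph)
                for a, f, ph in tones) for i in range(n)]

# Hypothetical flat supply-noise PSD: 1e-9 V^2/Hz over ten 1 MHz bins,
# i.e. 1e-2 V^2 total power, so the waveform should be ~0.1 V rms.
x = synthesize_noise([1e-9] * 10, 1e6, 1e-4, 100e6)
rms = math.sqrt(sum(v * v for v in x) / len(x))
```

Checking that the synthesized waveform's rms matches the integrated PSD power is the usual sanity check before injecting such noise into a link simulation.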
H. Wu, M. Shimanouchi, and M. Li, "Effective Link Equalizations using FIR, CTLE, FFE, DFE, and FEC for Serial Links at 112 Gbps and Beyond", DesignCon 2018, Santa Clara, CA.
 H. Wu, M. Shimanouchi, and M. Li, "COM & IBIS-AMI: How They Relate & Where They Diverge", DesignCon 2019, Santa Clara, CA.
 M. Li, Jitter, Noise, and Signal Integrity at High-Speed, Prentice Hall, ISBN 0132429616, 2007.
 M.P. Li and M. Shimanouchi, "New Hybrid Simulation Method for Jitter and BER in High-Speed Links", DesignCon 2011.
 H. Wu, M. Shimanouchi, and M. Li, "High-Speed Link Simulation Strategy for Meeting Ultra Long Data Pattern under Low BER Requirements", DesignCon 2014, Santa Clara, CA.
 H. Wu, M. Li, and M. Shimanouchi, "Effects of Device Characteristics in Multi-Level Signaling Links", DesignCon 2015, Santa Clara, CA.
A modular platform architecture is proposed to validate PCIe 5.0 IPs. The platform consists of a universal baseboard (UBB), a universal power card (UPC), a universal control card (UCC), and a personality card (PC). The PC is installed on top of the UBB using board-to-board connectors, through which power, clock, and control signals are delivered to the silicon on the PC from the UBB, UPC, and UCC. The UPC provides programmable power rails with different voltage levels, and the UCC, with an on-board FPGA, provides power-on sequencing to the UPC and UBB. The platform is reconfigurable and scalable for future high-speed I/Os.
Nowadays, AR/VR, 5G, IoT, and autonomous-car applications drive exponential growth in data communication traffic. Most of this traffic runs in hyper-scale data centers, creating huge demand for high-speed optical interconnect technologies with stringent requirements on bandwidth, power consumption, cost, and footprint. Silicon photonics technology, which has the potential to meet all these requirements and is considered a promising technology for 400G and beyond data center optical interconnects, has been a hot subject of research and development in recent years.
In this work, we investigate the challenges, trade-offs, and compromises in the design and optimization of a 400 Gbps silicon photonic integration package. Multi-die package design for silicon photonics chips is a big challenge in a high-speed system because silicon photonics chips usually carry tighter constraints than traditional electrical chips: they are more sensitive to thermal and mechanical changes, and the manufacturing process and its variations can have a significant impact on overall package performance. All these factors demand a high-quality electrical design of the package.
The electrical design flow for a multi-die integration package includes material selection, stack-up design, transmission-line structure and geometry design, via design, BGA geometry selection, and so on. The high-speed links contain copper pillar bumps, transmission lines, vias, and BGA balls. This complex 3D structure carries multiple parasitic inductances and capacitances, which cause signal integrity issues such as impedance discontinuity, crosstalk, and reflections. Based on the physical characteristics, an equivalent circuit model of the via is generated, with parasitic parameters extracted from 3D EM simulation tools. The equivalent circuit model can quickly estimate via performance and help in the optimization process. For example, decreasing the equivalent capacitance helps impedance matching; this behavior indicates that increasing the anti-pad size results in better impedance matching. Other optimization considerations are also presented in the paper.
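The anti-pad argument follows from the lumped estimate Z ≈ sqrt(L/C): shrinking the via's parasitic capacitance raises its effective impedance toward the 50 ohm target. The L and C values below are illustrative, not extracted from the paper's 3D EM model:

```python
import math

def via_impedance_ohms(l_nh, c_pf):
    """Lumped estimate of via characteristic impedance, Z ~ sqrt(L/C).
    In practice L and C come from 3D EM extraction of the via barrel,
    pads, and anti-pads; these units are nH and pF."""
    return math.sqrt((l_nh * 1e-9) / (c_pf * 1e-12))

# Enlarging the anti-pad reduces the via's parasitic capacitance,
# pulling a capacitive (low-Z) via back toward a 50-ohm target:
small_antipad = via_impedance_ohms(0.5, 0.40)   # ~35 ohms, too capacitive
large_antipad = via_impedance_ohms(0.5, 0.20)   # ~50 ohms, matched
```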
The optimized design greatly improves signal integrity performance in impedance continuity, insertion loss, and reflections. For impedance matching, the initial design had an impedance mismatch rate of over 50%, while the final optimized result reaches a mismatch rate of 3%, well within the 10% design target. These results have been analyzed in both the ADS SI/PI Pro EM simulation environment and the ANSYS HFSS environment. Correlation between simulated and measured data has also been studied in this work.
This paper presents an efficient signal integrity design and optimization flow for a multi-die integration package and investigates the design challenges, trade-offs, and compromise considerations. The design and optimization considerations not only shorten the design cycle but also help with overall system performance evaluation and debugging. Moreover, this work can serve as a reference for future 800 Gbps silicon-photonics-based optical transceiver packages and on-board optics designs.
Continuous adaptive equalization at the receiver is essential in high-speed serial links to compensate for passive-channel characteristic changes due to environmental impact, such as temperature, and for silicon-internal variations due to process, voltage, and temperature (PVT).
For applications where the temperature span is large and the link must remain fully operational without restart or re-adaptation, continuous adaptive equalization that keeps the BER within limits across the whole temperature span is a must.
Traditionally, Rx equalization combines a continuous-time linear equalizer (CTLE) with decision feedback equalization (DFE). While the DFE is intrinsically capable of continuous adaptation, the CTLE is inflexible: it usually cannot change its characteristics after initial adaptation.
The insertion loss (IL) change over temperature of a mainstream PCB material for a 25G-NRZ/50G-PAM4 system is approximately 0.048 dB/°C per meter at 12.89 GHz (dielectric loss plus conductor loss). If a serial link built with this mainstream material is routed as long as possible to fill the 35 dB total channel IL budget defined in IEEE 25GBASE-KR, the IL change over temperature is as large as 6 dB across the -40°C to 90°C span.
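The quoted figure is easy to check: a channel of roughly one meter that fills the 35 dB budget sees about 0.048 x 130 ≈ 6 dB of loss change over the 130 °C span. As a one-line sanity check (the 1 m length is an assumption consistent with the abstract's numbers):

```python
def il_change_db(coeff_db_per_degc_per_m, delta_t_c, length_m):
    """Insertion-loss change caused by a temperature excursion."""
    return coeff_db_per_degc_per_m * delta_t_c * length_m

# 0.048 dB/degC/m at 12.89 GHz, -40..90 degC span, ~1 m channel that
# fills the 35 dB 25GBASE-KR budget:
delta_il = il_change_db(0.048, 90 - (-40), 1.0)   # ~6.2 dB
```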
Besides handling IL-induced intersymbol interference (ISI), the DFE is used to remove ISI caused by reflections, which appears as ripples on the post-cursors. The DFE must also compensate for CTLE characteristic variations caused by PVT. Furthermore, in high-crosstalk channels, the DFE rather than the CTLE is deployed to handle IL-induced ISI at the initial adaptation stage in order to limit noise amplification by the CTLE. The DFE budget reserved for compensating channel characteristic variations is thereby reduced.
A state-of-the-art DFE can handle a 3-4 dB dynamic channel loss increase in the presence of CTLE characteristic variation over temperature and voltage, which is not enough for applications with a large temperature span. When the system boots up, initial adaptation optimizes the CTLE and DFE for the current operating condition; that initially optimal equalization parameter combination becomes sub-optimal if continuous adaptation cannot track the operating-condition change over temperature. Thus, it is essential that the CTLE be in the continuous-adaptation loop to ease the burden on the DFE.
This paper proposes a novel algorithm that utilizes DFE tap values to enable dynamic CTLE parameter adjustment in response to changing operating conditions, for cases where the CTLE is not part of continuous adaptation due to the complexity of least-mean-square (LMS) based methods or the inaccuracy of maximum-eye-opening methods. The method is independent of the CTLE structure. For brevity, the baseline CTLEs and DFEs defined in IEEE 25GBASE-KR and IEEE 50GBASE-KR/OIF CEI-56G-LR are employed to demonstrate the implementation of the algorithm.
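One plausible form of such a tap-monitoring loop (not the paper's actual algorithm; the thresholds and code range are invented) watches the first DFE post-cursor tap and nudges the CTLE peaking code with hysteresis:

```python
def adjust_ctle(ctle_code, dfe_h1, upper=0.12, lower=0.04,
                code_min=0, code_max=15):
    """One step of a hypothetical tap-monitoring loop: when the first
    DFE post-cursor tap h1 grows (channel loss increased, e.g. at
    higher temperature), raise the CTLE peaking code; when it shrinks,
    lower it. The dead band between the thresholds prevents dithering."""
    if dfe_h1 > upper and ctle_code < code_max:
        return ctle_code + 1
    if dfe_h1 < lower and ctle_code > code_min:
        return ctle_code - 1
    return ctle_code

# Example trajectory as the board heats up and h1 drifts upward:
codes = [5]
for h1 in (0.06, 0.10, 0.13, 0.14, 0.08):
    codes.append(adjust_ctle(codes[-1], h1))
```

Because only tap magnitudes are observed, a loop of this shape stays independent of the CTLE's internal structure, which is the property the abstract emphasizes.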
The mechanism of passive-channel characteristic change over temperature, and its impact on amplitude and phase distortion, are presented with theoretical analysis, simulation, and lab measurement. The consequences and system-margin penalty of sub-optimal equalization parameters resulting from the inability to track and compensate channel characteristic change are shown as well.
GaN has pushed our oscilloscope voltage measurements past their limits. Not because of the oscilloscope, but because of probe transfer function limitations and the lack of de-embedding. An additional constraint is that most probes are designed and calibrated using a 50 Ohm signal generator. Unfortunately, I haven't met any 50 Ohm power supplies. Clearly a better method of measuring and calibrating oscilloscope probes, at GaN speeds, is needed.
In this session I'll provide two methods of accurately measuring oscilloscope probe response, one in time domain and one in frequency domain, using the oscilloscope as a high frequency Vector Network Analyzer (VNA).
For accurate NAND Flash memory system-level analysis, it is extremely important to understand the NAND AC timing parameters: how they are defined, how the specifications are determined, and how to implement them accurately in SI analysis.
The specification values for the NAND AC timing parameters are determined through silicon characterization. The hardware used in the characterization environment differs significantly from the hardware in a NAND product-level environment. Due to aggressive storage capacity requirements, multiple NAND Flash dies are often stacked in a highly integrated, complex package system. NAND Flash memory channel properties vary depending on the type of storage system (embedded or solid-state-drive application). Because of the variations across product-level system environments, it is not possible for one representative characterization system to replicate all actual systems.
Because of the complexity of accessing the NAND pads of bare silicon in packaged form, NAND AC timing parameter specifications cannot be measured or defined at the NAND die pads; they are defined at the NAND BGA package output. Additionally, characterization is performed with a 50 Ω termination. In actual use, output termination is enabled or disabled depending on the product requirement.
Power supply noise is one of the major sources of timing jitter in high-speed memory interfaces. It contributes jitter directly to the critical timing sources of the system, and it affects the timing of the internal logic, the clock distribution, and the output driver circuitry.
Because NAND Flash memory AC timing specifications are defined with specific characterization hardware, accurate NAND SI analysis requires that the impact of the characterization system PDN on the AC timing parameters be well understood and taken into account. This emphasizes the need to decouple any contributing factors introduced by the characterization system from the actual AC timing parameter specification values.
Simulation results show that without the characterization system PDN there is no jitter in the clock duty cycle (tQSH/tQSL), while the system PDN contributes around 4% jitter in tQSH/tQSL. The specification values of tQSH/tQSL include both the average duty cycle distortion (DCD) and the random jitter contributed by the PDN. In current SI simulation, the duty cycle of the input signal is directly derated by the tQSH/tQSL values. The PDN of the actual product environment will introduce additional jitter, so if the jitter portion of the tQSH/tQSL specification is not excluded from the SI input, the PDN impact on jitter will be counted twice.
The PDN also contributes to IO bit skew (tDQSQ).
This paper identifies a potential gap in NAND SI simulation due to characterization-system PDN-induced jitter in clock (DQS/BDQS) pulse width and data bit skew. Because the impact of input signal DCD is non-linear and increases exponentially at higher operating speeds, obtaining an accurate system-level performance prediction depends on derating the input DCD accordingly. The jitter component in tQSH/tQSL due to the characterization system PDN and the IO model contribution in tQH DCD must be subtracted from the input DCD calculation.
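The double-counting fix amounts to subtracting the characterization-PDN jitter component from the specification value before using it to derate the simulation input. With the ~4% figure above and a hypothetical 20 ps spec DCD (both the spec value and the split are illustrative):

```python
def derated_input_dcd_ps(spec_dcd_ps, char_pdn_jitter_ps):
    """Remove the characterization-system PDN jitter from the spec DCD
    before derating the SI simulation input, so PDN jitter is not
    counted twice: once inside the spec value and once again from the
    product-environment PDN model."""
    return spec_dcd_ps - char_pdn_jitter_ps

# Hypothetical 20 ps spec DCD, of which ~4% came from the
# characterization-system PDN rather than the die itself:
dcd_for_simulation = derated_input_dcd_ps(20.0, 0.04 * 20.0)   # 19.2 ps
```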
Details will emerge from discussions with the proposed panelists.
Proposed Panelists: Scott McMorrow, Teraspeed/Samtec; Scott Hinaga or Jayaprakash, Cisco; Up to 4 Laminate Manufacturers (Details TBD); Don DeGroot, CCN
Moderator: Bill Hargin, Z-zero
The objective of this panel session is to discuss the need for glass-reinforced or PTFE dielectrics that can support the needs of 28, 56, 112, or 128 Gbps, along with developing a system for winnowing the list of laminate possibilities from different laminate vendors—or within the same vendor once you've chosen a laminate.
In this panel session, we'll nail down the answers to this quintessential design concern once and for all – or entertain you while some of the experts slug it out!
We will also discuss trends in laminate characterization practices for high-speed signaling.
Potential laminate manufacturers:
Alan Cochrane, TUC
Doug Leys, AGC-Nelco
Tony Senese, Panasonic
Sean Mirshafiei, Isola
Alun Morgan, Ventec
John Coonrod or Allen Horn, Rogers
As network and data center operators grapple with the challenges of increased traffic growth and rising customer expectations, one of the key requirements is a higher-speed interoperable electrical interface that is standardized across the industry. Electrical interfaces at 112 Gbps are a critical enabler of faster, more efficient, and more cost-effective networks and data centers. Optical modules are already leveraging 100 Gbps lambdas and will benefit from migration to 112 Gbps electrical interfaces. A panel of OIF contributors will discuss the five ongoing CEI-112G electrical interface development projects and the new architectures they will enable, including chiplet packaging, co-packaged optics, and internal-cable-based solutions. The panel will provide an update on the status of the five interfaces being defined by the OIF, namely CEI-112G MCM, XSR, VSR, MR, and LR, for 112 Gbps applications of die-to-die, chip-to-module, chip-to-chip, and long reach over backplanes and cables. Listen to thought leaders in the electrical interface industry debate the issues and challenges surrounding the CEI-112G projects and the architectures they will enable.
JAE’s high-performance Flat Flexible Cable (FFC) DZ20 series interconnect solution is compatible with speeds up to 32 Gbps while providing advantages over standard shielded-wire solutions in both cost and mechanical routing. While transmitting a wide range of high-speed signals such as PCIe, SAS, SATA, USB, and others, this dual-row connector maintains a compact size. The ultra-thin FFC cable allows for less airflow blockage as well as easier bending and routing options not previously possible. Additionally, the simplified FFC manufacturing and assembly process provides significant cost savings compared to standard twin-ax cable assemblies.
Everyone has used a TDR. The results are simple to analyze when it's a simple transmission line built with an adjacent return plane. But what if the return plane is not adjacent?
The test structure we will explore is a simple 4-layer board with a microstrip on top and three planes underneath. What happens to the TDR response, and to the impedance we measure for the signal line, when the signal is launched between the signal line and the top plane, or between the signal line and the middle or bottom plane? Is the plane just a reference? Does it matter between which planes the TDR is connected?
If you want to test your understanding, come to this speed training event, watch the live demo, and test yourself against the real world. All will be made clear in this performance.
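As a refresher on the first-order relations the demo builds on, here is a minimal sketch, assuming an ideal step and purely resistive discontinuities (the helper names are ours, not part of the session material):

```python
# First-order TDR relations: a step launched into a Z0 system reflects off a
# resistive discontinuity ZL with coefficient rho = (ZL - Z0) / (ZL + Z0);
# inverting rho recovers the impedance the TDR "sees".

def reflection_coefficient(z_load: float, z0: float = 50.0) -> float:
    return (z_load - z0) / (z_load + z0)

def impedance_from_rho(rho: float, z0: float = 50.0) -> float:
    return z0 * (1 + rho) / (1 - rho)

for zl in (25.0, 50.0, 75.0):
    rho = reflection_coefficient(zl)
    print(f"ZL = {zl:5.1f} ohm -> rho = {rho:+.3f} -> "
          f"recovered Z = {impedance_from_rho(rho):5.1f} ohm")
```

The session's point is precisely that when the return plane is not adjacent, the effective ZL the step sees is no longer a simple lumped value, so this textbook inversion stops telling the whole story.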
The IEEE 802.3ck Task Force is developing physical layer (PHY) specifications for operating speeds of 100 Gb/s, 200 Gb/s, and 400 Gb/s based on 100 Gb/s signaling per electrical lane. The IEEE 802.3 PHY specifications cover the transmission medium as well as the mechanical and electrical interfaces to it. The Task Force objectives include supporting operation over electrical backplanes with an insertion loss ≤28 dB at 26.56 GHz, and supporting operation over twin-axial copper cables with lengths up to at least 2 m, for single-lane (100 Gb/s), two-lane (200 Gb/s), and four-lane (400 Gb/s) topologies.
Forward error correction (FEC) codes have become an integral part of many high-speed wireline links at data rates above 25 Gb/s. Depending on the equalization techniques used in wireline links, the same pre-FEC BER may result in different post-FEC BERs. Decision feedback equalizer (DFE) error propagation and other noise sources, such as inter-symbol interference (ISI), crosstalk, and jitter, can also significantly impact the accuracy of post-FEC BER analysis. Ideally, one may perform a transient simulation to fully capture the characteristics of all noise sources. However, the targeted <10^-15 BERs make time-domain simulations prohibitively long, especially for exploring architectural design alternatives. Therefore, an efficient statistical model that accurately predicts very low post-FEC BERs is an essential part of the design of high-speed wireline links.
This paper presents a statistical modeling approach to accurately estimate post-FEC BER for high-speed wireline links using standard linear block codes, such as the RS(544,514,15) KP4 and RS(528,514,7) KR4 codes. A hierarchical approach is adopted to analyze the propagation of PAM-symbol and FEC-symbol errors through a two-layer Markov model. A series of techniques, including state aggregation, time aggregation, state reduction, and dynamic programming, is introduced, making the time complexity of computing post-FEC BERs below 10^-15 tractable. Error bounds associated with each method are derived. The efficiency of the proposed model allows it to handle a larger state space, more DFE taps, and more sophisticated linear block codes than prior work.
In this paper, we turn our proposed BER estimation method into a set of tools to assist in making architectural choices for wireline transceivers, such as co-design of the equalization and FEC in the presence of DFE error propagation and various noise sources. First, the impact on FEC coding gain is investigated using various coding schemes, including bit multiplexing, MOD4 precoding, and interleaved FEC codes. Behavioral time-domain simulation results are reported alongside the statistical results to verify the accuracy of the model. Second, the impact on burst errors and FEC failures is analyzed by considering various noise sources, including residual ISI, crosstalk, and transmitter and receiver jitter. Specifically, the proposed model is used to demonstrate how negative residual ISI cursors can significantly impact DFE error bursts. In addition, a novel statistical ISI analysis method is presented to incorporate transmitter and receiver jitter into the post-FEC BER estimation. The approach can accurately estimate the data-dependent ISI distribution through jittered half-UI pulses derived from the standard unit-pulse response. This procedure allows efficient computation of the ISI probability density function in the presence of arbitrary uncorrelated jitter distributions. Lastly, a systematic methodology for post-FEC BER estimation based on a measured unit-pulse response is demonstrated using a 4-PAM 60 Gb/s wireline transceiver fabricated in 7nm FinFET technology. The methodology can accurately predict very low post-FEC BERs that are difficult to measure in real time (<10^-12). In the paper, we present a set of measured results showing the accuracy of the proposed statistical model across various channels, noise conditions, and equalizer settings.
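For a feel of the numbers involved, a textbook baseline for the KP4/KR4 codes can be sketched under a naive i.i.d. symbol-error assumption, which is exactly the assumption the paper's Markov model replaces, since DFE error propagation makes symbol errors bursty rather than independent:

```python
from math import comb

def rs_codeword_error_rate(p_sym: float, n: int = 544, k: int = 514) -> float:
    """Probability an RS(n, k) codeword is uncorrectable, assuming i.i.d.
    symbol errors with probability p_sym; the code corrects up to
    t = (n - k) // 2 symbol errors, so failure means more than t errors.
    Summing the tail directly avoids catastrophic cancellation at tiny rates."""
    t = (n - k) // 2
    return sum(comb(n, i) * p_sym**i * (1 - p_sym)**(n - i)
               for i in range(t + 1, n + 1))

# KP4 = RS(544, 514) corrects t = 15 symbols; KR4 = RS(528, 514) corrects t = 7.
for p in (1e-3, 1e-4):
    print(f"p_sym = {p:.0e}: KP4 -> {rs_codeword_error_rate(p):.3e}, "
          f"KR4 -> {rs_codeword_error_rate(p, n=528):.3e}")
```

Under correlated (bursty) errors the true post-FEC rate can be far worse than this independent-error bound suggests, which is the motivation for the two-layer Markov analysis.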
When the data rate goes beyond 112 Gbps, FR4-based PCB transmission systems hit their performance limit. The industry needs new techniques to continue the success story of FR4 (here FR4 refers to any glass-reinforced epoxy laminate material, including ultra-low-loss materials such as Megtron 7, DS-7409D, etc.), as PAM4 and FEC once did. Chord signaling is an innovative coding scheme that is highly promising for extending the life of FR4. Currently, two Chord signaling systems have been accepted by OIF CEI and JEDEC: ENRZ (Ensemble NRZ) and CNRZ (Chord NRZ), targeting long-reach and short-reach channels, respectively. To date, little published information gives a detailed characterization of Chord signaling systems. As the first of a series of papers, this paper focuses on the impact of return loss (RL) on the performance of ENRZ, compared against that of NRZ and PAM4. An interesting phenomenon observed in practical high-speed systems is described: a channel with a much lower overall RL level, yet with some resonant behavior in the frequency range below 10 GHz, surprisingly produces a worse eye than another channel with a much higher overall RL level but little resonant behavior. This phenomenon falls outside current specification methodologies, since most high-speed interface standards specify only the overall RL level as the pass/fail criterion. However, the phenomenon does align with COM results: the first channel also yields a worse COM value than the second. The paper investigates the underlying causes of this phenomenon, then goes further to reveal why the COM result reflects the resonant nature of the RL.
To assist an efficient analysis, two relatively new metrics, IMR (Integrated Multiple Reflections) and IRL (Integrated Return Loss), introduced in the USB Type-C v1 standard in 2014, together with a metric devised by the authors called Integrated Return Loss Deviation, are applied, and their respective effectiveness at revealing channel RL issues is analyzed and compared. A quick analysis procedure that does not require computing COM is proposed. The proposed analysis flow is then applied to two other critical RL-related aspects, namely the tolerance to non-uniform RL on each wire of the Chord signaling bundle, and the tolerance to common-mode RL, covering both the overall RL level and the deviated RL levels on individual wires within a Chord bundle. Some interesting and useful observations are presented and analyzed. Finally, a set of guidelines for designing an optimal Chord signaling channel is given.
PAM4 signaling is used in high-speed 26 GBaud and 53 GBaud electrical and optical communication systems. For design validation and production, high-bandwidth equivalent-time sampling oscilloscopes and real-time sampling oscilloscopes are used to analyze and measure optical signals. Correlation is required between design simulation, lab measurement, and the actual device performance, and the differences between instruments need to be considered. Real-time oscilloscopes have real-time sample rates high enough to prevent aliasing according to the sampling theorem; for example, a 70 GHz analog-bandwidth real-time oscilloscope has a real-time sample rate of 200 GS/s. Equivalent-time sampling oscilloscopes have a lower instrument noise floor, and hence higher sensitivity, making them capable of measuring lower-power electrical and optical signals. However, they have much lower real-time sample rates than real-time oscilloscopes; for example, a 70 GHz analog-bandwidth equivalent-time sampling oscilloscope has a real-time sample rate of up to 300 kS/s. Even though an equivalent-time oscilloscope can reconstruct a repeating signal's data pattern, it cannot completely prevent aliasing because of the insufficient real-time sample rate. When some of the frequency components in a signal are aliased to different frequencies, the downstream digital signal processing and measurement results are affected. It is common practice to apply digital signal processing to compensate the sampling oscilloscope's channel response to comply with a desired response, such as a fourth-order Bessel-Thomson filter for optical signal measurements. Another example of such digital signal processing is equalizers, such as an FFE, which are simulated in the oscilloscope software to mimic the equalizers in receivers. In this paper, the equivalent-time sampling scheme is reviewed and the aliasing effect is analyzed in theory.
A numerical example is given in which an optical system with a pilot tone is investigated. It is demonstrated that a low-frequency pilot tone around 1 kHz can be aliased to multiple GHz on equivalent-time sampling scopes, and that it can be aliased to different frequencies under different oscilloscope horizontal settings, such as samples per unit interval. The pilot tone is not aliased on real-time scopes and is mostly attenuated by applying a high-pass filter or by reducing the acquisition time window; when the acquisition time window is reduced on a real-time scope, the pilot tone mainly affects only the waveform offset. A method is described that removes the unwanted aliased components on equivalent-time oscilloscopes, and the improved correlation of the TDECQ measurement between the equivalent-time oscilloscope and the real-time oscilloscope is demonstrated in the numerical example.
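A deliberately simplified model of sequential equivalent-time acquisition illustrates the frequency mapping described above: consecutive samples are taken roughly one real-time sample interval apart but are placed one effective (equivalent-time) interval apart in the record, so frequencies get rescaled and then fold into the record's Nyquist zone. The sample rates below are illustrative assumptions, not the settings used in the paper:

```python
def apparent_frequency(f_in: float, rt_rate: float, eq_rate: float) -> float:
    """Where a real tone at f_in lands in the reconstructed equivalent-time
    record: successive samples are ~1/rt_rate apart in real time but are
    placed 1/eq_rate apart, so frequencies scale by eq_rate/rt_rate and then
    fold into the first Nyquist zone of the equivalent-time grid."""
    f_scaled = (f_in * eq_rate / rt_rate) % eq_rate
    return min(f_scaled, eq_rate - f_scaled)

# Assumed illustrative settings: 300 kS/s real-time rate, 1 TS/s effective
# equivalent-time rate (1 ps effective sample spacing).
f_app = apparent_frequency(1e3, rt_rate=300e3, eq_rate=1e12)
print(f"1 kHz pilot tone appears near {f_app / 1e9:.2f} GHz")
```

Changing the effective spacing (e.g. samples per unit interval) changes the scale factor, which is consistent with the observation that the aliased frequency moves with the horizontal settings.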
The paper will cover the basics of optical transceiver design. Starting from the basic block diagram, the paper will cover the device selection criteria for the CDR, TIA, and driver. Specifications of the EML, SiPh modulator, and PIN will be discussed. Taking all specifications into account, an entire end-to-end electro-optic link simulation will be presented. Host and module compliance board specifications will be discussed, and stack-up selection and high-speed routing criteria will be presented. The performance matrix for TDECQ, ER, and SRS (Stressed RX Sensitivity) will be discussed from simulation and measurement perspectives.
The 5th generation of mobile communication will give rise to opportunities for higher-QoS, high-bandwidth applications. It will add new low-latency applications and services as well as massive connections of industrial and commercial devices. As a consequence, a new, higher-performance computing and wireline network infrastructure is needed to enable 5G wireless communication and the associated services. Research, development, and manufacturing need to be done on a tight timeline and within a competitive cost envelope without compromising on quality. This presentation will cover the state of the industry and future trends all the way from high-speed computing and 400G/800G data center networking to terabit metro/long-distance networks and the 5G wireline fronthaul infrastructure.
With the advent of the fourth industrial revolution, built on artificial intelligence, cloud computing, and big data, terabyte-per-second (TB/s) bandwidth is needed to support these technologies. Therefore, the data rate of DRAM and the number of parallel buffers are increasing to support TB/s bandwidth. To achieve TB/s bandwidth, maintaining signal integrity in the high-speed channel is mandatory. In high-speed channels, not only channel parameters but also non-linear power/ground noise associated with SSO, DVI, and burst affects signal integrity. These noises are dominated by the occurrence probabilities of the SSO buffer combinations, the anti-resonance peak of the PDN, long data patterns which can be affected by the PDN anti-resonance peak, and data coding. However, conventional transient simulators and eye-diagram estimation methods fail to accurately estimate the eye diagram considering these noise effects. Recent trends also require signal/power integrity analysis at extremely low bit error rate (BER) levels. Therefore, exploring the impacts of these noise factors on signal integrity is becoming more crucial. Also, high-bandwidth memory (HBM), which is used for AI applications and data center servers, consists of 1024 buffers sharing the same hierarchical PDN, which is vulnerable to these noise factors. In this manuscript, we propose a fast and accurate statistical method which considers non-linear power/ground noise associated with SSO, DVI data coding, and burst noise generated by long data patterns affected by the PDN anti-resonance. The proposed method is applied to signal/power integrity analysis of an HBM interposer channel and estimates the impacts of these noise factors on the HBM channel. Based on the proposed method, the hierarchical PDN design is optimized.
First, we propose and validate a statistical method which estimates the statistical eye diagram of a high-speed digital channel considering power/ground noise generated by SSO, DVI coding, and burst noise. We derive four different output responses: pull-up/down and steady states. These step responses are affected by the aggressor SSO combination and DVI coding. We can derive these responses directly using transient simulators, or from the proposed analytical model, which can be written as simultaneous differential equations. We also propose a formula which derives the occurrence probability of each response as a function of the aggressor buffer states, considering DVI coding. By mapping occurrence probabilities to the output response sets, we can derive the PDF of each response set. By defining main-cursor/ISI PDFs and taking the convolution between the main-cursor PDF and the ISI PDF, which affect each other, we can derive the statistical eye diagram considering the impacts of SSO noise and DVI coding. Lastly, for burst noise associated with long data patterns and the anti-resonance peak of the PDN, we first define the data pattern length that corresponds to the anti-resonance frequency, then generate additional responses similar to the MER method and calculate the occurrence probability. The proposed method is validated using HSPICE and IBIS models.
Using the proposed method, signal/power integrity analysis of the HBM channel is conducted. An actual hierarchical PDN is considered based on previous works. Statistical eye diagrams are estimated while varying the PDN design and bit-pattern length. The hierarchical PDN design is then optimized with proper decoupling capacitors in the hierarchical PDN to minimize the large noise associated with bursts.
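As background, the core convolution step of generic statistical eye analysis (not the authors' full SSO/DVI/burst-aware method) can be sketched for NRZ ISI cursors; the cursor values below are made up for illustration:

```python
import numpy as np

def isi_pdf(cursors, grid):
    """Amplitude PDF of NRZ ISI: each cursor c contributes +c or -c with equal
    probability; independent cursors combine by convolving their PDFs."""
    dx = grid[1] - grid[0]
    pdf = np.zeros(len(grid))
    pdf[np.argmin(np.abs(grid))] = 1.0          # start with a delta at 0 V
    for c in cursors:
        tap = np.zeros(len(grid))
        for level in (-1.0, 1.0):               # NRZ symbol levels
            tap[np.argmin(np.abs(grid - c * level))] += 0.5
        pdf = np.convolve(pdf, tap, mode="same")
    return pdf / (pdf.sum() * dx)               # normalize to a density

grid = np.linspace(-0.5, 0.5, 2001)             # amplitude axis, 0.5 mV bins
pdf = isi_pdf([0.10, -0.05, 0.02], grid)        # made-up ISI cursor values
dx = grid[1] - grid[0]
print(f"integral = {pdf.sum() * dx:.3f}, mean = {(grid * pdf).sum() * dx:+.4f}")
```

The paper's contribution layers onto this skeleton the response sets and occurrence probabilities driven by SSO combinations, DVI coding, and PDN-resonance-driven bursts, which generic ISI-only analysis cannot capture.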
As the demand for system bandwidth continues to grow, industry standards bodies have begun to develop specific 112G serial electrical interconnect specifications. Even as system design for 112G faces bigger challenges than 56G, there is always a need to evaluate the scalability of the system and the technical feasibility of the 'next generation', that is, serial link systems with speeds exceeding 112 Gbps.
Driven by the requirement from both system vendors and end users to not 'have to change anything' for each new generation of technology, and by the continuity requirement for various forms of chassis/box systems, PAM-based modulation schemes (PAM4, 8, 16, and DSQ) are analyzed for beyond-112G copper transmission over PCB backplane, orthogonal backplane, and cable-based backplane architectures. The Salz SNR method is used to analyze the 'optimal' modulation scheme and its performance upper limits, embodied in an ICN-curve style of 'mask'.
The engineering penalty considered in the upper-limit analysis is derived by correlating the Salz SNR results with the 802.3ck critical channels that just pass the 3 dB COM threshold. End-to-end channel models for the performance upper-limit analysis are built from state-of-the-art component technologies.
In the above analysis, component (PCB material, connector, cable, package) technologies from today and from 3-5 years out are studied. The system forms are grouped into two categories: smaller chassis and box systems, and traditional large chassis. One problem raised here is that the package loss of big SoCs is found to weigh too heavily, even in cable-based links with improved PCB features in smaller chassis and boxes, and becomes one major factor limiting the system forms.
Next, a top-down design specification decomposition and allocation method for each individual link component is introduced. This method aims at the early system evaluation phase and is well suited for PAM-based system analysis. Unlike the commonly practiced bottom-up approaches for detailed performance verification, it is expected to be efficient for defining individual component design specifications, including IL and ICN, which matter the most in first-round system evaluations. A case study shows how the method is used to derive link component design specifications for the two categories of systems from the end-to-end system performance 'mask' analyzed above.
Two sets of component design specifications for each key media component (connector, cable, PCB material, copper roughness, package) are derived for the two categories of systems, respectively. These preliminary component specifications are provided as an industry reference for further exploration.
The authors acknowledge that some of the assumptions in this study may not hold in the future, and that there is great uncertainty as to whether component manufacturers will be able to achieve some of the guidelines. The intention of this work is to make an attempt to go beyond 112G in copper systems, to start analysis in response to some common questions from component suppliers, and to share our preliminary findings with the industry to stimulate discussion and iteration on the topic.
Power delivery network (PDN) noise and power supply-induced jitter (PSIJ) are increasingly becoming critical limiters to the performance and cost of field-programmable gate arrays (FPGAs). For serializer/deserializer (SERDES) circuits operating at tens of Gbps, PSIJ is one of the major jitter components, and if improperly controlled, PSIJ can cause link failure, especially during power management events. This is extremely challenging and crucial for high-end FPGAs, which can integrate as many as 128 high-speed links in a product, with each link programmable to support various data rates and protocols per user requirements. An accurate pre-silicon PSIJ simulation flow that correlates with silicon is necessary to ensure robust system operation with a cost-effective solution.
The paper presents a comprehensive and accurate PSIJ pre-silicon simulation methodology for high-speed SERDES, and the correlation with laboratory measurements. The simulation flow includes several steps: system-level PDN modeling, definition of customer usage scenarios and associated power events, current profile simulation for these events, PDN noise simulation, PSIJ sensitivity analysis of circuit blocks, and link-level margin analysis with self-generated and coupled supply noise.
First, an accurate on-die extraction technique is utilized to build the on-chip PDN models. For each functional block, the frequency-dependent distributed models for the device cap together with interconnect cap, and the MiMcap together with power/ground grid cap, are extracted separately. The models have lower metal-layer ports to probe transistor-level noise and connect with current excitation. Then, the current excitation is distributed to take into account hotspots in the design and maintain the real physical distance from/to the decaps.
To simulate PDN noise precisely, the current changes for complicated power state transitions are simulated with complex control signal settings in post-layout circuit schematics. In addition, various lanes' operations and the sequencing of power transition events inside a lane and among lanes are considered in the PDN noise simulations.
Circuit supply-noise sensitivity profiles are extracted from transistor-level circuit simulations for sensitive blocks for the entire clock and high-speed signal paths. The sensitivity profiles along with the PDN noise for different power domains and power management events are included in link-level models to simulate transmitter jitter impact and receiver jitter tolerance (JTOL) impact. The results, in turn, guide circuit optimization for better noise immunity and decap allocation for better PDN profiles.
In post-silicon measurement, a BERTscope is used to measure the FPGA transmitter jitter and to send a data stream to the FPGA receiver for far-end loopback. For three charge-pump settings, the transmitter jitter of a victim lane operating at tens of Gbps is measured with and without power state transitions of several aggressor lanes, so that the victim lane's PSIJ caused by PDN noise coupling from the aggressors is obtained. The phase noise profiles of a lane are also measured with and without the noise coupling from another aggressor lane. The receiver JTOL data are measured as well.
Good PSIJ correlation is ultimately achieved between simulations and measurements, validating the comprehensive PSIJ methodology for tens-of-Gbps transceivers. The good correlation in phase noise caused by noise coupling further validates the on-chip extraction approach.
Recently there has been interest in adoption of millimeter-wave technology for high-speed data transport to complement existing means, such as optical and metal-based interconnects, offering substantial advantages in bandwidth, reach, power consumption, and cost. Optical transport is ideally suited for longer reaches, where power consumption due to electro-optic conversion is justified. Similarly, metal-based interconnects, such as coaxial cable or PCB traces, are ideally suited for shorter reaches. Millimeter-wave data transport based on low-cost plastic fibers fills an important gap between optical and metal-based transport, providing bandwidth and reach superior to metal with less power consumption than optical.
We present a millimeter-wave helix surface-mount antenna for high-speed data transport over low-cost plastic fiber. Guided millimeter-wave technology enables gigabit transport in the centimeter to meters range, to complement existing transport technologies based on optical fibers, flyover assemblies, and PCB traces. Specifically, we present the electromagnetic design process of a millimeter-wave helix surface-mount antenna optimized for launch efficiency into a plastic fiber used in a 10 Gig-E interconnect. We plan on a 10 Gig-E live demonstration of our high-speed millimeter-wave transport system.
With the advancements in electro-optical communication systems to meet the ever-increasing demands for higher data speeds, coupled with the market moving toward lowering the effective cost per bit per mile, component designers are struggling to keep costs low and still maintain better designs. The characterization of opto-electronic (O/E) and electro-optical (E/O) components for photonic high-speed data transmission is critical, as these components not only form the building blocks of communication systems but also define the future of high-speed data in various domains of communication. Measuring the responsivity of components (such as PIN diodes, APDs, electro-absorption modulators, and modulated lasers) is crucial in today’s world. On-wafer, electro-optical ROSA/TOSA device characterization will also be very crucial for the next-generation networks to work seamlessly. Optical component analyzers play a significant role in the testing and debugging of the key components utilized in these systems.
Test instrumentation is an integral part of the move to 112 Gbps/lane designs. The proposed reference receivers being investigated for these high-speed SERDES make a tradeoff between the performance gains of high-tap-count DFEs and less complex FFE-based equalization methods. While the standards committees have not yet converged on a definitive equalization strategy, the T&M contributions support combinations of FFEs, CTLEs, and DFEs. In most cases, these filters operate orthogonally enough that they can be optimized sequentially toward specific optimization targets such as eye height, DDJ minimization, or pulse response optimization. This presentation will review efforts in test tools that closely track standards work and describe optimization methods that ensure test instrumentation tracks the reference receivers with high accuracy and repeatability across different classes of tools.
Optical and electrical links are not expected to have raw, error-free performance. Using PAM4 modulation to reach 400 Gb/s speeds means that engineers must now design, develop, and validate transceivers and network communication devices that have multiple 28 or 56 Gb/s channels. With 4-level signaling and reduced signal-to-noise ratio (SNR), PAM4 links require forward error correction (FEC).
Performance simulation of high-speed serial links using behavioral transceiver models has become indispensable for channel design. The on-die termination model has been one of the essential building blocks, and it has been evolving with increasing data rates. The latest discussion is 100~116 Gbps PAM4. While IBIS-AMI and COM are the most popular simulation methods specified by industry standards, their approaches to on-die termination modeling are slightly different because of their different priorities.
Section 1: Link simulation, Transceiver Models and Role of On-Die Termination Model
1.1) The eye diagram and BER are the outcome and the figure of merit of link simulation using IBIS-AMI. The transceiver IBIS-AMI model is device-specific, and the simulation user provides their channel S-parameters.
1.2) COM's figure of merit is a signal-to-noise ratio. The transceiver models are part of the IEEE 802.3/CEI standards because COM is used for channel compliance. The channel designer must provide their channel S-parameters.
Section 2: On-Die Termination Technology
2.1) At low data rates, no termination was used. When data rates increased and the rise/fall time (tr/tf) became shorter than the propagation delay between TX and RX, termination was required, implemented with resistors on the PCB close to the TX/RX devices. When data rates increased further and tr/tf became shorter than the delay between the TX/RX devices and the on-PCB termination resistors, these resistors were moved onto the dies.
2.2) When data rates increased further, reflection noise and bandwidth reduction caused by the parasitic capacitance around the on-die termination became problematic. To alleviate this, an on-die compensation inductor was introduced.
2.3) When data rates increased yet further, compensation by a simple inductor became insufficient, and the bridged T-coil was introduced. A bridged T-coil surrounds the parasitic capacitance with two inductors that are capacitively coupled at each end. The effect of the parasitic capacitance can be significantly reduced, though it is hard to nullify completely.
Section 3: Evolution of On-Die Termination Models
3.1) The IBIS-AMI model is provided by the device supplier, and its accuracy is a high priority. The 1st-generation model is a parallel resistance and capacitance (Rd//Cd). Cd is an "effective" value, which may be reduced by a compensation inductor. While Cd increases the RX input tr/tf, it must not affect the TX output tr/tf, which is modeled separately. In the 2nd generation, both return loss and insertion loss are modeled by S-parameters, whose frequency response may be complicated by the matching circuit. The TX die model includes an edge-shaping filter and return loss, and the RX die model includes input loading and return loss.
3.2) Since COM's TX/RX model represents a reference design rather than a specific implementation, "simple but not simpler" is the high priority. The 1st-generation model is the same as IBIS-AMI's 1st-generation model. For higher-data-rate standards, the Cd value was reduced, reflecting more advanced technology. For the 2nd-generation model currently under discussion, a Cb-Ls-(Cd//Rd) topology is likely to be adopted (Cb: effective bump capacitance; Ls: effective compensation inductance). For the future 3rd generation, two factors must be considered: one is the more complicated frequency response due to a more elaborate matching circuit; another is the re-partitioning between the TX edge-shaping filter / RX input buffer filter and the die termination model.
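To make the 1st-generation model concrete, here is a sketch of the return loss of the Rd//Cd termination described in 3.1, showing how the effective capacitance degrades S11 with frequency; the 200 fF Cd is an assumed, illustrative value, not a number from any standard:

```python
import math

def s11_db(f_hz: float, rd: float = 50.0, cd: float = 200e-15,
           z0: float = 50.0) -> float:
    """Return loss of the 1st-generation on-die termination model: a shunt
    Rd in parallel with Cd, referenced to Z0 (cd = 200 fF is illustrative)."""
    w = 2 * math.pi * f_hz
    z = rd / (1 + 1j * w * rd * cd)     # impedance of Rd parallel with Cd
    gamma = (z - z0) / (z + z0)         # reflection coefficient vs. Z0
    return 20 * math.log10(abs(gamma))

for f in (1e9, 10e9, 28e9):
    print(f"{f / 1e9:5.1f} GHz: S11 = {s11_db(f):6.1f} dB")
```

Even with Rd matched to Z0 at DC, the capacitance alone pushes the reflection toward a few dB at Nyquist frequencies of interest, which is exactly why the compensation inductor (2.2) and T-coil (2.3) exist.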
Section 4: Summary and Conclusion
The USB4 Specification, recently released by the USB-IF, introduces next-generation features and performance for USB hosts, devices, and hubs. Using the existing USB-C™ cables and the physical layer specification from Thunderbolt 3, USB4 designs will rapidly find their way into new products in 2020. USB4 achieves up to 40 Gbps (2 lanes at 20 Gbps) of electrical throughput and supports USB Power Delivery and DisplayPort, all over a small-form-factor flippable connector. While the USB4 interface is designed to be easy for the end user, its architecture presents new challenges for high-speed digital design engineers.
This session will be a high level overview of USB4 Electrical and Protocol Compliance Testing and Debug using Teledyne LeCroy Oscilloscopes, Protocol Analyzers, and partner solutions.
With its robust list of new features, DDR5 SDRAM pushes the limits of high-speed signaling and tackles the memory bandwidth challenge caused by the exponential growth in data generated by cloud computing, IoT, and real-time data analytics, addressing the need of data centers to continuously store, transfer, and process that data faster. DDR5 brings unique test challenges never before seen in the memory world. This presentation will provide an update on DDR5 Rx/Tx compliance testing and insight into the latest characterization and debug techniques that enable analysis of the highest DDR5 speed grades.
Nowadays, high-speed SerDes on 2.5D IC integration designs is extensively applied in areas such as networking and artificial intelligence for data center communications, and the signal integrity performance of the whole high-speed SerDes channel (up to 112 Gbps) has become highly valued; in particular, the modeling methodology is the fundamental and key point. In this paper, 2.5D channel SI effects, including impedance mismatch due to the increased C-load effect as well as cross-coupling from very closely spaced bump structures, are evaluated to determine whether there is any SI performance impact. A 3D-FEM channel model extractor therefore plays a critical role in this analysis. On this basis, extraction of the wafer-level signal path is proposed, also taking full-wave extraction of the whole channel into consideration. The results indicate some impact from these effects, not only on time-domain performance but also in the frequency domain of each SerDes block. Last but not least, the computation resources and cost of the model extraction are also shown.
It has taken the industry multiple years to shape the DDR5 memory definitions and specifications. Some semiconductor manufacturers have already started powering on and validating DDR5 memory even though the final specification is not yet complete. DDR5 shares many features with its predecessor, DDR4, but also has some that are very different, such as equalization on the DRAM receiver. Without simulating proper channel equalization, it is almost impossible to open the eye at higher speeds such as 6400 Mbps and predict what the design margin will be. In this paper, you will learn the details of DDR5 memory design, which models and simulation technologies are needed for an accurate simulation, and how design/simulation challenges are addressed with Memory Designer in PathWave ADS.
DDR4 requires tight specifications for high-speed operation, and channel modeling requires high accuracy since designs can operate close to the specification limits. The JEDEC standard defines the maximum speed for DDR4 as 3200 mega-transfers per second (MT/s), although the first DDR4 DIMMs at those speeds have only just become available. As PCBs continue to become more complex, with higher densities, the number of layers in a PCB stackup is increasing to ensure all signals in the design are routed effectively. Thick PCB vias with long stubs create unwanted resonances in the channel; if these resonances occur near the Nyquist frequency of the bit rate, they can devastate the eye opening at the receiver. The question is how big the parasitic capacitance from these vias is and whether it will have an adverse effect on the switching edge. The scope of this effort is understanding the impact of via stubs on the impedance of the signal lines in DDR4 memory. The goal of this research is to present ideas on how far we can push a complex DDR4 design with long via stubs in the channel. The intent is to show with SI simulation at what via length DDR4 link failures occur and to correlate those link failures with the resonant frequency created by the via stub.
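As a quick sanity check on the stub-resonance mechanism described above, the quarter-wave resonant frequency of an open via stub can be estimated as f_res = c / (4·L·√Dk), to be compared against the Nyquist frequency of the bit rate. A minimal sketch (the 60 mil stub length and Dk of 4.0 are illustrative assumptions, not values from the paper):

```python
import math

def stub_resonance_hz(stub_len_mil: float, dk: float) -> float:
    """Quarter-wave resonant frequency of an open via stub:
    f_res = c / (4 * L * sqrt(Dk))."""
    c = 299_792_458.0                  # speed of light, m/s
    length_m = stub_len_mil * 25.4e-6  # mil -> m
    return c / (4.0 * length_m * math.sqrt(dk))

# DDR4-3200: Nyquist frequency = 3200 MT/s / 2 = 1.6 GHz
nyquist_hz = 3200e6 / 2
# Assumed example: a 60 mil stub in a Dk = 4.0 dielectric
print(stub_resonance_hz(60, 4.0) / 1e9)  # ~24.6 GHz, well above Nyquist
```

Sweeping the stub length in a model like this shows how the resonance walks down in frequency as stubs get longer, which is the correlation the paper sets out to establish in full SI simulation.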
As we continue to push for higher performance by running at higher clock frequencies, engineers are required to complete block designs with less timing margin. Even with the help of more advanced process technology, it is becoming more challenging to close timing with aggressive speed targets. In synchronous digital block design, one component of the timing equation is jitter, specifically period jitter. As the clock period gets smaller, the jitter component becomes a more important part of the equation. In the past, one might have assigned a number large enough to cover the jitter component in the timing budget in order to guarantee timing. However, as we aim at more aggressive speed targets, it is important to understand jitter behavior across different physical parameters and design specs, which enables us to optimize jitter as well as to characterize and model it more accurately in order to hit the speed target.
In this paper we describe and define what period jitter is, its relationship to TIE (time-interval error), and why it is an important component of the timing equation. We also explain that the source of this jitter on the silicon die is power supply noise (power-supply-induced jitter, PSIJ) rather than inductive or capacitive coupling, assuming the clock routes are well shielded.
We next discuss the different factors that affect jitter and its behavior, which is the main part of this paper. The first factor is the frequency content of the noise and its relationship to the victim clock frequency; we explain why period jitter is sensitive to the noise frequency and describe its interesting behavior at the harmonics and at half the victim clock frequency. The second factor is the length of the delay path; we explain how period jitter varies with delay-path length and when saturation happens. The third factor is the victim clock frequency itself; we explain how the jitter magnitude changes with clock frequency when the power supply noise is white noise versus when it contains definite frequency tones. The fourth factor is the proximity of the noise source to the clock route; we explain that high-frequency noise manifests itself through the on-die power grid, whereas mid-to-low-frequency noise manifests itself through the package and board. The fifth factor is supply voltage; we explain why we should see more jitter as the supply goes down.
We also present various lab measurements on a Xilinx FPGA to support our explanation of the factors above, and describe the lab setup, clock routes, and noise sources used in our measurements.
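The TIE-to-period-jitter relationship the paper builds on can be sketched numerically: period jitter is the first difference of the TIE sequence, since period k deviates from the ideal period by TIE[k+1] − TIE[k]. A minimal illustration on synthetic data (the 1 GHz clock, the 5 MHz PSIJ tone, and the noise level are assumed for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
f_clk = 1e9                  # assumed 1 GHz victim clock
T = 1.0 / f_clk
n = 10_000
k = np.arange(n)
# Synthetic TIE: a 5 MHz PSIJ tone plus white noise (illustrative only)
tie = 2e-12 * np.sin(2 * np.pi * 5e6 * k * T) + rng.normal(0.0, 0.5e-12, n)
# Period jitter of edge k is how much period k deviates from T:
#   Jper[k] = (t[k+1] - t[k]) - T = TIE[k+1] - TIE[k]
period_jitter = np.diff(tie)
print(period_jitter.std())   # RMS period jitter
```

The differencing acts as a high-pass filter on TIE, which is one way to see why period jitter is sensitive to the frequency content of the noise source, as the factors above describe.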
Lower operating voltages and higher currents are making power supply margining an increasingly critical part of product development. A complete assessment of power supply performance must include both the voltage and the current. The most important location for measuring power supply performance is at the load, where the quality of the power supply voltage is critical. The voltage measurement at the load is usually not very difficult. If there are perturbations in the power supply voltage in response to changes in the load, it becomes important to also measure the current. Unfortunately, current measurement at the load is not possible because of the distributed nature of the current connections to a multipin load. An alternative, albeit indirect, method for ascertaining the current at the load is possible using a technique similar to the de-embedding used in high-speed signal measurement: if we can measure the voltage and current waveforms at the input to the printed circuit board (PCB) power delivery network (PDN), measure the output voltage at the load, and have an accurate model of the PCB, then we can infer the current waveform at the load.
To measure the input voltage and current, we would like to take advantage of the flexibility and programmability of a benchtop power supply. Unfortunately, connecting a benchtop power supply to the DUT can generate as many problems as it solves: even a few nanohenries of inductance between the power supply output and the load can be catastrophic. In fact, the connection between the benchtop power supply and the DUT must be designed with the same power integrity as every other part of the power distribution network (PDN).
As with any attempt to make measurements, we must minimize the impact of the measurement probe on the DUT. For this reason, we will also consider the probe connection as a component of the PDN. One of the most difficult measurements to make is that of dynamic current. Whether we are attempting to measure the dynamic current at the input of the power distribution network of the PCB or at the load, we cannot allow the current measurement probe to affect the stability of the PDN.
The paper applies the same PDN design techniques used in previously presented work (1) to the benchtop power supply's connecting cable and to the current-measurement probe connections, to achieve a test system that is flexible and has minimal impact on the DUT.
The focus of the work presented is modeling and measuring the impedance of the PDN components, including the power supply and oscilloscope connections, and designing a compensation that results in a flat impedance for the test setup. The resulting measurements of input voltage and current to the PCB PDN, together with the voltage at the load, are then applied to the PCB models to generate the load current waveform.
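The inference step described above reduces to a little two-port algebra. A deliberately oversimplified sketch (the series-RL / shunt-C model and its element values are made up for illustration; the paper uses measured, much richer PCB models):

```python
import numpy as np

# Simplified two-port model of the board-level PDN (assumed values):
# series trace resistance + inductance, then shunt decoupling
# capacitance at the load side.
R, L, C = 2e-3, 1e-9, 100e-6

def load_current(f, v_in, v_load):
    """Infer the load-current phasor at frequency f from the voltages
    measured at the board input and at the load, given the PDN model.
    ABCD cascade (series Z, then shunt Y):
        V_in = (1 + Z*Y) * V_load + Z * I_load
    so  I_load = (V_in - (1 + Z*Y) * V_load) / Z."""
    w = 2 * np.pi * f
    Z = R + 1j * w * L
    Y = 1j * w * C
    return (v_in - (1 + Z * Y) * v_load) / Z

# Round-trip check: forward-compute V_in from a known 1 A load-current
# component at 100 kHz, then recover that current from the two voltages.
f = 1e5
Z = R + 1j * 2 * np.pi * f * L
Y = 1j * 2 * np.pi * f * C
v_in = (1 + Z * Y) * 1.0 + Z * 1.0      # V_load = 1 V, I_load = 1 A
print(abs(load_current(f, v_in, 1.0)))  # ~1.0 A
```

The measured input current provides an independent consistency check (I_in should equal Y·V_load + I_load in this model), which is one reason the setup measures both input voltage and current.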
With increasing IO counts to support higher bandwidth, organic packages are trending toward larger body sizes, approaching the realm of LGA pin fields, which adds complexity in socket performance and mating-force requirements. A major contributor to pin count is the significant number of ground pins, typically placed around high-speed differential pairs for crosstalk isolation inside the package and in the PCB via field that mirrors the component pin assignment. It is now becoming common practice to use power pins for a dual purpose: to provide crosstalk isolation between high-speed signals and to deliver power to SerDes IO, helping to reduce overall pin count, limit package body size, and stay within a BGA form factor. This, however, has other consequences, creating resonances inside the PCB via field on signal IO and elevating the crosstalk contributed to a signal pair by other adjacent signal pairs. Using 3D extraction and analysis, a physical explanation for the resonant behavior is given, and prediction models for the frequency and amplitude of the resonance and crosstalk noise are proposed. Various methods to effectively mitigate the impact of resonance on crosstalk are also proposed.
The R&S®RTP high-performance oscilloscope combines high-class signal integrity with a fast acquisition rate. Customized frontend ASICs and realtime processing hardware enable highly accurate measurements with unprecedented speed in a compact form factor.
Quickly find signal faults with 750,000 waveforms/s
High-precision digital trigger without bandwidth limitations
Realtime deembedding for triggering and fast acquisition
Compact design and silent operation for best fit to any lab
Precise measurements due to flat frequency response of +/- 0.25 dB
Come to the Power Integrity station in Tektronix booth 519 and see us demonstrate:
• Accurately characterizing power rails: AC ripple, DC drift, and high-frequency noise
• Measuring mOhm Power Distribution Network (PDN) impedance using the Tektronix 5 & 6 Series Mixed Signal Oscilloscopes (10 Hz to 500 MHz)
• Finding power supply and high-speed clock/transceiver sensitivities using eye diagrams & spectral plots
• 8 GHz TDR with the ultra-low-noise Tektronix 6 Series Oscilloscope & Picotest PerfectPulse
High-speed, high-frequency RF transceivers and SoCs for 5G, mobile, AI, automotive, and networking applications are becoming increasingly susceptible to electromagnetic (EM) cross-coupling effects. In this session we will cover how ANSYS on-chip EM solutions can accurately capture all electromagnetic phenomena to mitigate the risk of EM-crosstalk-induced performance degradation and failure in high-speed, high-frequency, and low-power RFICs and SoCs.
Miniaturization and higher speeds are driving changes in our electronic devices: power requirements become more stringent, while component ball pitch shrinks. These trends create new challenges for the power delivery network (PDN). For the PDN in the PCB, thin core materials, below 0.5 mil, have the significant advantages of high capacitance and low inductance. Past studies have shown that thinner is better for power delivery, and there is evidence of significant improvement with a laminate using a 0.3 mil (8 micron) dielectric. This ultra-thin laminate type is often referred to as an embedded capacitance material. Since most traditional PCB laminates using glass cloth and resin cannot effectively be made this thin, the embedded capacitance laminates considered for this study are a polymer film bonded to copper with a high-Dk filled epoxy composite resin. For PCB layers carrying the PDN, this ultra-thin material replaces the traditional homogeneous material in the PCB stack-up, which is then manufactured as a hybrid stack-up. Since PDN requirements have to be met with less space under the via holes for decoupling capacitors, the ultra-thin embedded capacitance laminate should be considered. Scanning the market for ultra-thin laminate materials makes it clear that there is a growing list of products with dielectrics near 0.3 mil thickness that can improve the PDN. A traditional PDN design can then be improved simply by replacing regular cores with ultra-thin ones, without completely redesigning the board, which is a fast and relatively cost-effective solution. The topic was suggested in the literature previously (1). In this work we identified a PDN imperfection in an already-designed system; the imperfection was evaluated and quantified in simulations and measurements in terms of S-parameters.
The stackup modification using embedded capacitance layers was selected as the simplest and fastest solution. Traditional laminates were replaced by ultra-thin ones with a thickness under 0.4 mil and a high Dk. During our research, we found that manufacturing with ultra-thin materials of 0.5 mil (12 micron) or less is not yet common and requires special attention. This work covers the solution space for these novel design constraints, the ultra-thin laminate material types on the market, and their implementation challenges on the path to successfully manufacturing a board with an improved PDN.
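The "thinner and higher Dk" argument above is just parallel-plate physics: capacitance per unit area is C/A = ε₀·Dk/d. A quick comparison (the Dk values of 4 and 16 and the 3 mil baseline core are illustrative assumptions, not properties of a specific product):

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def sheet_capacitance_nf_per_cm2(dk: float, thickness_mil: float) -> float:
    """Parallel-plate capacitance per unit area: C/A = eps0 * Dk / d."""
    d = thickness_mil * 25.4e-6             # mil -> m
    return EPS0 * dk / d * 1e9 * 1e-4       # F/m^2 -> nF/cm^2

# Traditional 3 mil, Dk = 4 core vs an assumed 0.3 mil, Dk = 16
# embedded capacitance laminate (illustrative values)
print(sheet_capacitance_nf_per_cm2(4.0, 3.0))   # ~0.05 nF/cm^2
print(sheet_capacitance_nf_per_cm2(16.0, 0.3))  # ~1.9 nF/cm^2
```

Under these assumed numbers the plane-pair capacitance rises about 40x, and the thinner dielectric also shrinks the plane-pair loop inductance, which is why the swap can improve the PDN without rerouting the board.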
In general, test-socket vendors design their products purely mechanically, according to the POD of the package. TSE's rubber test socket (ELTUNE-TM) is instead designed, based on signal integrity simulation results for the package's application, to be as close as possible to 50 Ohm impedance. In other words, it is a high-performance product specialized for high speed.
Differential vias are commonly seen in multilayer PCBs. With data rates increasing to 56G and beyond, they become a critical impedance discontinuity in high-speed channels that can significantly degrade the signal. Accurate modeling of differential vias is therefore critical for high-speed designs. Characterizing differential vias is challenging in both simulation and measurement, and achieving good correlation up to 50 GHz or higher is even more so.
The paper presents a complete flow covering simulation, measurement with de-embedding, and their correlation. Numerical simulation can not only simulate the transmission and reflection of electromagnetic waves in a complicated environment but also provide physical insight; it has become an indispensable tool for analyzing differential vias in high-speed PCB designs. Different full-wave electromagnetic solver methods, including the finite element method (FEM) and the method of moments (MoM), are reviewed, along with their pros and cons for differential via simulation. A novel magnetic-current-based MoM is proposed. Instead of using electric current on a large ground plane, this method uses magnetic current on the much smaller antipad area, which significantly reduces the number of mesh elements in the EM simulation. A further benefit of the magnetic-current method is that it enables layer-by-layer domain decomposition: the entire problem is broken into multiple layered problems, which can be computed in parallel and then cascaded to form the solution, leading to tremendous computation time savings. Benchmark results show a significant speedup while maintaining the same accuracy as 3D FEM solvers.
A board with the differential via as the device-under-test (DUT), together with proper test fixtures, was manufactured. Test fixture design and the corresponding de-embedding technique are another topic investigated. Measurement techniques with various de-embedding methods are reviewed, including smart fixture de-embedding (SFD) from Missouri S&T and automatic fixture removal (AFR) from Keysight. These methodologies have different theories behind them, each with its own advantages and drawbacks. With that, a novel 2x-through-based de-embedding technique is proposed, which demonstrates good correlation with reference tools. Benchmarks on the cases in IEEE P370 also demonstrate the accuracy of the method. Applying this de-embedding method to the differential via cases, we are able to extract the S-parameters of the differential vias.
Lastly, to establish good simulation-measurement correlation from DC to high frequencies, we have to ensure that the PCB material properties used in the simulation are correct. Dk/Df and surface-roughness extraction are developed based on measurements of transmission line structures of multiple lengths. Using these material properties in simulation, we can establish excellent simulation-measurement correlation.
A software package including (a) via model creation and simulation with the new magnetic-current-based MoM solver, (b) 2x-through de-embedding with measured data, and (c) simulation and measurement correlation will be part of the deliverables of the paper. It provides a complete flow for via simulation and measurement validation with an intuitive GUI.
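The core idea of a 2x-through method can be shown in a few lines: to first order, one fixture half's through response is the square root of the 2x-thru measurement, with the phase unwrapped before halving so the delay splits evenly. A sketch under idealized assumptions (reflectionless, symmetric fixtures; actual SFD/AFR/IEEE P370 flows do considerably more):

```python
import numpy as np

def split_2x_thru_s21(s21_2x: np.ndarray) -> np.ndarray:
    """Estimate one fixture half's S21 as the 'square root' of the
    measured 2x-thru S21: sqrt of the magnitude, half the unwrapped
    phase.  This captures only the simplest piece of a 2x-thru method;
    full flows also handle fixture reflections and asymmetry."""
    mag = np.sqrt(np.abs(s21_2x))
    phase = np.unwrap(np.angle(s21_2x)) / 2.0
    return mag * np.exp(1j * phase)

# Synthetic check: a lossy 100 ps delay-line fixture, cascaded with
# itself to form the 2x-thru, should be recovered exactly.
f = np.linspace(1e8, 2e10, 200)
td, a = 100e-12, 1e-6
fixture = np.exp(-a * np.sqrt(f)) * np.exp(-2j * np.pi * f * td)
recovered = split_2x_thru_s21(fixture**2)
print(np.max(np.abs(recovered - fixture)))  # ~0
```

Note that the frequency grid must be dense enough that the 2x-thru phase steps stay below pi per point, or the unwrapping (and hence the split) fails; this is one practical constraint any 2x-thru measurement shares.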
The Electrical Validation (EV) of server silicon products requires automated testing of hundreds of high-speed I/O lanes. To achieve that goal, all the I/O lanes are routed out from the package pin field to cable connectors for compliance testing with oscilloscopes and Bit Error Rate Testers. Because of large 4000+ square millimeter server package sizes, the minimum board routing length from package pin to the connector could exceed five inches in a cost-effective EV board design. To automate the measurement process, RF switches along with necessary cables are often used. DC blocking capacitors are also required. Even the minimally required EV setup can make the signal path from Tx silicon diepad to the oscilloscope input experience 15 dB or more loss at Nyquist frequency of 16 GHz for 32 Gbps signaling with ultra-low-loss PCB materials and high quality, low-loss cables and connectors. The S-parameter based de-embedding of the channel from the package pin to the scope input is done by applying a gain filter function to compensate the Inter-Symbol-Interference (ISI) impact of the frequency-dependent channel loss when a non-clock compliance pattern that partially mimics the normal Tx traffic is used. But the gain filter amplifies the broadband noise of the scope leading to significant over-estimation of the Tx random jitter. Depending on the scope noise at various settings and choice of the bandwidth of the gain filter, the overestimation of jitter can lead to >200% error. In this paper, we first provide an intuitive picture of how the desired channel ISI correction and the undesired amplification of broadband scope noise work against each other in jitter measurement methodology based on S-parameter de-embedding. The amplification of the scope noise and its impact on jitter overestimation depend on the choice of the filter bandwidth. 
We then describe a detailed methodology for measuring Tx jitter accurately by making an optimal choice of Tx equalization and CTLE-based equalization, along with appropriate scope settings and a measurement setup that minimize noise and ISI impact. We performed a series of experiments that enabled us to measure the Tx jitter of a number of devices with various amounts of channel loss, using clock, PRBS, and PCIe compliance patterns. We analyzed jitter by applying the conventional de-embedding methodology as well as our proposed methodology. For 16 and 32 Gbps, we used the CTLE curves defined in the PCIe specification. Our results can be summarized as follows: (1) random jitter is most sensitive to channel loss, signal pattern, and the scope noise floor; (2) the channel loss between the package pin and the scope impacts the measurement accuracy most; (3) random jitter with a clock pattern and minimum channel loss provided a baseline measure of the minimum Tx random jitter; (4) sweeping PCIe Tx equalization presets and CTLE curves to obtain the minimum uncorrelated pulse-width jitter yielded an optimal combination of Tx and CTLE equalization; and (5) the proposed methodology reduced the impact of 12 dB of channel loss on 32 Gbps random jitter to within a few tens of femtoseconds.
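The noise-amplification trade-off described in this abstract can be illustrated with a toy calculation: a de-embedding gain filter that inverts the channel loss multiplies the scope's white noise floor by the RMS of its gain over the filter bandwidth. A sketch (the sqrt(f) loss shape and the 25 GHz filter bandwidth are illustrative assumptions; the 15 dB at a 16 GHz Nyquist figure is from the text):

```python
import numpy as np

def rms_noise_gain(loss_db_at_nyq, f_nyq, f_bw, n=2001):
    """RMS amplification of the scope's white noise floor by a
    de-embedding gain filter that inverts the channel loss.
    The loss is modeled (illustrative assumption) as skin-effect-like
    sqrt(f) loss reaching loss_db_at_nyq at the Nyquist frequency;
    the filter applies the inverse gain out to bandwidth f_bw."""
    f = np.linspace(0.0, f_bw, n)
    loss_db = loss_db_at_nyq * np.sqrt(f / f_nyq)
    h_inv = 10.0 ** (loss_db / 20.0)      # inverse-channel voltage gain
    return float(np.sqrt(np.mean(h_inv ** 2)))

# 15 dB at a 16 GHz Nyquist (from the text); an assumed 25 GHz
# de-embedding filter bandwidth amplifies RMS scope noise roughly 5x
print(rms_noise_gain(15.0, 16e9, 25e9))
```

Even this crude model shows why the filter bandwidth choice matters so much: the highest-loss frequencies dominate the RMS gain, so capping the filter bandwidth trades residual ISI correction against random-jitter overestimation, which is the balance the proposed methodology optimizes.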
High-speed electrical design faces technical challenges at data rates of 112 Gbps and beyond. Even with the most advanced printed circuit board (PCB) or cable technology, the insertion loss becomes too high to overcome. IEEE 802.3ck defines the insertion loss target as 28 dB at the Nyquist frequency with 3 dB of channel operating margin (COM); with the package considered, the bump-to-bump insertion loss will be as high as 36~38 dB. An ADC-DSP based receiver is a promising candidate to achieve this target. The typical ADC-DSP based receiver is "m-tap FFE + 1-tap DFE"; compared to the mixed-signal solution (n-tap DFE), it can provide ~1 dB of COM improvement, mainly from pre-cursor cancellation. Even with the ADC-DSP based receiver, the overall performance on the most difficult channels is still marginal. To address this margin issue, advanced DSP and FEC algorithms can provide a technical path to performance enhancement. The performance of advanced DSP technologies will be investigated, including the partial response (PR, also known as duobinary PAM4) receiver and end-of-burst detection (EoBD, also known as precoding 2.0). With all the technical combinations, 5 DSP schemes will be investigated and compared in terms of raw (pre-FEC) BER performance:
1. m-tap FFE + (1+αD) DFE receiver (benchmark)
2. m-tap FFE + (1+D) partial response (PR) receiver
3. m-tap FFE + (1+D) DFE receiver
4. m-tap FFE + (1+D) PR + MLSE receiver
5. m-tap FFE + (1+D) DFE + MLSE receiver
The EoBD technique can be applied to the (1+D) DFE receiver to improve the post-FEC BER performance. Two baud rates, 53.125 GBd and 56.25 GBd, are simulated, corresponding to the standard RS(544, 514) FEC and a stronger 12%-overhead FEC, e.g. RS(576, 514). The simulation results will show that the (1+D)-filtering-based solutions, i.e. PR and (1+D) DFE, perform similarly and show no degradation compared with the (1+αD) DFE receiver.
The MLSE receivers can improve the raw (pre-FEC) BER by one order of magnitude compared with the DFE and PR receivers. A joint analysis of the DSP and FEC schemes will be discussed. The SNR-versus-BER curves of different channels with different DSP schemes are obtained by Monte-Carlo simulation, so both the penalty of error propagation and the gain due to MLSE are considered. The raw-BER requirements of the different FEC and DSP algorithms are obtained from these curves. Using these raw-BER requirements together with the corresponding raw-BER-versus-ICN curves, the insertion loss and ICN tolerance curves for the different FEC and DSP combinations can be obtained and compared. Simulation results show that, compared with the (1+αD) DFE receiver, the (1+D) receiver may provide ~1.0 dB of insertion loss extension under 2 mVrms ICN; EoBD may provide ~3.0 dB; and MLSE may provide ~4.5 dB. RS(576, 514) FEC may provide ~1.4 dB of insertion loss extension under 2 mVrms ICN with the (1+αD) DFE receiver; however, if MLSE is applied, the improvement from the higher-overhead FEC is minor. (1) http://www.ieee802.org/3/ck/public/18_11/lu_3ck_01_1118.pdf (2) http://www.ieee802.org/3/ck/public/19_03/lu_3ck_01_0319.pdf
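The role of precoding in the (1+D) partial-response schemes above can be shown with classic modulo precoding (the simple textbook scheme; EoBD/"precoding 2.0" is a refinement beyond this sketch). Precoding lets the 7-level output of a (1+D) channel be sliced symbol-by-symbol, with no DFE and hence no error propagation:

```python
import numpy as np

def precode(x, m=4):
    """Modulo-m precoder for a (1+D) partial-response channel:
    p[n] = (x[n] - p[n-1]) mod m.  After the channel forms
    y[n] = p[n] + p[n-1], the receiver recovers x[n] = y[n] mod m
    symbol-by-symbol, with no DFE and no error propagation."""
    p = np.zeros_like(x)
    prev = 0
    for i, s in enumerate(x):
        prev = (s - prev) % m
        p[i] = prev
    return p

rng = np.random.default_rng(1)
x = rng.integers(0, 4, 20)                 # PAM4 symbols 0..3
p = precode(x)
y = p + np.concatenate(([0], p[:-1]))      # ideal (1+D) channel: 7 levels
x_hat = y % 4                              # memoryless slicing
print(np.array_equal(x_hat, x))            # True
```

Because a single channel error no longer propagates through a feedback loop, the error statistics seen by the FEC change, which is why the abstract analyzes the DSP and FEC schemes jointly.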
Modern serial links use advanced techniques such as dynamic link equalization to address the challenges of transferring high-speed data over lossy channels and interconnects. These techniques involve protocol-layer negotiations between devices in order to optimize their physical-layer behavior for the signal environment.
This seminar explains how PCI Express devices implement these operations and how they are tested, and illustrates unique combinations of Teledyne LeCroy protocol- and physical-layer toolsets for powerful debug capabilities.
Recently, industry and standards bodies have kicked off new projects aiming at 800 GbE or even beyond 1 TbE. Join this panel of experts for a lively discussion of what is needed for the next speed node past 112 Gb/s, say 224 Gb/s. System vendors and chip developers will discuss and share insights on their system needs and gaps, design challenges, and potential solutions for next-generation high-speed networking technologies. The panel will give the audience an opportunity to hear about and discuss new technologies such as chiplets/system-on-chip (SoC), optical interconnects in the data center, advanced modulation, signal processing, and coding.