Zobrazení pro čtení

Jsou dostupné nové články, klikněte pro obnovení stránky.

Metrology And Inspection For The Chiplet Era

Od: Gregory Haley

6. Srpen 2024 v 09:03

New developments and innovations in metrology and inspection will enable chipmakers to identify and address defects faster and with greater accuracy than ever before, all of which will be required at future process nodes and in densely-packed assemblies of chiplets.

These advances will affect both front-end and back-end processes, providing increased precision and efficiency, combined with artificial intelligence/machine learning and big data analytics. These kinds of improvements will be crucial for meeting the industry’s changing needs, enabling deeper insights and more accurate measurements at rates suitable for high-volume manufacturing. But gaps still need to be filled, and new ones are likely to show up as new nodes and processes are rolled out.

“As semiconductor devices become more complex, the demand for high-resolution, high-accuracy metrology tools increases,” says Brad Perkins, product line manager at Nordson Test & Inspection. “We need new tools and techniques that can keep up with shrinking geometries and more intricate designs.”

The shift to high-NA EUV lithography (0.55 NA EUV) at the 2nm node and beyond is expected to exacerbate stochastic variability, demanding more robust metrology solutions on the front end. Traditional critical dimension (CD) measurements alone are insufficient for the level of analysis required. Comprehensive metrics, including line-edge roughness (LER), line-width roughness (LWR), local edge-placement error (LEPE), and local CD uniformity (LCDU), alongside CD measurements, are necessary for ensuring the integrity and performance of advanced semiconductor devices. These metrics require sophisticated tools that can capture and analyze tiny variations at the nanometer scale, where even slight discrepancies can significantly impact device functionality and yield.

“Metrology is now at the forefront of yield, especially considering the current demands for DRAM and HBM,” says Hamed Sadeghian, president and CEO of Nearfield Instruments. “The next generations of HBMs are approaching a stage where hybrid bonding will be essential due to the increasing stack thickness. Hybrid bonding requires high resolutions in vertical directions to ensure all pads, and the surface height versus the dielectric, remain within nanometer-scale process windows. Consequently, the tools used must be one order of magnitude more precise.”

To address these challenges, companies are developing hybrid metrology systems that combine various measurement techniques for a comprehensive data set. Integrating scatterometry, electron microscopy, and/or atomic force microscopy allows for more thorough analysis of critical features. Moreover, AI and ML algorithms enhance the predictive capabilities of these tools, enabling process adjustments.

“Our customers who are pushing into more advanced technology nodes are desperate to understand what’s driving their yield,” says Ronald Chaffee, senior director of applications engineering at NI/Emerson Test & Measurement. “They may not know what all the issues are, but they are gathering all possible data — metrology, AEOI, and any measurable parameters — and seeking correlations.”

Traditional methods for defect detection, pattern recognition, and quality control typically used spatial pattern-recognition modules and wafer image-based algorithms to address wafer-level issues. “However, we need to advance beyond these techniques,” says Prasad Bachiraju, senior director of business development at Onto Innovation. “Our observations show that about 20% of wafers have systematic issues that can limit yield, with nearly 4% being new additions. There is a pressing need for advanced metrology for in-line monitoring to achieve zero-defect manufacturing.”

Several companies recently announced metrology innovations to provide more precise inspections, particularly for difficult-to-see areas, edge effects, and highly reflective surfaces.

Nordson unveiled its AMI SpinSAM acoustic rotary scan system. The system represents a significant departure from traditional raster scan methods, utilizing a rotational scanning approach. Rather than moving the wafer in an x,y pattern relative to a stationary lens, the wafer spins, similar to a record player. This reduces motion over the wafer and increases inspection speed, negating the need for image stitching and improving image quality.

“For years, we’d been trying to figure out this technique, and it’s gratifying to finally achieve it. It’s something we’ve always thought would be incredibly beneficial,” says Perkins. “The SpinSAM is designed primarily to enhance inspection speed and efficiency, addressing the common industry demand for more product throughput and better edge inspection capabilities.”

Meanwhile, Nearfield Instruments introduced a multi-head atomic force microscopy (AFM) system called QUADRA. It is a high-throughput, non-destructive metrology tool for HVM that features a novel multi-miniaturized AFM head architecture. Nearfield claims the parallel independent multi-head scanner can deliver a 100-fold throughput advantage versus conventional single-probe AFM tools. This architecture allows for precise measurements of high-aspect-ratio structures and complex 3D features, critical for advanced memory (3D NAND, DRAM, HBM) and logic processes.

Fig. 1: Image capture comparison of standard AFM and multi-head AFM. Source: Nearfield Instruments

In April, Onto Innovation debuted an advancement in subsurface defect inspection technology with the release of its Dragonfly G3 inspection system. The new system allows for 100% wafer inspection, targeting subsurface defects that can cause yield losses, such as micro-cracks and other hidden flaws that may lead to entire wafers breaking during subsequent processing steps. The Dragonfly G3 utilizes novel infrared (IR) technology combined with specially designed algorithms to detect these defects, which previously were undetectable in a production environment. This new capability supports HBM, advanced logic, and various specialty segments, and aims to improve final yield and cost savings by reducing scrapped wafers and die stacks.

More recently, researchers at the Paul Scherrer Institute announced a high-performance X-ray tomography technique using burst ptychography. This new method can provide non-destructive, detailed views of nanostructures as small as 4nm in materials like silicon and metals at a fast acquisition rate of 14,000 resolution elements per seconds. The tomographic back-propagation reconstruction allows imaging of samples up to ten times larger than the conventional depth of field.

There are other technologies and techniques for improving metrology in semiconductor manufacturing, as well, including wafer-level ultrasonic inspection, which involves flipping the wafer to inspect from the other side. New acoustic microscopy techniques, such as scanning acoustic microscopy (SAM) and time-of-flight acoustic microscopy (TOF-AM), enable the detection and characterization of very small defects, such as voids, delaminations, and cracks within thin films and interfaces.

“We used to look at 80 to 100 micron resist films, but with 3D integrated packaging, we’re now dealing with films that are 160 to 240 microns—very thick resist films,” says Christopher Claypool, senior application scientist at Bruker OCD. “In TSVs and microbumps, the dominant technique today is white light interferometry, which provides profile information. While it has some advantages, its throughput is slow, and it’s a focus-based technique. This limitation makes it difficult to measure TSV structures smaller than four or five microns in diameter.”

Acoustic metrology tools equipped with the newest generation of focal length transducers (FLTs) can focus acoustic waves with precision down to a few nanometers, allowing for non-destructive detailed inspection of edge defects and critical stress points. This capability is particularly useful for identifying small-scale defects that might be missed by other inspection methods.

The development and integration of smart sensors in metrology equipment is instrumental in collecting the vast amounts of data needed for precise measurement and quality control. These sensors are highly sensitive and capable of operating under various environmental conditions, ensuring consistent performance. One significant advantage of smart sensors is their ability to facilitate predictive maintenance. By continuously monitoring the health and performance of metrology equipment, these sensors can predict potential failures and schedule maintenance before significant downtime occurs. This capability enhances the reliability of the equipment, reduces maintenance costs, and improves overall operational efficiency.

Smart sensors also are being developed to integrate seamlessly with metrology systems, offering real-time data collection and analysis. These sensors can monitor various parameters throughout the manufacturing process, providing continuous feedback and enabling quick adjustments to prevent defects. Smart sensors, combined with big data platforms and advanced data analytics, allow for more efficient and accurate defect detection and classification.

Critical stress points

A persistent challenge in semiconductor metrology is the identification and inspection of defects at critical stress points, particularly at the silicon edges. For bonded wafers, it’s at the outer ring of the wafer. For chip-on-wafer packaging, it’s at the edge of the chips. These edge defects are particularly problematic because they occur at the highest stress points from the neutral axis, making them more prone to failures. As semiconductor devices continue to involve more intricate packaging techniques, such as chip-on-wafer and wafer-level packaging, the focus on edge inspection becomes even more critical.

“When defects happen in a factory, you need imaging that can detect and classify them,” says Onto’s Bachiraju. “Then you need to find the root causes of where they’re coming from, and for that you need the entire data integration and a big data platform to help with faster analysis.”

Another significant challenge in semiconductor metrology is ensuring the reliability of known good die (KGD), especially as advanced packaging techniques and chiplets become more prevalent. Ensuring that every chip/chiplet in a stacked die configuration is of high quality is essential for maintaining yield and performance, but the speed of metrology processes is a constant concern. This leads to a balancing act between thoroughness and efficiency. The industry continuously seeks to develop faster machines that can handle the increasing volume and complexity of inspections without compromising accuracy. In this race, innovations in data processing and analysis are key to achieving quicker results.

“Customers would like, generally, 100% inspection for a lot of those processes because of the known good die, but it’s cost-prohibitive because the machines just can’t run fast enough,” says Nordson’s Perkins.

Metrology and Industry 4.0

Industry 4.0 — a term introduced in Germany in 2011 for the fourth industrial revolution, and called smart manufacturing in the U.S. — emphasizes the integration of digital technologies such as the Internet of Things, artificial intelligence, and big data analytics into manufacturing processes. Unlike past revolutions driven by mechanization, electrification, and computerization, Industry 4.0 focuses on connectivity, data, and automation to enhance manufacturing capabilities and efficiency.

“The better the data integration is, the more efficient the yield ramp,” says Dieter Rathei, CEO of DR Yield. “It’s essential to integrate all available data into the system for effective monitoring and analysis.”

In semiconductor manufacturing, this shift toward Industry 4.0 is particularly transformative, driven by the increasing complexity of semiconductor devices and the demand for higher precision and yield. Traditional metrology methods, heavily reliant on manual processes and limited automation, are evolving into highly interconnected systems that enable real-time data sharing and decision-making across the entire production chain.

“There haven’t been many tools to consolidate different data types into a single platform,” says NI’s Chaffee. “Historically, yield management systems focused on testing, while FDC or process systems concentrated on the process itself, without correlating the two. As manufacturers push into the 5, 3, and 2nm spaces, they’re discovering that defect density alone isn’t the sole governing factor. Process control is also crucial. By integrating all data, even the most complex correlations that a human might miss can be identified by AI and ML. The goal is to use machine learning to detect patterns or connections that could help control and optimize the manufacturing process.”

IoT forms the backbone of Industry 4.0 by connecting various devices, sensors, and systems within the manufacturing environment. In semiconductor manufacturing, IoT enables seamless communication between metrology tools, production equipment, and factory management systems. This interconnected network facilitates real-time monitoring and control of manufacturing processes, allowing for immediate adjustments and optimization.

“You need to integrate information from various sources, including sensors, metrology tools, and test structures, to build predictive models that enhance process control and yield improvement,” says Michael Yu, vice president of advanced solutions at PDF Solutions. “This holistic approach allows you to identify patterns and correlations that were previously undetectable.”

AI and ML are pivotal in processing and analyzing the vast amounts of data generated in a smart factory. These technologies can identify patterns, predict equipment failures, and optimize process parameters with a level of precision and speed unattainable by human operators alone. In semiconductor manufacturing, AI-driven analytics enhance process control, improve yield rates, and reduce downtime. “One of the major trends we see is the integration of artificial intelligence and machine learning into metrology tools,” says Perkins. “This helps in making sense of the vast amounts of data generated and enables more accurate and efficient measurements.”

AI’s role extends further as it assists in discovering anomalies within the production process that might have gone unnoticed with traditional methods. AI algorithms integrated into metrology systems can dynamically adjust processes in real-time, ensuring that deviations are corrected before they affect the end yield. This incorporation of AI minimizes defect rates and enhances overall production quality.

“Our experience has shown that in the past 20 years, machine learning and AI algorithms have been critical for automatic data classification and die classification,” says Bachiraju. “This has significantly improved the efficiency and accuracy of our metrology tools.”

Big data analytics complements AI/ML by providing the infrastructure necessary to handle and interpret massive datasets. In semiconductor manufacturing, big data analytics enables the extraction of actionable insights from data generated by IoT devices and production systems. This capability is crucial for predictive maintenance, quality control, and continuous process improvement.

“With big data, we can identify patterns and correlations that were previously impossible to detect, leading to better process control and yield improvement,” says Perkins.

Big data analytics also helps in understanding the lifecycle of semiconductor devices from production to field deployment. By analyzing product performance data over time, manufacturers can predict potential failures and enhance product designs, increasing reliability and lifecycle management.

“In the next decade, we see a lot of opportunities for AI,” says DR Yield’s Rathei. “The foundation for these advancements is the availability of comprehensive data. AI models need extensive data for training. Once all the data is available, we can experiment with different models and ideas. The ingenuity of engineers, combined with new tools, will drive exponential progress in this field.”

Metrology gaps remain

Despite recent advancements in metrology, analytics, and AI/ML, several gaps still remain, particularly in the context of high-volume manufacturing (HVM) and next-generation devices. The U.S. Commerce Department’s CHIPS R&D Metrology Program, along with industry stakeholders, have highlighted seven “grand challenges,” areas where current metrology capabilities fall short:

Metrology for materials purity and properties: There is a critical need for new measurements and standards to ensure the purity and physical properties of materials used in semiconductor manufacturing. Current techniques lack the sensitivity and throughput required to detect particles and contaminants throughout the supply chain.

Advanced metrology for future manufacturing: Next-generation semiconductor devices, such as gate-all-around (GAA) FETs and complementary FETs (CFETs), require breakthroughs in both physical and computational metrology. Existing tools are not yet capable of providing the resolution, sensitivity, and accuracy needed to characterize the intricate features and complex structures of these devices. This includes non-destructive techniques for characterizing defects and impurities at the nanoscale.

“There is a secondary challenge with some of the equipment in metrology, which often involves sampling data from single points on a wafer, much like heat test data that only covers specific sites,” says Chaffee. “To be meaningful, we need to move beyond sampling methods and find creative ways to gather information from every wafer, integrating it into a model. This involves building a knowledge base that can help in detecting patterns and correlations, which humans alone might miss. The key is to leverage AI and machine learning to identify these correlations and make sense of them, especially as we push into the 5, 3, and 2nm spaces. This process is iterative and requires a holistic approach, encompassing various data points and correlating them to understand the physical boundaries and the impact on the final product.”

Metrology for advanced packaging: The integration of sophisticated components and novel materials in advanced packaging technologies presents significant metrology challenges. There is a need for rapid, in-situ measurements to verify interfaces, subsurface interconnects, and internal 3D structures. Current methods do not adequately address issues such as warpage, voids, substrate yield, and adhesion, which are critical for the reliability and performance of advanced packages.

Modeling and simulating semiconductor materials, designs, and components: Modeling and simulating semiconductor processes require advanced computational models and data analysis tools. Current capabilities are limited in their ability to seamlessly integrate the entire semiconductor value chain, from materials inputs to system assembly. There is a need for standards and validation tools to support digital twins and other advanced simulation techniques that can optimize process development and control.

“Predictive analytics is particularly important,” says Chaffee. “They aim to determine the probability of any given die on a wafer being the best yielding or presenting issues. By integrating various data points and running different scenarios, they can identify and understand how specific equipment combinations, sequences and processes enhance yields.”

Modeling and simulating semiconductor processes: Current capabilities are limited in their ability to seamlessly integrate the entire semiconductor value chain, from materials inputs to system assembly. There is a need for standards and validation tools to support digital twins and other advanced simulation techniques that can optimize process development and control.

“Part of the problem comes from the back-end packaging and assembly process, but another part of the problem can originate from the quality of the wafer itself, which is determined during the front-end process,” says PDF’s Yu. “An effective ML model needs to incorporate both front-end and back-end information, including data from equipment sensors, metrology, and structured test information, to make accurate predictions and take proactive actions to correct the process.”

Standardizing new materials and processes: The development of future information and communication technologies hinges on the creation of new standards and validation methods. Current reference materials and calibration services do not meet the requirements for next-generation materials and processes, such as those used in advanced packaging and heterogeneous integration. This gap hampers the industry’s ability to innovate and maintain competitive production capabilities.

Metrology to enhance security and provenance of components and products: With the increasing complexity of the semiconductor supply chain, there is a need for metrology solutions that can ensure the security and provenance of components and products. This involves developing methods to trace materials and processes throughout the manufacturing lifecycle to prevent counterfeiting and ensure compliance with regulatory standards.

“The focus on security and sharing changes the supplier relationship into more of a partnership and less of a confrontation,” says Chaffee. “Historically, there’s always been a concern of data flowing across that boundary. People are very protective about their process, and other people are very protective about their product. But once you start pushing into the deep sub-micron space, those barriers have to come down. The die are too expensive for them not to communicate, but they can still do so while protecting their IP. Companies are starting to realize that by sharing parametric test information securely, they can achieve better yield management and process optimization without compromising their intellectual property.”

Conclusion

Advancements in metrology and testing are pivotal for the semiconductor industry’s continued growth and innovation. The integration of AI/ML, IoT, and big data analytics is transforming how manufacturers approach process control and yield improvement. As adoption of Industry 4.0 grows, the role of metrology will become even more critical in ensuring the efficiency, quality, and reliability of semiconductor devices. And by leveraging these advanced technologies, semiconductor manufacturers can achieve higher yields, reduce costs, and maintain the precision required in this competitive industry.

With continuous improvements and the integration of smart technologies, the semiconductor industry will keep pushing the boundaries of innovation, leading to more robust and capable electronic devices that define the future of technology. The journey toward a fully realized Industry 4.0 is ongoing, and its impact on semiconductor manufacturing undoubtedly will shape the future of the industry, ensuring it stays at the forefront of global technological advancements.

“Anytime you have new packaging technologies and process technologies that are evolving, you have a need for metrology,” says Perkins. “When you are ramping up new processes and need to make continuous improvements for yield, that is when you see the biggest need for new metrology solutions.”

The post Metrology And Inspection For The Chiplet Era appeared first on Semiconductor Engineering.

Predicting And Preventing Process Drift

Semiconductor Engineering

Od: Gregory Haley

22. Duben 2024 v 09:05

Increasingly tight tolerances and rigorous demands for quality are forcing chipmakers and equipment manufacturers to ferret out minor process variances, which can create significant anomalies in device behavior and render a device non-functional.

In the past, many of these variances were ignored. But for a growing number of applications, that’s no longer possible. Even minor fluctuations in deposition rates during a chemical vapor deposition (CVD) process, for example, can lead to inconsistencies in layer uniformity, which can impact the electrical isolation properties essential for reliable circuit operation. Similarly, slight variations in a photolithography step can cause alignment issues between layers, leading to shorts or open circuits in the final device.

Some of these variances can be attributed to process error, but more frequently they stem from process drift — the gradual deviation of process parameters from their set points. Drift can occur in any of the hundreds of process steps involved in manufacturing a single wafer, subtly altering the electrical properties of chips and leading to functional and reliability issues. In highly complex and sensitive ICs, even the slightest deviations can cause defects in the end product.

“All fabs already know drift. They understand drift. They would just like a better way to deal with drift,” said David Park, vice president of marketing at Tignis. “It doesn’t matter whether it’s lithography, CMP (chemical mechanical polishing), CVD or PVD (chemical/physical vapor deposition), they’re all going to have drift. And it’s all going to happen at various rates because they are different process steps.”

At advanced nodes and in dense advanced packages, where a nanometer can be critical, controlling process drift is vital for maintaining high yield and ensuring profitability. By rigorously monitoring and correcting for drift, engineers can ensure that production consistently meets quality standards, thereby maximizing yield and minimizing waste.

“Monitoring and controlling hundreds of thousands of sensors in a typical fab requires the ability to handle petabytes of real-time data from a large variety of tools,” said Vivek Jain, principal product manager, smart manufacturing at Synopsys. “Fabs can only control parameters or behaviors they can measure and analyze. They use statistical analysis and error budget breakdowns to define upper control limits (UCLs) and lower control limits (LCLs) to monitor the stability of measured process parameters and behaviors.”

Dialing in legacy fabs
In legacy fabs — primarily 200mm — most of the chips use 180nm or older process technology, so process drift does not need to be as precisely monitored as in the more advanced 300mm counterparts. Nonetheless, significant divergence can lead to disparities in device performance and reliability, creating a cascade of operational challenges.

Manufacturers operating at older technology nodes might lack the sophisticated, real-time monitoring and control methods that are standard in cutting-edge fabs. While the latter have embraced ML to predict and correct for drift, many legacy operations still rely heavily on periodic manual checks and adjustments. Thus, the management of process drift in these settings is reactive rather than proactive, making changes after problems are detected rather than preventing them.

“There is a separation between 300-millimeter and 200-millimeter fabs,” said Park. “The 300-millimeter guys are all doing some version of machine learning. Sometimes it’s called advanced process control, and sometimes it’s actually AI-powered process control. For some of the 200-millimeter fabs with more mature process nodes, they basically have a recipe they set and a bunch of technicians looking at machines and looking at the CDs. When the drift happens, they go through their process recipe and manually adjust for the out-of-control processes, and that’s just what they’ve always done. It works for them.”

For these older fabs, however, the repercussions of process drift can be substantial. Minor deviations in process parameters, such as temperature or pressure during the deposition or etching phases, gradually can lead to changes in the physical structure of the semiconductor devices. Over time, these minute alterations can compound, resulting in layers of materials that deviate from their intended characteristics. Such deviations affect critical dimensions and ultimately can compromise the electrical performance of the chip, leading to slower processing speeds, higher power consumption, or outright device failure.

The reliability equation is equally impacted by process drift. Chips are expected to operate consistently over extended periods, often under a range of environmental conditions. However, when process-induced variability can weaken the device’s resilience, precipitating early wear-out mechanisms and reducing its lifetime. In situations where dependability is non-negotiable, such as in automotive or medical applications, those variations can have dire consequences.

But with hundreds of process steps for a typical IC, eliminating all variability in fabs is simply not feasible.

“Process drift is never going to not happen, because the processes are going to have some sort of side effect,” Park said. “The machines go out of spec and things like pumps and valves and all sorts of things need to be replaced. You’re still going to have preventive maintenance (PM). But if the critical dimensions are being managed correctly, which is typically what triggers the drift, you can go a longer period of time between cleanings or the scheduled PMs and get more capacity.”

Process drift pitfalls
Managing process drift in semiconductor manufacturing presents several complex challenges. Hysteresis, for example, is a phenomenon where the output of a process varies not solely because of current input conditions, but also based on the history of the states through which the process already has passed. This memory effect can significantly complicate precision control, as materials and equipment might not reset to a baseline state after each operational cycle. Consequently, adjustments that were effective in previous cycles may not yield the same outcomes due to accumulated discrepancies.

One common cause of hysteresis is thermal cycling, where repeated heating and cooling create mechanical stresses. Those stresses can be additive, releasing inconsistently based on temperature history. That, in turn, can lead to permanent changes in the output of a circuit, such as a voltage reference, which affects its precision and stability.

In many field-effect transistors (FETs), hysteresis also can occur due to charge trapping. This happens when charges are captured in ‘trap states’ within the semiconductor material or at the interface with another material, such as an oxide layer. The trapped charges then can modulate the threshold voltage of the device over time and under different electrical biases, potentially leading to operational instability and variability in device performance.

Human factors also play a critical role in process drift, with errors stemming from incorrect settings adjustments, mishandling of materials, misinterpretation of operational data, or delayed responses to process anomalies. Such errors, though often minor, can lead to substantial variations in manufacturing processes, impacting the consistency and reliability of semiconductor devices.

“Once in production, the biggest source of variability is human error or inconsistency during maintenance,” said Russell Dover, general manager of service product line at Lam Research. “Wet clean optimization (WCO) and machine learning through equipment intelligence solutions can help address this.”

The integration of new equipment into existing production lines introduces additional complexities. New machinery often features increased speed, throughput, and tighter tolerances, but it must be integrated thoughtfully to maintain the stringent specifications required by existing product lines. This is primarily because the specifications and performance metrics of legacy chips have been long established and are deeply integrated into various applications with pre-existing datasheets.

“From an equipment supplier perspective, we focus on tool matching,” said Dover. “That includes manufacturing and installing tools to be identical within specification, ensuring they are set up and running identically — and then bringing to bear systems, tooling, software and domain knowledge to ensure they are maintained and remain as identical as possible.”

The inherent variability of new equipment, even those with advanced capabilities, requires careful calibration and standardization.

“Some equipment, like transmission electron microscopes, are incredibly powerful,” said Jian-Min Zuo, a materials science and engineering professor at the University of Illinois’ Grainger College of Engineering. “But they are also very finicky, depending on how you tune the machine. How you set it up under specific conditions may vary slightly every time. So there are a number of things that can be done when you try to standardize those procedures, and also standardize the equipment. One example is to generate a curate, like a certain type of test case, where you can collect data from different settings and make sure you’re taking into account the variability in the instruments.”

Process drift solutions
As semiconductor manufacturers grapple with the complexities of process drift, a diverse array of strategies and tools has emerged to address the problem. Advanced process control (APC) systems equipped with real-time monitoring capabilities can extract patterns and predictive insights from massive data sets gathered from various sensors throughout the manufacturing process.

By understanding the relationships between different process variables, APC can predict potential deviations before they result in defects. This predictive capability enables the system to make autonomous adjustments to process parameters in real-time, ensuring that each process step remains within the defined control limits. Essentially, APC acts as a dynamic feedback mechanism that continuously fine-tunes the production process.

Fig. 1: Reduced process drift with AI/ML advanced process control. Source: Tignis

Fig. 1: Reduced process drift with AI/ML advanced process control. Source: Tignis

While APC proactively manages and optimizes the process to prevent deviations, fault detection and classification (FDC) reacts to deviations by detecting and classifying any faults that still occur.

FDC data serves as an advanced early-warning system. This system monitors the myriad parameters and signals during the chip fabrication process, rapidly detecting any variances that could indicate a malfunction or defect in the production line. The classification component of FDC is particularly crucial, as it does more than just flag potential issues. It categorizes each detected fault based on its characteristics and probable causes, vastly simplifying the trouble-shooting process. This allows engineers to swiftly pinpoint the type of intervention needed, whether it’s recalibrating instruments, altering processing recipes, or conducting maintenance repairs.

Statistical process control (SPC) is primarily focused on monitoring and controlling process variations using statistical methods to ensure the process operates efficiently and produces output that meets quality standards. SPC involves plotting data in real-time against control limits on control charts, which are statistically determined to represent the expected normal process behavior. When process measurements stray outside these control limits, it signals that the process may be out of control due to special causes of variation, requiring investigation and correction. SPC is inherently proactive and preventive, aiming to detect potential problems before they result in product defects.

“Statistical process control (SPC) has been a fundamental methodology for the semiconductor industry almost from its very foundation, as there are two core factors supporting the need,” said Dover. “The first is the need for consistent quality, meaning every product needs to be as near identical as possible, and second, the very high manufacturing volume of chips produced creates an excellent workspace for statistical techniques.”

While SPC, FDC, and APC might seem to serve different purposes, they are deeply interconnected. SPC provides the baseline by monitoring process stability and quality over time, setting the stage for effective process control. FDC complements SPC by providing the tools to quickly detect and address anomalies and faults that occur despite the preventive measures put in place by SPC. APC takes insights from both SPC and FDC to adjust process parameters proactively, not just to correct deviations but also to optimize process performance continually.

Despite their benefits, integrating SPC, FDC and APC systems into existing semiconductor manufacturing environments can pose challenges. These systems require extensive configuration and tuning to adapt to specific manufacturing conditions and to interface effectively with other process control systems. Additionally, the success of these systems depends on the quality and granularity of the data they receive, necessitating high-fidelity sensors and a robust data management infrastructure.

“For SPC to be effective you need tight control limits,” adds Dover. “A common trap in the world of SPC is to keep adding control charts (by adding new signals or statistics) during a process ramp, or maybe inheriting old practices from prior nodes without validating their relevance. The result can be millions of control charts running in parallel. It is not a stretch to state that if you are managing a million control charts you are not really controlling much, as it is humanly impossible to synthesize and react to a million control charts on a daily basis.”

This is where AI/ML becomes invaluable, because it can monitor the performance and sustainability of the new equipment more efficiently than traditional methods. By analyzing data from the new machinery, AI/ML can confirm observations, such as reduced accumulation, allowing for adjustments to preventive maintenance schedules that differ from older equipment. This capability not only helps in maintaining the new equipment more effectively but also in optimizing the manufacturing process to take full advantage of the technological upgrades.

AI/ML also facilitate a smoother transition when integrating new equipment, particularly in scenarios involving ‘copy exact’ processes where the goal is to replicate production conditions across different equipment setups. AI and ML can analyze the specific outputs and performance variations of the new equipment compared to the established systems, reducing the time and effort required to achieve optimal settings while ensuring that the new machinery enhances production without compromising the quality and reliability of the legacy chips being produced.

AI/ML
Being more proactive in identifying drift and adjusting parameters in real-time is a necessity. With a very accurate model of the process, you can tune your recipe to minimize that variability and improve both quality and yield.

“The ability to quickly visualize a month’s worth of data in seconds, and be able to look at windows of time, is a huge cost savings because it’s a lot more involved to get data for the technicians or their process engineers to try and figure out what’s wrong,” said Park. “AI/ML has a twofold effect, where you have fewer false alarms, and just fewer alarms in general. So you’re not wasting time looking at things that you shouldn’t have to look at in the first place. But when you do find issues, AI/ML can help you get to the root cause in the diagnostics associated with that much more quickly.”

When there is a real alert, AI/ML offers the ability to correlate multiple parameters and inputs that are driving that alert.

“Traditional process control systems monitor each parameter separately or perform multivariate analysis for key parameters that require significant effort from fab engineers,” adds Jain. “With the amount of fab data scaling exponentially, it is becoming humanly impossible to extract all the actionable insights from the data. Machine learning and artificial intelligence can handle big data generated within a fab to provide effective process control with minimal oversight.”

AI/ML also can look for more other ways of predicting when the drift is going to take your process out of specification. Those correlations can be bivariate and multivariate, as well as univariate. And a machine learning engine that is able to sift through tremendous amounts of data and a larger number of variables than most humans also can turn up some interesting correlations.

“Another benefit of AI/ML is troubleshooting when something does trigger an alarm or alert,” adds Park. “You’ve got SPC and FDC that people are using, and a lot of them have false positives, or false alerts. In some cases, it’s as high as 40% of the alerts that you get are not relevant for what you’re doing. This is where AI/ML becomes vital. It’s never going to take false alerts to zero, but it can significantly reduce the amount of false alerts that you have.”

Engaging with these modern drift solutions, such as AI/ML-based systems, is not mere adherence to industry trends but an essential step towards sustainable semiconductor production. Going beyond the mere mitigation of process drift, these technologies empower manufacturers to optimize operations and maintain the consistency of critical dimensions, allowed by the intelligent analysis of extensive data and automation of complex control processes.

Conclusion
Monitoring process drift is essential for maintaining quality of the device being manufactured, but it also can ensure that the entire fabrication lifecycle operates at peak efficiency. Detecting and managing process drift is a significant challenge in volume production because these variables can be subtle and may compound over time. This makes identifying the root cause of any drift difficult, particularly when measurements are only taken at the end of the production process.

Combating these challenges requires a vigilant approach to process control, regular equipment servicing, and the implementation of AI/ML algorithms that can assist in predicting and correcting for drift. In addition, fostering a culture of continuous improvement and technological adaptation is crucial. Manufacturers must embrace a mindset that prioritizes not only reactive measures, but also proactive strategies to anticipate and mitigate process drift before it affects the production line. This includes training personnel to handle new technologies effectively and to understand the dynamics of process control deeply. Such education enables staff to better recognize early signs of drift and respond swiftly and accurately.

Moreover, the integration of comprehensive data analytics platforms can revolutionize how fabs monitor and analyze the vast amounts of data they generate. These platforms can aggregate data from multiple sources, providing a holistic view of the manufacturing process that is not possible with isolated measurements. With these insights, engineers can refine their process models, enhance predictive maintenance schedules, and optimize the entire production flow to reduce waste and improve yields.

Related Reading
Tackling Variability With AI-Based Process Control
How AI in advanced process control reduces equipment variability and corrects for process drift.

The post Predicting And Preventing Process Drift appeared first on Semiconductor Engineering.

UCIe Goes Back To The Drawing Board

Semiconductor Engineering

Od: Gregory Haley

22. Únor 2024 v 09:08

The integration of multiple dies within a single package increasingly is viewed as the next evolution for extending Moore’s Law, but it also presents myriad challenges — particularly in achieving a universally accepted standard integrating plug-and-play chiplets from different vendors.

“In some respects, people are already doing this,” says Debendra Das Sharma, Intel senior fellow and chair of the UCIe Consortium. “They’re putting multiple dies on the same package, and we’ve been doing it for decades back to what was multi-chip modules (MCMs). And if you look in our mainstream CPUs today, they’re all multiple chips on the same package.”

Combining more than one chip in a package becomes a lot more complicated, however, when those chips have different functions or come from different vendors or foundries. That’s where a standard like UCIe becomes necessary.

“For most of the multi-chip products that are in the market, the same company is designing and providing the multiple dies, so they know exactly how they talk to each other and how to divide or partition the chip,” says Vik Chaudhry, senior director of product marketing and business development at Amkor. “That makes it a little easier to understand how one part talks to the other. What UCIe is trying to do is standardize that interconnect between multiple vendors.”

While other protocols like Bunch of Wires (BoW) have made significant strides in recent years and are still being developed, UCIe stands out for its backing by many of the largest chip manufacturers and its support for all major packaging technologies, including organic substrates, silicon interposers, and RDL fan-outs.

Fig. 1: Chiplet diagram with UCIe interconnect highlighted. Source: Keysight

But the move toward UCIe compatibility necessitates more than a mere afterthought in the chip creation process. It requires a foundational shift back to the drawing board, where compatibility must be conceived as an integral component of the chip, not retrofitted as an expedient solution. As this standard evolves, it has become increasingly apparent that for chiplets to truly embrace UCIe, the blueprint for their design must be reimagined from the ground up.

“UCIe is a layout,” says Chaudry. “It’s designed. But keep in mind, these chiplets can be from different fab nodes. One could be 5nm, another could be 3nm, and third could be 14nm. Somehow you have to connect these dies together. You need to be compatible in terms of how much space you have to run the routes, and that’s what UCIe is addressing.”

The transition to UCIe is not merely about different vendors adapting to a new standard. It requires a willingness among manufacturers throughout the industry to align their design and production processes with a common protocol that is still, in many respects, a work in progress.

While it is commonly assumed that chiplets plus advanced packaging represent the next evolution for extending Moore’s Law, the lack of a fully defined standard, coupled with the uncertainty surrounding the integration with existing technologies, means investing in new designs for UCIe is currently limited to the largest players in the market.

“Anytime you put multiple dies on a substrate or interposer, it’s challenging,” adds Chaudhry. “As we are seeing AI come into the picture, we are seeing a lot of vendors putting multiple dies on a chip, and not just 3 or 4, but 8, 10, or 12 dies. The complexity exponentially grows as you have more and more dies on the same interposer or substrate. You also have to test everything in between, and that increases the complexity and cost. That’s a huge challenge for anybody, and right now only a few companies in the world are capable of committing those kinds of resources and those kinds of expenses to put a line together.”

Moreover, the adoption of UCIe still must overcome significant hurdles in terms of scalability, compatibility with existing systems, and ensuring that the cost implications do not outweigh the benefits.

The chiplet evolution
Large chipmakers have been constrained by the size of the reticle field for at least the last several process nodes, which sharply limited the number of features that could be crammed onto a planar SoC. Today, with node shrinks becoming more costly and challenging, the best solution available is to decompose the SoC into individual blocks, or chiplets.

“Once the dies become really big, you’re up against the reticle limit,” says Intel’s Das Sharma. “That’s where you will see a lot of people deploying chiplets. You’re basically having multiple sets of chips being packaged together to deliver a certain set of functionality.”

Take, for instance, the leap to 50 Tb per second switches that are challenging the limits of reticle size. There’s a growing need to dissect and distribute the functionality of these chips across multiple components. Whether it’s the I/O, memory, or SRAM, the key lies in strategically breaking down the SoC into smaller units. This not only makes the manufacturing process more feasible, but also opens doors to more innovative and efficient design architectures.

It also provides some immediate benefits. Smaller dies yield better than larger ones, which is why in 2012 Xilinx split its 28nm FPGA into four different dies, connected through an interposer. It also provides room to grow, because the individual chiplets are still well below the reticle limit.

But all of the early implementations were homogeneous. They were all developed by the same vendor using the same process technology. A big benefit of advanced packaging is the ability to combine heterogeneous chiplets in the same package, allowing analog circuits and less-critical features to be developed at whatever process node makes sense. This is the challenge facing large chipmakers, foundries, and OSATs today, and it’s one that has not yet been fully solved.

Nevertheless, the chip industry agrees on one thing. There needs to be a common way to connect all of these chiplets together, and this is where UCIe fits into the picture.

The UCIe standard
Achieving a consensus on the electrical characteristics that underpin the UCIe is akin to orchestrating a symphony with a multitude of instruments, each with its own acoustic signature. Ensuring that chiplets from different corners of the industry can connect and communicate efficiently necessitates bridging gaps in voltage levels, signal timing, and power distribution.

In March, 2022, the UCIe consortium released UCIe 1.0 that included specifications for a standardized physical die-to-die interface designed to facilitate seamless communication between chiplets, regardless where they were manufactured or by whom. The specifications encompassed key aspects, such as electrical properties, physical dimensions, and protocols necessary for ensuring compatibility and efficient data transfer between diverse chip components.

“On advanced packages at 45 microns, the numbers are pretty stellar,” says Das Sharma. “You have 188 gigabytes per second per square millimeter as a starting point, up to 1.35 terabytes per second per square millimeter. People will have a hard time even absorbing that kind of bandwidth and processing it.”

UCIe 1.0 uses a layered protocol approach. The physical layer underpins the protocol stack, dedicated to defining and managing electrical signaling, such as clock synchronization and link training, while also incorporating sideband communication channels essential for non-data interactions between chiplets.

At the heart of UCIe’s mechanics is the Die-to-Die (D2D) adapter. This crucial interface acts as the gatekeeper, managing link state and facilitating negotiation parameters for chiplets, crucial for establishing reliable chiplet communication. It optionally extends a safeguard for data integrity through mechanisms like cyclic redundancy check (CRC) and link-level retry capabilities. This not only ensures accuracy in high-speed data transfer, but also aligns different chiplet protocols by providing an arbitration system enabling multiple chips to interact efficiently.

“UCIe is pretty flexible in that way,” says Chaudhry. “It supports your PCIe protocols, XML protocol, or streaming, so you can decide which protocol you want to support. And it has different data rates that it supports. It’s the lowest common denominator that everybody will support. If you’re on a 3nm process, you can support a much higher data rate, but if the other chiplet is at a different process node, then both the parts will support the basic lowest common denominator of the spec, and then you can talk on that.”

UCIe also incorporates strategies to mitigate interconnect defects, such as stuck-at faults and signal discontinuities. Stipulations within UCIe include the implementation of auxiliary pathways, furnishing a means to maintain connectivity if the primary lanes fail. This redundancy helps sustain system functionality by providing avenues for fault tolerance and repair.

UCIe also embraces existing standards such as PCI Express (PCIe) and Compute Express Link (CXL) natively, ensuring a broad resonance across the industry by capitalizing on these well-established protocols. The layered approach of UCIe also encompasses comprehensive usage models.

In August 2023, the consortium published UCIe version 1.1, extending reliability mechanisms to more protocols and supporting additional usage models. These enhancements are not merely incremental. They are geared towards pivotal segments such as automotive, which is gravitating toward chiplets.

One key area where the evolution from UCIe 1.0 to 1.1 becomes evident is in the standard’s preventive monitoring features. UCIe 1.1 expands the protocol with new registers designed to capture detailed Eye Margin information — viewing both width and height — which provides standardized reporting formats and proactive link health monitoring. Rather than reinventing the wheel, UCIe 1.1 leverages the existing periodic parity Flit injection mechanism from version 1.0, enhancing error detection and reporting capabilities through a new error log register. That, in turn, allows for improved assessment of link repair necessities. UCIe 1.1 also offers enhancements for compliance testing.

Another notable aspect is the advent of new and emerging usages, particularly with streaming protocols. Whereas UCIe 1.0’s support for such protocols was restricted to Raw Mode, UCIe 1.1 extends the utility of the die-to-die (D2D) adapter on the FDI interface to streaming protocols. This extension enables a blend of CRC retry power management features and facilitates the coexistence of multiple protocols.

UCIe 1.1 also considers cost optimization for advanced packaging solutions in anticipation of shrinking bump pitches and the advent of 3D integration. The introduction of additional column arrangements in UCIe 1.1 creates broader opportunities for mix-and-match dies.

“In a chiplet environment, the dies are very close to each other and your shoreline is very limited,” says Chaudhry. “You have limited space to connect the dies, and how the number of pins are connected, facing each other, that becomes critical. That is one thing that UCIe is addressing. What should be the pin location? Whether it’s 6-, 8-, or 16-column, how do you arrange it so that when one vendor has an 8-column configuration, they can talk to one with a 12 column configuration and connect to it physically, not just in terms of pins, but also connectivity and shoreline compatibility?”

Designing for interoperability
There are a number of technical hurdles that still stand in the way of the widespread adoption of UCIe. These include a need for precise electrical conformity, predictable signaling realms, and systematic physical interconnects catering to a variety of nodes and manufacturing processes.

“You can also have HBM in there, which can be very tall compared to a single ASIC,” says Amkor’s Chaudhry. “How do you address those height differences? A lot of different issues come into play when you’re putting different dies and different chiplets together.”

Thermal management is also a key element for high-density packaging. Disparate process nodes inevitably present distinct power profiles and heat dissipation characteristics. Bridging these gaps necessitates innovative heat distribution methodologies and sophisticate warpage control to ensure structural integrity and reliable function in complex modules.

“There are a lot of challenges in thermal,” adds Chaudhry. “When you have two dies from different process nodes, how do you make sure that you have a way to dissipate the power equally? Those are some of the challenges as we go along and there’s no general solution to that yet. Those are kind of things that the consortium is looking at right now.”

Continued evolution
Another goal of the UCIe consortium is to ensure that anyone developing a chiplet today will still be able to use that design five years from now, despite progress in the standard during that time.

“It will absolutely evolve,” adds Chaudhry. “PCI did the same thing. They are on Gen 5 or Gen 6 now. USB is the same way with USB 4.0 coming soon. CXL is at 3.1. We expect the same thing to happen to UCIe. It will continuously improve and come up with new and more flexible solutions that our members can adopt.”

“The more people get involved, the more they’re going to start tweaking things,” adds Das Sharma. “Some of them are not going to work out, and some of them are going to work out really well. This is a multi-decade journey, and the key is to learn and adapt and keep moving on.”

Conclusion
The UCIe initiative aims to revolutionize chip package interconnectivity by emulating the success of Peripheral Component Interconnect Express (PCIe) at the PCB level. By facilitating direct inter-die connections within the chip package, UCIe endeavors to drastically cut power usage, enhance bandwidth efficiency, and, ultimately, reduce production costs.

“The good thing about UCIe is that it’s an open standard,” says Chaudhry. “In all, there are about 120 members, and all of them are working together. There are six different working groups that range from mechanical to electrical to security to software and marketing, where they are bringing up new things as they are developing their chiplet-based designs. A lot of things that have happened between UCIe 1.0 and 1.1 are basically due to their input.”

—Ed Sperling contributed to this report.

The post UCIe Goes Back To The Drawing Board appeared first on Semiconductor Engineering.