Zobrazení pro čtení

Jsou dostupné nové články, klikněte pro obnovení stránky.

Achieving Zero Defect Manufacturing Part 2: Finding Defect Sources

Od: Prasad Bachiraju

6. Srpen 2024 v 09:07

Semiconductor manufacturing creates a wealth of data – from materials, products, factory subsystems and equipment. But how do we best utilize that information to optimize processes and reach the goal of zero defect manufacturing?

This is a topic we first explored in our previous blog, “Achieving Zero Defect Manufacturing Part 1: Detect & Classify.” In it, we examined real-time defect classification at the defect, die and wafer level. In this blog, the second in our three-part series, we will discuss how to use root cause analysis to determine the source of defects. For starters, we will address the software tools needed to properly conduct root cause analysis for a faster understanding of visual, non-visual and latent defect sources.

About software

The software platform fabs choose impacts how well users are able to integrate data, conduct database analytics and perform server-side and real-time analytics. Manufacturers want the ability to choose a platform that can scale by data volume, type and multisite integration. In addition, all of this data – whether it is coming from metrology, inspection or testing – must be normalized before fabs can apply predictive modeling and machine learning based analytics to find the root cause of defects and failures. This search, however, goes beyond a simple examination of process steps and tools; manufacturers also need a clear understanding of each device’s genealogy. In addition, fabs should employ an AI-based yield optimizer capable of running multiple models and offering potential optimization measures that can be taken in the factory to improve the process.

Now that we have discussed software needs, we will turn our attention to two use cases to further our examination of root cause analysis in zero defect manufacturing.

Root Cause Case No. 1

The first root cause value case we would like to discuss involves the integration of wafer probe, photoluminescence and epitaxial (epi) data. Previously, integrating these three kinds of data was not possible because the identification for wafers and lots – pre- and post-epi – were generally not linked. Wafers and lots were often identified by entirely different names before and after the epi step. For reasons that do not need to be explained, this was a huge hindrance to advancing the goal of zero defect manufacturing because the impact of the epi process on yield was not detected in a timely manner, resulting in higher defectivity and yield loss.

But the challenge is not as simple as identification and naming practices. Typical wafer ID trackers are not applied prior to the post-epi step because of technical and logistical constraints. The solution is for fabs to employ defect and yield analytics software that will enable genealogy that can link data from the epi and pre-epi processes to post-epi processes. The real innovation occurs when the genealogical information is normalized and interpolated with electrical test data. Once integrated, this data offers users a more complete understanding of where yield limiting events are occurring.

Fig. 1: Photoluminescence map (left) and electrical test performance by epi tool (right).

For example, let us consider the following scenario: in figure 1 (left) we show a group of dies that negatively affect performance on the upper left edge of the wafer. Through more traditional measures, this pocket of defectivity may have gone unnoticed, allowing for bad die to move forward in the process. But by applying integrated data, genealogical information and electrical test data, this trouble-plagued area was identified down to the epi tool and chamber (figure 1, right), and the defective material was prevented from going forward in the process. As significant as this is, with the right software platform this approach enables root cause analysis to be conducted in minutes, not days.

Now, onto the second use case in which we look at how to problem solve within the supply chain.

Root Cause Case No. 2

During final test and measurement, chips sometimes fail. In many cases, the faulty chips were previously determined to be good chips and were advanced forward in the process as a result of combining multiple chips coming from different products, lots, or wafers. The important thing here is to understand why this happens.

When there is a genealogy model in a yield software platform, fabs are able to pick the lots and wafers where bad chips come from and then run this information through pattern analysis software. In one particular scenario (figure 2), users were able to apply pattern analysis software to discover that all of the defective die arose from a spin coater issue, in this case, a leak negatively impacting the underbump metallization area following typical preventive maintenance measures.

To compensate for this, the team used integrated analytics to create a fault detection and classification (FDC) model to identify similar circumstances going forward. In this case, the FDC model monitors the suction power of the spin coater. If suction power for more than 10 consecutive samples are above the set limit, alarms are triggered and an appropriate Out of Control Action Plan (OCAP) process is executed that includes notification to tool owner.

Fig. 2: Proactive zero defect manufacturing at-a-glance.

The above explains how fabs are able to turn reactive root cause analytics into proactive monitoring. With such an approach, manufacturers can monitor for this and other issues and avoid the advancement of future defective die. Furthermore, the number of defect signatures that can be monitored inline can be as high as 40 different signatures, if not more. And in case these defects are missed at the process level, they can be identified at the inspection level or post-inspection, avoiding hundreds of issues further along in the process.

Conclusion

Zero defect manufacturing is not so much of a goal as it is a commitment to root out defects before they happen. To accomplish this, fabs need a wealth of data from the entire process to achieve a clear picture of what is going wrong, where it is going wrong and why it is going wrong. In this blog, we offered specific scenarios where root cause analysis was used to find defects across wafers and dies. However, these are just a few examples of how software can be used to find difficult-to-find defects. It can be beneficial in many different areas across the entire process, with each application further strengthening a fab’s efforts to employ a zero defect manufacturing approach, increasing yield and meeting the stringent requirements of some of the industry’s most advanced customers.

In our next blog, we will discuss how to detect dormant defects, use feedback and feedforward measures, and monitor the health of process control equipment. We hope you join us as we continue to explore methods for achieving zero defect manufacturing.

The post Achieving Zero Defect Manufacturing Part 2: Finding Defect Sources appeared first on Semiconductor Engineering.

Are You Ready For HBM4? A Silicon Lifecycle Management (SLM) Perspective

Semiconductor Engineering

Od: Faisal Goriawalla

6. Srpen 2024 v 09:05

Many factors are driving system-on-chip (SoC) developers to adopt multi-die technology, in which multiple dies are stacked in a three-dimensional (3D) configuration. Multi-die systems may make power and thermal issues more complex, and they have required major innovations in electronic design automation (EDA) implementation and test tools. These challenges are more than offset by the advantages of over traditional 2D designs, including:

Reducing overall area
Achieving much higher pin densities
Reusing existing proven dies
Mixing heterogeneous die technologies
Quickly creating derivative designs for new applications

One of the most common uses of multi-die design is the interconnection of memory stacks and processors such as CPUs and GPUs. High Bandwidth Memory (HBM) is a standard interface specifically for 3D-stacked DRAM dies. It was defined by the JEDEC Solid State Technology Association in 2013, followed by HBM2 in 2016 and HBM3 in 2022. Many multi-die projects have used this standard for caches in advanced CPUs and other system-on-chip (SoC) designs used in high-end applications such as data centers, high-performance computing (HPC), and artificial intelligence (AI) processing.

Fig. 1: Example of a current HBM-based SoC.

JEDEC recently announced that it is nearing completion of HBM4 and published preliminary specifications. HBM4 has been developed to enhance data processing rates while maintaining higher bandwidth, lower power consumption, and increased capacity per die/stack. The initial agreement calls for speed bins up to 6.4 Gbps, although this will increase as memory vendors develop new chips and refine the technology. This speed will benefit applications that require efficient handling of large datasets and complex calculations.

HBM4 is introducing a doubled channel count per stack over HBM3. The new version of the standard features a 2048-bit memory interface, as compared to 1024 bits in previous versions, as shown in figure 1. This intent is to double the number of bits without increasing the footprint of HBM memory stacks, thus doubling the interconnection density as well.

Different memory configurations will require various interposers to accommodate the differing footprints. HBM4 will specify 24 Gb and 32 Gb layers, with options for supporting 4-high, 8-high, 12-high and 16-high TSV stacks. As an example configuration, a 16-high based on 32 Gb layers will offer a capacity of 64 GB, which means that a processor with four memory modules can support 256 GB of memory with a peak bandwidth of 6.56 TB/s using an 8,192-bit interface.

The move from HBM3 to HBM4 will require further evolution in multi-die support across a wide range of EDA tools. The 2048-bit memory interface requires a significant increase in the number of through-silicon vias (TSVs) routed through a memory stack. This will mean shrinking the external bump pitch as the total number of micro bumps increases significantly. In addition, support for 16-high TSV stacks brings new complexity in wiring up an even larger number of DRAM dies without defects.

Test challenges are likely to be a dominant part of the transition. Any signal integrity issues after assembly and multi-die packaging become more difficult to diagnose and debug since probing is not feasible. Further, some defects may marginally pass production/manufacturing test but subsequently fail in the field. Thus, test of the future HBM4-based subsystem needs to be accomplished not just at production test but also in-system to account for aging-related defects.

Being able to monitor real-time data during mission mode operation in the field is greatly preferable to having to take the system offline for unplanned service. This “predictive maintenance” allows the end user to be proactive rather than reactive. HBM provides capabilities for in-system repair, for example swapping out a bad lane. Even if a defect requires physical hardware repair, detecting it before system failure enables scheduled maintenance rather than unplanned downtime.

As shown in figure 1, HBM systems typically have a base die that includes an HBM controller, a basic/fixed test engine provided by the DRAM vendor, and Direct Access (DA) ports. The new industry trend is for the base die to be manufactured on a standard logic process rather than the DRAM process. The SoC designer should include in the base die a flexible built-in self-test (BIST) engine that allows different algorithms to be used to trade off high coverage versus test time depending on the scenario.

This engine must be programmable to handle different latencies, address ranges, and timing of test operations that vary across DRAM vendors. It may also need to support post-package repair (PPR) for HBM DRAM to delay any “truck roll-out” for in-field service. The diagnostics performed by the BIST engine must be precise, showing the failing bank, row address, column address, etc. if there is a defect detected in the DRAM stack. Figure 2 shows an example.

Fig. 2: Example fault diagnosis for HBM stack.

As an industry leader in multi-die EDA and IP solutions, Synopsys provides all the technology needed for HBM manufacturing yield optimization and in-field silicon health monitoring. Signal Integrity Monitors (SIMs) are embedded in physical layer (PHY) IP blocks for on-demand signal quality measurement for interconnects. This allows users to create 1D eye diagrams for interconnect signals during both production test and in-field operation. SIMs measure timing margins, enable HBM lane test/repair, and mitigate against silent data corruption (SDC), part of an effective silicon lifecycle management (SLM) solution.

Synopsys SMS ext-RAM is a programmable and synthesizable engine that performs test, repair, and diagnostics for memory systems, including HBM. SMS ext-RAM ensures high test coverage and supports power-on self-test (POST) with the flexibility to run custom memory algorithms in-field. As shown in figure 2, it detects a wide range of defects in memory dies, including stuck-at faults, read destructive faults, write destructive faults, deceptive read destructive faults, and row hammering.

A real world case study of a project using HBM with the Synopsys solutions is available. These solutions are scaling to support the emerging HBM4 standard, ensuring continued success.

The post Are You Ready For HBM4? A Silicon Lifecycle Management (SLM) Perspective appeared first on Semiconductor Engineering.

Semiconductor Shifts In Automotive: Impact Of EV And ADAS Trends

Semiconductor Engineering

Od: Fisher Zhang

6. Srpen 2024 v 09:03

The integration of advanced driver assistance systems (ADAS) and the transition towards electric vehicles (EVs) are significantly transforming the automotive industry.

Modern vehicles, essentially computers on wheels, require substantially more semiconductors. In response, carmakers are forming stronger partnerships with semiconductor vendors – some are taking a page from tech giants like Apple and Samsung by designing their own chips, often following a fabless or outsourced production model.

While a deeper connection with semiconductor design helps automakers maintain design control and supply chain resilience, it also imposes substantial responsibility to understand and meet stringent automotive quality standards.

The crucial role of semiconductor testing

Testing is vital to meet the automotive industry’s demands for quality, cost-efficiency, and timely market entry. As carmakers delve into semiconductor design, they face new challenges. Advanced semiconductors, more complex by nature, require thorough testing to ensure automotive-grade quality.

The industry’s push towards smaller process nodes, like 5nm and below, further amplifies these challenges, necessitating early and continuous engagement with testing resources to maintain high standards without compromising time to market.

Zero defects commitment

The automotive industry’s commitment to zero defects underscores the critical importance of quality. This commitment is based on an analysis of the costs associated with testing versus the potentially catastrophic costs of failures, such as life-threatening malfunctions, costly recalls, and market delays.

These issues can dramatically impact revenue and market position, highlighting the need for rigorous testing. The exceptional quality requirements inherent to automotive standards are set to intensify with the increasing digital complexity of vehicles.

Given that automotive chips must perform reliably over a lifespan of 10 to 20 years, comprehensive testing protocols play an essential role in identifying and rectifying defects early, optimizing both cost and quality. This fundamental aspect of semiconductor manufacturing cements the principle that quality is not just a priority, but the paramount concern.

This commitment transcends the capabilities of even the most skilled engineers, requiring systematic and integrated testing processes to ensure chip reliability and performance under diverse conditions.

Collaboration is key

Collaboration between automakers and semiconductor manufacturers is crucial, fostering an environment where issues can be identified and addressed early in the development cycle.

These partnerships are vital for maintaining momentum in the face of rapid technological advancements and ensuring that the automotive industry can meet the high standards of safety, reliability, and performance expected by consumers.

This collaborative approach helps to optimize testing processes, to maintain stringent quality standards, and to protect time-to-market goals, preventing production delays and ensuring the continuous advancement of automotive technologies.

The post Semiconductor Shifts In Automotive: Impact Of EV And ADAS Trends appeared first on Semiconductor Engineering.

Leveraging AI To Efficiently Test AI Chips

Semiconductor Engineering

Od: Advantest

6. Srpen 2024 v 09:01

In the fast-paced world of technology, where innovation and efficiency are paramount, integrating artificial intelligence (AI) and machine learning (ML) into the semiconductor testing ecosystem has become of critical importance due to ongoing challenges with accuracy and reliability. AI and ML algorithms are used to identify patterns and anomalies that might not be discovered by human testers or traditional methods. By leveraging these technologies, companies can achieve higher accuracy in defect detection, ensuring that only the highest quality semiconductors reach the market. In addition, the industry is clamoring for increased efficiency and speed because AI-driven testing can significantly accelerate the testing process, analyzing vast amounts of data at speeds unattainable by human testers. This enables quicker turnaround times from design to production, helping companies meet market demands more effectively and stay ahead of competitors. Firms are also heavily invested in reducing costs. While the initial investment in AI/ML technology can be expansive, the long-term savings are irrefutable. With automated routine and complex testing processes, companies can reduce labor costs and minimize human error. Equally important, AI-enhanced testing can better predict potential failures before they occur, saving costs related to recalls and repairs.

The industry is now moving to chiplet-based modules, using a “Lego-like” approach to integrate CPU, GPU, cache, I/O, high-bandwidth memory (HBM), and other functions. In the rapidly evolving world of chiplets, the DUT is a complex multichip system with the integration of many devices in a single 2.5D or 3D package. Consequently, the tester can only access a subset of individual device pins. Even so, at each test insertion, the tester must be able to extract valuable data that is then used to optimize the current test insertion as well as other design, manufacturing, and test steps. With limited pin access, the tester must infer what is happening on unobservable nodes. To best achieve this goal, it is important to extract the most value possible out of the data that can be directly collected across all manufacturing and test steps, including data from on-chip sensors. The test flow in the chiplet world already includes PSV, wafer acceptance test (WAT), wafer sort (WS), final test (FT), burn-in, and SLT, and additional test insertions to account for the increased complexity of a package with multiple chiplets are not feasible from a cost perspective. Adding to the challenge, binning goes from performance-based to application-based. In this world, the tester must stay ahead of the system – the tester must be smarter than the complex system-under-test.

The ACS RTDI platform accelerates data analytics and AI/ML decision-making.

So, for these reasons and many more, the adoption of edge compute for ML test applications is well underway. Advantest’s ACS Real-Time Data Infrastructure (ACS RTDI) platform accelerates data analytics and AI/ML decision-making within a single integrated platform. It collects, analyzes, stores, and monitors semiconductor test data as well as data sources across the IC manufacturing supply chain while employing low-latency edge computing and analytics in a secure zero-trust environment. ACS RTDI minimizes the need for human intervention, streamlining overall data utilization across multiple insertions to boost quality, yield, and operational efficiencies. It includes Advantest’s ACS Edge HPC server, which works in conjunction with its V93000 and other ATE systems to handle computationally intensive workloads adjacent to the tester’s host controller.

A reliable, secure real-time data structure that integrates data sources across the IC manufacturing supply chain.

In this configuration, the ACS Edge provides low, consistent, and predictable latency compared with a data center-hosted alternative. It supports a user execution environment independent of the tester host controller to ease development and deployment. It also provides a reliable and secure real-time data infrastructure that integrates all data sources across the entire IC manufacturing supply chain, applying analytics models that enable real-time decision-making during production test.

The post Leveraging AI To Efficiently Test AI Chips appeared first on Semiconductor Engineering.