3.5D: The Great Compromise

The semiconductor industry is converging on 3.5D as the next best option in advanced packaging, a hybrid approach that includes stacking logic chiplets and bonding them separately to a substrate shared by other components.

This assembly model satisfies the need for big increases in performance while sidestepping some of the thorniest issues in heterogeneous integration. It establishes a middle ground between 2.5D, which already is in widespread use inside of data centers, and full 3D-ICs, which the chip industry has been struggling to commercialize for the better part of a decade.

A 3.5D architecture offers several key advantages:

  • It creates enough physical separation to effectively address thermal dissipation and noise.
  • It provides a way to add more SRAM into high-speed designs. SRAM has been the go-to choice for processor cache since the mid-1960s, and remains an essential element for faster processing. But SRAM no longer scales at the same rate as digital transistors, so it is consuming more real estate (in percentage terms) at each new node. And because the size of a reticle is fixed, the best available option is to add area by stacking chiplets vertically.
  • By thinning the interface between processing elements and memory, a 3.5D approach also can shorten the distances that signals need to travel and greatly improve processing speeds well beyond a planar implementation. This is essential with large language models and AI/ML, where the amount of data that needs to be processed quickly is exploding.

Chipmakers still point to fully integrated 3D-ICs as the best performing alternative to a planar SoC, but packing everything into a 3D configuration makes it harder to deal with physical effects. Thermal dissipation is probably the most difficult to contend with. Workloads can vary significantly, creating dynamic thermal gradients and trapping heat in unexpected places, which in turn reduce the lifespan and reliability of chips. On top of that, power and substrate noise become more problematic at each new node, as do concerns about electromagnetic interference.

“What the market has adopted first is high-performance chips, and those produce a lot of heat,” said Marc Swinnen, director of product marketing at Ansys. “They have gone for expensive cooling systems with a huge number of fans and heat sinks, and they have opted for silicon interposers, which arguably are some of the most expensive technologies for connecting chips together. But it also gives the highest performance and is very good for thermal because it matches the coefficient of thermal expansion. Thermal is one of the big reasons that’s been successful. In addition to that, you may want bigger systems with more stuff that you can’t fit on one chip. That’s just a reticle-size limitation. Another is heterogeneous integration, where you want multiple different processes, like an RF process or the I/O, which don’t need to be in 5nm.”

A 3.5D assembly also provides more flexibility to add processor cores, and higher yield, because known good die can be manufactured and tested separately, a concept Xilinx pioneered in 2011 at 28nm.

3.5D is a loose amalgamation of all these approaches. It can include two to three chiplets stacked on top of each other, or even multiple stacks laid out horizontally.

“It’s limited vertical, and not just for thermal reasons,” said Bill Chen, fellow and senior technical advisor at ASE Group. “It’s also for performance reasons. But thermal is the limiting factor, and we’ve talked about many different materials to help with that — diamond and graphene — but that limitation is still there.”

This is why the most likely combination, at least initially, will be processors stacked on SRAM, which simplifies the cooling. The heat generated by high utilization of different processing elements can be removed with heat sinks or liquid cooling. And with one or more thinned out substrates, signals will travel shorter distances, which in turn uses less power to move data back and forth between processors and memory.

“Most likely, this is going to be logic over memory on a logic process,” said Javier DeLaCruz, fellow and senior director of Silicon Ops Engineering at Arm. “These are all contained within an SoC normally, but a portion of that is going to be SRAM, which does not scale very well from node to node. So having logic over memory and a logic process is really the winning solution, and that’s one of the better use cases for 3D because that’s what really shortens your connectivity. A processor generally doesn’t talk to another processor. They talk to each other through memory, so having the memory on a different floor with no latency between them is pretty attractive.”

The SRAM doesn’t necessarily have to be at the same advanced node as the processors, which also helps with yield and reliability. At a recent Samsung Foundry event, Taejoong Song, the company’s vice president of foundry business development, showed a roadmap of a 3.5D configuration using a 2nm chiplet stacked on a 4nm chiplet next year, and a 1.4nm chiplet on top of a 2nm chiplet in 2027.


Fig. 1: Samsung’s heterogeneous integration roadmap showing stacked DRAM (HBM), chiplets and co-packaged optics. Source: Samsung Foundry

Intel Foundry’s approach is similar in many ways. “Our 3.5D technology is implemented on a substrate with silicon bridges,” said Kevin O’Buckley, senior vice president and general manager of Foundry Services at Intel. “This is not an incredibly costly, low-yielding, multi-reticle form-factor silicon, or even RDL. We’re using thin silicon slices in a much more cost-efficient fashion to enable that die-to-die connectivity — even stacked die-to-die connectivity — through a silicon bridge. So you get the same advantages of silicon density, the same SI (signal integrity) performance of that bridge without having to put a giant monolithic interposer underneath the whole thing, which is both cost- and capacity-prohibitive. It’s working. It’s in the lab and it’s running.”


Fig. 2: Intel’s 3.5D model. Source: Intel

The strategy here is partly evolutionary — 3.5D has been in R&D for at least several years — and partly revolutionary, because thinning the interconnect layers, handling those thinner layers, and bonding them reliably are still works in progress. There is a potential for warping, cracking, or other latent defects, and dynamically configuring data paths to maximize throughput is an ongoing challenge. But there have been significant advances in thermal management on two- and three-chiplet stacks.

“There will be multiple solutions,” said C.P. Hung, vice president of corporate R&D at ASE. “For example, besides the device itself and an external heat sink, a lot of people will be adding immersion cooling or local liquid cooling. So for the packaging, you can probably also expect to see the implementation of a vapor chamber, which will add a good interface from the device itself to an external heat sink. With all these challenges, we also need to target a different pitch. For example, nowadays you see mass production with a 45 to 40 pitch. That is a typical bumping solution. We expect the industry to move to a 25 to 20 micron bump pitch. Then, to go further, we need hybrid bonding, which is a less than 10 micron pitch.”
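The pitch numbers Hung cites map directly onto interconnect density. For an area array of bumps or pads, connections per unit area scale with the inverse square of the pitch, so moving from 40µm bumps to 10µm hybrid bonding buys roughly a 16X density gain. A quick back-of-envelope check, assuming an idealized square grid:

```python
# Connections per mm^2 for an idealized square area array: (1000 um / pitch_um)^2
for pitch_um in [45, 40, 25, 20, 10]:
    density_per_mm2 = (1000 / pitch_um) ** 2
    print(f"{pitch_um:>2} um pitch -> {density_per_mm2:8,.0f} connections/mm^2")
```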


Fig. 3: Today’s interposers support more than 100,000 I/Os at a 45µm pitch. Source: ASE

Hybrid bonding solves another thorny problem, which is co-planarity across thousands of micro-bumps. “People are starting to realize that the densities we’re interconnecting require a level of flatness, which the guys who make traditional things to be bonded are having a hard time meeting with reasonable yield,” said David Fromm, COO at Promex Industries. “That makes it hard to build them, and the thinking is, ‘So maybe we’ve got to do something else.’ You’re starting to see some of that.”

Taming the Hydra
Managing heat remains a challenge, even with all the latest advances and a 3.5D assembly, but the ability to isolate the thermal effects from other components is the best option available today, and possibly well into the future. Still, there are other issues to contend with. Even 2.5D isn’t easy, and a large percentage of the 2.5D implementations have been bespoke designs by large systems companies with very deep pockets.

One of the big remaining challenges is closing timing so that signals arrive at the right place at precisely the right time. This becomes harder as more elements are added into chips, and in a 3.5D or 3D-IC it can be incredibly complex.

“Timing ultimately is the key,” said Sutirtha Kabir, R&D director at Synopsys. “It’s not guaranteed that at whatever your temperature is, you can use the same library for timing. So the question is how much thermal- and IR-aware timing do you have to do? These are big systems. You have to make sure your sign-off is converging. There are two things coming out. There are a bunch of multi-physics effects that are all clumped together. And yes, you could traditionally do one at a time as sign-off, but that isn’t going to work very well. You need to figure out how to solve these problems simultaneously. Ultimately, you’re doing one design. It’s not one for thermal, one for IR, one for timing. The second thing is the data is exploding. How do you efficiently handle the data, because you cannot wait for days and days of runtime and simulation and analysis?”
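To make the thermal-timing coupling concrete, here is a toy slack check in the spirit of what Kabir describes, with each tier’s path delay derated linearly for its local temperature. All numbers and the linear derating model are invented for illustration; real sign-off uses fully characterized libraries per corner, and at low voltages temperature inversion can even reverse the trend.

```python
# Toy thermal-aware timing check for a two-tier stack (illustrative numbers only).
CLOCK_PERIOD_PS = 500.0
TEMP_DERATE = 0.0008  # assumed fractional delay increase per degree C above 25 C

paths = {
    # path name: (nominal delay at 25 C in ps, local tier temperature in C)
    "cpu_to_sram": (420.0, 95.0),
    "sram_internal": (380.0, 70.0),
}

for name, (delay_ps, temp_c) in paths.items():
    derated = delay_ps * (1.0 + TEMP_DERATE * (temp_c - 25.0))
    slack = CLOCK_PERIOD_PS - derated
    status = "meets timing" if slack >= 0 else "violates timing"
    print(f"{name}: {derated:.1f} ps, slack {slack:+.1f} ps -> {status}")
```

The point of solving thermal, IR, and timing together, as Kabir notes, is that the temperatures in this table are themselves outputs of the same design being analyzed.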

Physically assembling these devices isn’t easy, either. “The challenge here is really in the thermal, electrical, and mechanical connection of all these various die with different thicknesses and different coefficients of thermal expansion,” said Intel’s O’Buckley. “So with three die, you’ve got the die and an active base, and those are substantially thinned to enable them to come together. And then EMIB is in the substrate. There’s always intense thermal-mechanical qualification work done to manage not just the assembly, but to ensure in the final assembly — the second-level assembly when this is going through system-level card attach — that this thing stays together.”

And depending upon demands for speed, the interconnects and interconnect materials can change. “Hybrid bonding gives you, by far, the best signal and power density,” said Arm’s DeLaCruz. “And it gives you the best thermal conductivity, because you don’t have that underfill that you would otherwise have to put in between the die, which is a pretty significant barrier. This is likely where the industry will go. It’s just a matter of having the production base.”

Hybrid bonding has been used for years for image sensors using wafer-on-wafer connections. “The tricky part is going into the logic space, where you’re moving from wafer-on-wafer to a die-on-wafer process, which is more complex,” DeLaCruz said. “While it currently would cost more, that’s a temporary problem because there’s not much of an installed base to support it and drive down the cost. There’s really no expensive material or equipment costs.”

Toward mass customization
All of this is leading toward the goal of choosing chiplets from a menu and then rapidly connecting them into some sort of architecture that is proven to work. That may not materialize for years. But commercial chiplets will show up in advanced designs over the next couple of years, most likely in high-bandwidth memory with a customized processor in the stack, with more following that path in the future.

At least part of this will depend on how standardized the processes for designing, manufacturing, and testing become. “We’re seeing a lot of 2.5D from customers able to secure silicon interposers,” said Ruben Fuentes, vice president for the Design Center at Amkor Technology. “These customers want to place their chiplets on an interposer, then the full module is placed on a flip-chip substrate package. We also have customers who say they either don’t want to use a silicon interposer or cannot secure them. They prefer an RDL interconnect with S-SWIFT or with S-Connect, which serves as an interposer in very dense areas.”

But with at least a third of these leading designs only for internal use, and the remainder confined to large processor vendors, the rest of the market hasn’t caught up yet. Once it does, that will drive economies of scale and open the door to more complete assembly design kits, commercial chiplets, and more options for customization.

“Everybody is generally going in the same direction,” said Fuentes. “But not everything is the same height. HBMs are pre-packaged and are taller than ICs. HBMs could have 12 or 16 ICs stacked inside. It makes a difference from a co-planarity and thermal standpoint, and metal balancing on different layers. So now vendors are having a hard time processing all this data because suddenly you have these huge databases that are a lot bigger than the standard packaging databases. We’re seeing bridges, S-Connect, SWIFT, and then S-SWIFT. This is new territory, and we’re seeing a performance gap in the packaging tools. There’s work that needs to be done here, but software vendors have been very proactive in finding solutions. Additionally, these packages need to be routed. There is limited automated routing, so a good amount of interactive routing is still required, so it takes a lot of time.”


Fig. 4: Packaging roadmap showing bridge and hybrid bonding connections for modules and chiplets, respectively. Source: Amkor Technology

What’s missing
The key challenges ahead for 3.5D are proven reliability and customizability — requirements that are seemingly contradictory, and which are beyond the control of any single company. There are four major pieces to making all of this work.

EDA is the first important piece of the puzzle, and the challenge extends beyond just a single chip. “The IC designers have to think about a lot of things concurrently, like thermal, signal integrity, and power integrity,” said Keith Lanier, technical product management director at Synopsys. “But even beyond that, there’s a new paradigm in terms of how people need to work. Traditional packaging folks and IC designers need to work closely together to make these 3.5D designs successful.”

It’s not just about doing more with the same or fewer people. It’s doing more with different people, too. “It’s understanding the architecture definition, the functional requirements, constraints, and having those well-defined,” Lanier said. “But then it’s also feasibility, which includes partitioning and technology selection, and then prototyping and floor-planning. This is lots and lots of data that is required to be generated, and you need analysis-driven exploration, design, and implementation. And AI will be required to help designers and system design teams manage the sheer complexity of these 3.5D designs.”

Process/assembly design kits are a second critical piece, and this is likely to be split between the foundries and the OSATs. “If the customer wants a silicon interposer for a 2.5D package, it would be up to the foundry that’s going to manufacture the interposer to provide the PDK. We would provide the PDK for all of our products, such as S-SWIFT and S-Connect,” said Amkor’s Fuentes.

Setting realistic parameters is the third piece of the puzzle. While the type of processing elements and some of the analog functions may change — particularly those involving power and communication — most of the components will remain the same. That determines what can be pre-built and pre-tested, and the speed and ease of assembly.

“A lot of the standards that are being deployed, like UCIe interfaces and HBM interfaces are heading to where 20% is customization and 80% is on the shelf,” said Intel’s O’Buckley. “But we aren’t there today. At the scale that our customers are deploying these products, the economics of spending that extra time to optimize an implementation is a decimal point. It’s not leveraging 80/20 standards. We’ll get there. But most of these designs you can count on your fingers and toes because of the cost and scale required to do them. And until the infrastructure for standards-based chiplets gets mature, the barrier of entry for companies that want to do this without that scale is just too high. Still, it is going to happen.”

Ensuring processes are consistent is the fourth piece of the puzzle. The tools and the individual processes don’t need to change. “The customer has a ‘target’ for the outcome they want for a particular tool, which typically is a critical dimension measured by a metrology tool,” said David Park, vice president of marketing at Tignis. “As long as there is some ‘measurement’ that determines the goodness of some outcome, which typically is the result of a process step, we can either predict the bad outcome — and engineers have to take some corrective or preventive action — or we can optimize the recipe of that tool in real time to keep the result in the range they want.”

Park noted there is a recipe that controls the inputs. “The tool does whatever it is supposed to do,” he said. “Then you measure the output to see how far you deviated from the acceptable output.”
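In practice, the real-time recipe optimization Park describes is often implemented as run-to-run control. The sketch below is a minimal illustration of an exponentially weighted moving average (EWMA) run-to-run controller that nudges a recipe input to keep a measured critical dimension on target; it is not Tignis’ algorithm, and the gains and measurements are invented.

```python
# Minimal EWMA run-to-run controller (illustrative only).
# Assumes a roughly linear process: measured_cd = offset + GAIN * recipe_input.

TARGET_CD = 50.0   # target critical dimension (nm)
GAIN = 2.0         # assumed nm of CD change per unit of recipe input
LAMBDA = 0.3       # EWMA smoothing factor

offset_est = 0.0
recipe_input = TARGET_CD / GAIN

def next_recipe_input(measured_cd: float) -> float:
    """Fold the latest metrology result into the offset estimate, then retarget."""
    global offset_est, recipe_input
    offset_est = LAMBDA * (measured_cd - GAIN * recipe_input) + (1 - LAMBDA) * offset_est
    recipe_input = (TARGET_CD - offset_est) / GAIN
    return recipe_input

# A drifting tool: each run, metrology reports the CD actually produced.
for measured in [51.2, 51.9, 52.4, 51.0, 50.3]:
    print(f"measured {measured:.1f} nm -> next recipe input {next_recipe_input(measured):.3f}")
```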

The challenge is that inside of a 3.5D system, what is considered acceptable output is still being defined. There are many processes with different tolerances. Defining what is consistent enough will require a broad understanding of how all the pieces work together under specific workloads, and where the potential weaknesses are that need to be adjusted.

“One of the problems here is that as these densities get higher and the copper pillars get smaller, the amount of space you need between the pillar and the substrate has to be highly controlled,” said Dick Otte, president and CEO of Promex. “There’s a conflict — not so much with how you fabricate the chip, because it usually has the copper pillars on it — but with the substrate. A lot of the substrate technologies are not inherently flat. It’s the same issue with glass. You’ve got a really nice flat piece of glass. The first thing you’re going to do is put down a layer of metal and you’re going to pattern it. And then you put down a layer of dielectric, and suddenly you’ve got a lump where the conductor goes. And now, where do you put the contact points? So you always have the one plan which is going to be the contact point where all the pillars come in. But what if I only need one layer and I don’t need three?”

Conclusion
For the past decade, the chip industry has been trying to figure out a way to balance faster processing, domain-specific designs, limited reticle size, and the enormous cost of scaling an SoC. After investigating nearly every possible packaging approach, interconnect, power delivery method, substrate and dielectric material, 3.5D has emerged as the front runner — at least for now.

This approach provides the chip industry with a common thread on which to begin developing assembly design kits, commercial chiplets, and to fill in the missing tools and services throughout the supply chain. Whether this ultimately becomes a springboard for full 3D-ICs, or a platform on which to use 3D stacking more effectively, remains to be seen. But for the foreseeable future, large chipmakers have converged on a path forward to provide orders of magnitude performance improvements and a way to contain costs. The rest of the industry will be working to smooth out that path for years to come.

Related Reading
Intel Vs. Samsung Vs. TSMC
Foundry competition heats up in three dimensions and with novel technologies as planar scaling benefits diminish.
3D Metrology Meets Its Match In 3D Chips And Packages
Next-generation tools take on precision challenges in three dimensions.
Design Flow Challenged By 3D-IC Process, Thermal Variation
Rethinking traditional workflows by shifting left can help solve persistent problems caused by process and thermal variations.
Floor-Planning Evolves Into The Chiplet Era
Automatically mitigating thermal issues becomes a top priority in heterogeneous designs.


Metrology And Inspection For The Chiplet Era

New developments and innovations in metrology and inspection will enable chipmakers to identify and address defects faster and with greater accuracy than ever before, all of which will be required at future process nodes and in densely-packed assemblies of chiplets.

These advances will affect both front-end and back-end processes, providing increased precision and efficiency, combined with artificial intelligence/machine learning and big data analytics. These kinds of improvements will be crucial for meeting the industry’s changing needs, enabling deeper insights and more accurate measurements at rates suitable for high-volume manufacturing. But gaps still need to be filled, and new ones are likely to show up as new nodes and processes are rolled out.

“As semiconductor devices become more complex, the demand for high-resolution, high-accuracy metrology tools increases,” says Brad Perkins, product line manager at Nordson Test & Inspection. “We need new tools and techniques that can keep up with shrinking geometries and more intricate designs.”

The shift to high-NA EUV lithography (0.55 NA EUV) at the 2nm node and beyond is expected to exacerbate stochastic variability, demanding more robust metrology solutions on the front end. Traditional critical dimension (CD) measurements alone are insufficient for the level of analysis required. Comprehensive metrics, including line-edge roughness (LER), line-width roughness (LWR), local edge-placement error (LEPE), and local CD uniformity (LCDU), alongside CD measurements, are necessary for ensuring the integrity and performance of advanced semiconductor devices. These metrics require sophisticated tools that can capture and analyze tiny variations at the nanometer scale, where even slight discrepancies can significantly impact device functionality and yield.
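These roughness metrics have standard statistical definitions. As a small worked example, the sketch below computes LER as three standard deviations of an edge’s residuals about its best-fit line, and LWR from the variation of the local linewidth; the edge traces are synthetic stand-ins for positions extracted from CD-SEM images.

```python
import numpy as np

def ler_3sigma(y_nm, edge_x_nm):
    """Line-edge roughness: 3x the standard deviation of edge residuals
    about a best-fit straight line, in nm."""
    slope, intercept = np.polyfit(y_nm, edge_x_nm, 1)
    residuals = edge_x_nm - (slope * y_nm + intercept)
    return 3.0 * residuals.std(ddof=1)

rng = np.random.default_rng(0)
y = np.linspace(0.0, 500.0, 256)              # sample positions along the line (nm)
left = 0.0 + rng.normal(0.0, 1.2, y.size)     # synthetic left-edge trace
right = 20.0 + rng.normal(0.0, 1.2, y.size)   # synthetic right edge, nominal 20nm CD

print(f"LER (left edge): {ler_3sigma(y, left):.2f} nm")
print(f"LWR: {3.0 * (right - left).std(ddof=1):.2f} nm")  # line-width roughness
print(f"mean CD: {(right - left).mean():.2f} nm")         # feeds LCDU across features
```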

“Metrology is now at the forefront of yield, especially considering the current demands for DRAM and HBM,” says Hamed Sadeghian, president and CEO of Nearfield Instruments. “The next generations of HBMs are approaching a stage where hybrid bonding will be essential due to the increasing stack thickness. Hybrid bonding requires high resolutions in vertical directions to ensure all pads, and the surface height versus the dielectric, remain within nanometer-scale process windows. Consequently, the tools used must be one order of magnitude more precise.”

To address these challenges, companies are developing hybrid metrology systems that combine various measurement techniques for a comprehensive data set. Integrating scatterometry, electron microscopy, and/or atomic force microscopy allows for more thorough analysis of critical features. Moreover, AI and ML algorithms enhance the predictive capabilities of these tools, enabling process adjustments.

“Our customers who are pushing into more advanced technology nodes are desperate to understand what’s driving their yield,” says Ronald Chaffee, senior director of applications engineering at NI/Emerson Test & Measurement. “They may not know what all the issues are, but they are gathering all possible data — metrology, AEOI, and any measurable parameters — and seeking correlations.”

Traditional methods for defect detection, pattern recognition, and quality control typically used spatial pattern-recognition modules and wafer image-based algorithms to address wafer-level issues. “However, we need to advance beyond these techniques,” says Prasad Bachiraju, senior director of business development at Onto Innovation. “Our observations show that about 20% of wafers have systematic issues that can limit yield, with nearly 4% being new additions. There is a pressing need for advanced metrology for in-line monitoring to achieve zero-defect manufacturing.”

Several companies recently announced metrology innovations to provide more precise inspections, particularly for difficult-to-see areas, edge effects, and highly reflective surfaces.

Nordson unveiled its AMI SpinSAM acoustic rotary scan system. The system represents a significant departure from traditional raster scan methods, utilizing a rotational scanning approach. Rather than moving the wafer in an x,y pattern relative to a stationary lens, the wafer spins, similar to a record player. This reduces motion over the wafer and increases inspection speed, negating the need for image stitching and improving image quality.

“For years, we’d been trying to figure out this technique, and it’s gratifying to finally achieve it. It’s something we’ve always thought would be incredibly beneficial,” says Perkins. “The SpinSAM is designed primarily to enhance inspection speed and efficiency, addressing the common industry demand for more product throughput and better edge inspection capabilities.”

Meanwhile, Nearfield Instruments introduced a multi-head atomic force microscopy (AFM) system called QUADRA. It is a high-throughput, non-destructive metrology tool for HVM that features a novel multi-miniaturized AFM head architecture. Nearfield claims the parallel independent multi-head scanner can deliver a 100-fold throughput advantage versus conventional single-probe AFM tools. This architecture allows for precise measurements of high-aspect-ratio structures and complex 3D features, critical for advanced memory (3D NAND, DRAM, HBM) and logic processes.


Fig. 1: Image capture comparison of standard AFM and multi-head AFM. Source: Nearfield Instruments

In April, Onto Innovation debuted an advancement in subsurface defect inspection technology with the release of its Dragonfly G3 inspection system. The new system allows for 100% wafer inspection, targeting subsurface defects that can cause yield losses, such as micro-cracks and other hidden flaws that may lead to entire wafers breaking during subsequent processing steps. The Dragonfly G3 utilizes novel infrared (IR) technology combined with specially designed algorithms to detect these defects, which previously were undetectable in a production environment. This new capability supports HBM, advanced logic, and various specialty segments, and aims to improve final yield and cost savings by reducing scrapped wafers and die stacks.

More recently, researchers at the Paul Scherrer Institute announced a high-performance X-ray tomography technique using burst ptychography. This new method can provide non-destructive, detailed views of nanostructures as small as 4nm in materials like silicon and metals, at a fast acquisition rate of 14,000 resolution elements per second. The tomographic back-propagation reconstruction allows imaging of samples up to ten times larger than the conventional depth of field.

There are other technologies and techniques for improving metrology in semiconductor manufacturing, as well, including wafer-level ultrasonic inspection, which involves flipping the wafer to inspect from the other side. New acoustic microscopy techniques, such as scanning acoustic microscopy (SAM) and time-of-flight acoustic microscopy (TOF-AM), enable the detection and characterization of very small defects, such as voids, delaminations, and cracks within thin films and interfaces.

“We used to look at 80 to 100 micron resist films, but with 3D integrated packaging, we’re now dealing with films that are 160 to 240 microns—very thick resist films,” says Christopher Claypool, senior application scientist at Bruker OCD. “In TSVs and microbumps, the dominant technique today is white light interferometry, which provides profile information. While it has some advantages, its throughput is slow, and it’s a focus-based technique. This limitation makes it difficult to measure TSV structures smaller than four or five microns in diameter.”

Acoustic metrology tools equipped with the newest generation of focal length transducers (FLTs) can focus acoustic waves with precision down to a few nanometers, allowing for non-destructive detailed inspection of edge defects and critical stress points. This capability is particularly useful for identifying small-scale defects that might be missed by other inspection methods.

The development and integration of smart sensors in metrology equipment is instrumental in collecting the vast amounts of data needed for precise measurement and quality control. These sensors are highly sensitive and capable of operating under various environmental conditions, ensuring consistent performance. One significant advantage of smart sensors is their ability to facilitate predictive maintenance. By continuously monitoring the health and performance of metrology equipment, these sensors can predict potential failures and schedule maintenance before significant downtime occurs. This capability enhances the reliability of the equipment, reduces maintenance costs, and improves overall operational efficiency.

Smart sensors also are being developed to integrate seamlessly with metrology systems, offering real-time data collection and analysis. These sensors can monitor various parameters throughout the manufacturing process, providing continuous feedback and enabling quick adjustments to prevent defects. Smart sensors, combined with big data platforms and advanced data analytics, allow for more efficient and accurate defect detection and classification.
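A simple version of the predictive-maintenance logic described above is a drift detector running on a monitored sensor signal. Below is a minimal one-sided CUSUM sketch; the baseline, slack, threshold, and readings are all invented for illustration.

```python
# One-sided CUSUM drift detector for an equipment health sensor (illustrative).
BASELINE = 40.0   # expected reading, e.g., pump vibration or stage temperature
SLACK = 0.5       # per-sample deviation tolerated before it counts as drift
THRESHOLD = 4.0   # accumulated drift that triggers a maintenance alert

cusum = 0.0
readings = [40.1, 39.8, 40.3, 41.2, 41.5, 41.9, 42.3, 42.8]  # synthetic samples

for i, x in enumerate(readings):
    cusum = max(0.0, cusum + (x - BASELINE - SLACK))  # accumulate upward drift only
    if cusum > THRESHOLD:
        print(f"sample {i}: drift detected (CUSUM={cusum:.2f}), schedule maintenance")
        break
else:
    print("no drift detected")
```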

Critical stress points

A persistent challenge in semiconductor metrology is the identification and inspection of defects at critical stress points, particularly at the silicon edges. For bonded wafers, it’s at the outer ring of the wafer. For chip-on-wafer packaging, it’s at the edge of the chips. These edge defects are particularly problematic because they occur at the highest stress points from the neutral axis, making them more prone to failures. As semiconductor devices continue to involve more intricate packaging techniques, such as chip-on-wafer and wafer-level packaging, the focus on edge inspection becomes even more critical.

“When defects happen in a factory, you need imaging that can detect and classify them,” says Onto’s Bachiraju. “Then you need to find the root causes of where they’re coming from, and for that you need the entire data integration and a big data platform to help with faster analysis.”

Another significant challenge in semiconductor metrology is ensuring the reliability of known good die (KGD), especially as advanced packaging techniques and chiplets become more prevalent. Ensuring that every chip/chiplet in a stacked die configuration is of high quality is essential for maintaining yield and performance, but the speed of metrology processes is a constant concern. This leads to a balancing act between thoroughness and efficiency. The industry continuously seeks to develop faster machines that can handle the increasing volume and complexity of inspections without compromising accuracy. In this race, innovations in data processing and analysis are key to achieving quicker results.

“Customers would like, generally, 100% inspection for a lot of those processes because of the known good die, but it’s cost-prohibitive because the machines just can’t run fast enough,” says Nordson’s Perkins.

Metrology and Industry 4.0

Industry 4.0 — a term introduced in Germany in 2011 for the fourth industrial revolution, and called smart manufacturing in the U.S. — emphasizes the integration of digital technologies such as the Internet of Things, artificial intelligence, and big data analytics into manufacturing processes. Unlike past revolutions driven by mechanization, electrification, and computerization, Industry 4.0 focuses on connectivity, data, and automation to enhance manufacturing capabilities and efficiency.

“The better the data integration is, the more efficient the yield ramp,” says Dieter Rathei, CEO of DR Yield. “It’s essential to integrate all available data into the system for effective monitoring and analysis.”

In semiconductor manufacturing, this shift toward Industry 4.0 is particularly transformative, driven by the increasing complexity of semiconductor devices and the demand for higher precision and yield. Traditional metrology methods, heavily reliant on manual processes and limited automation, are evolving into highly interconnected systems that enable real-time data sharing and decision-making across the entire production chain.

“There haven’t been many tools to consolidate different data types into a single platform,” says NI’s Chaffee. “Historically, yield management systems focused on testing, while FDC or process systems concentrated on the process itself, without correlating the two. As manufacturers push into the 5, 3, and 2nm spaces, they’re discovering that defect density alone isn’t the sole governing factor. Process control is also crucial. By integrating all data, even the most complex correlations that a human might miss can be identified by AI and ML. The goal is to use machine learning to detect patterns or connections that could help control and optimize the manufacturing process.”
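A minimal version of the cross-domain correlation hunt Chaffee describes joins per-wafer process (FDC) summaries with test yield and ranks which process parameters track yield. The sketch below assumes both tables share a wafer ID; the column names and values are invented.

```python
import pandas as pd

# Per-wafer FDC summaries (process side) and test yield (test side), joined on wafer_id.
fdc = pd.DataFrame({
    "wafer_id": ["W1", "W2", "W3", "W4"],
    "etch_chamber_temp": [61.2, 60.8, 63.5, 61.0],
    "dep_pressure": [2.01, 2.00, 2.02, 1.98],
})
test = pd.DataFrame({
    "wafer_id": ["W1", "W2", "W3", "W4"],
    "yield_pct": [94.1, 95.0, 88.2, 94.6],
})

merged = fdc.merge(test, on="wafer_id")  # the single integrated view
corr = merged.drop(columns="wafer_id").corr()["yield_pct"].drop("yield_pct")
print(corr.sort_values())  # most negatively yield-correlated parameters first
```

In production this runs over thousands of wafers and hundreds of parameters, which is where the ML models mentioned above take over from simple correlation.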

IoT forms the backbone of Industry 4.0 by connecting various devices, sensors, and systems within the manufacturing environment. In semiconductor manufacturing, IoT enables seamless communication between metrology tools, production equipment, and factory management systems. This interconnected network facilitates real-time monitoring and control of manufacturing processes, allowing for immediate adjustments and optimization.

“You need to integrate information from various sources, including sensors, metrology tools, and test structures, to build predictive models that enhance process control and yield improvement,” says Michael Yu, vice president of advanced solutions at PDF Solutions. “This holistic approach allows you to identify patterns and correlations that were previously undetectable.”

AI and ML are pivotal in processing and analyzing the vast amounts of data generated in a smart factory. These technologies can identify patterns, predict equipment failures, and optimize process parameters with a level of precision and speed unattainable by human operators alone. In semiconductor manufacturing, AI-driven analytics enhance process control, improve yield rates, and reduce downtime. “One of the major trends we see is the integration of artificial intelligence and machine learning into metrology tools,” says Perkins. “This helps in making sense of the vast amounts of data generated and enables more accurate and efficient measurements.”

AI’s role extends further as it assists in discovering anomalies within the production process that might have gone unnoticed with traditional methods. AI algorithms integrated into metrology systems can dynamically adjust processes in real-time, ensuring that deviations are corrected before they affect the end yield. This incorporation of AI minimizes defect rates and enhances overall production quality.

“Our experience has shown that in the past 20 years, machine learning and AI algorithms have been critical for automatic data classification and die classification,” says Bachiraju. “This has significantly improved the efficiency and accuracy of our metrology tools.”

Big data analytics complements AI/ML by providing the infrastructure necessary to handle and interpret massive datasets. In semiconductor manufacturing, big data analytics enables the extraction of actionable insights from data generated by IoT devices and production systems. This capability is crucial for predictive maintenance, quality control, and continuous process improvement.

“With big data, we can identify patterns and correlations that were previously impossible to detect, leading to better process control and yield improvement,” says Perkins.

Big data analytics also helps in understanding the lifecycle of semiconductor devices from production to field deployment. By analyzing product performance data over time, manufacturers can predict potential failures and enhance product designs, increasing reliability and lifecycle management.

“In the next decade, we see a lot of opportunities for AI,” says DR Yield’s Rathei. “The foundation for these advancements is the availability of comprehensive data. AI models need extensive data for training. Once all the data is available, we can experiment with different models and ideas. The ingenuity of engineers, combined with new tools, will drive exponential progress in this field.”

Metrology gaps remain

Despite recent advancements in metrology, analytics, and AI/ML, several gaps remain, particularly in the context of high-volume manufacturing (HVM) and next-generation devices. The U.S. Commerce Department’s CHIPS R&D Metrology Program, along with industry stakeholders, has highlighted seven “grand challenges,” areas where current metrology capabilities fall short:

Metrology for materials purity and properties: There is a critical need for new measurements and standards to ensure the purity and physical properties of materials used in semiconductor manufacturing. Current techniques lack the sensitivity and throughput required to detect particles and contaminants throughout the supply chain.

Advanced metrology for future manufacturing: Next-generation semiconductor devices, such as gate-all-around (GAA) FETs and complementary FETs (CFETs), require breakthroughs in both physical and computational metrology. Existing tools are not yet capable of providing the resolution, sensitivity, and accuracy needed to characterize the intricate features and complex structures of these devices. This includes non-destructive techniques for characterizing defects and impurities at the nanoscale.

“There is a secondary challenge with some of the equipment in metrology, which often involves sampling data from single points on a wafer, much like heat test data that only covers specific sites,” says Chaffee. “To be meaningful, we need to move beyond sampling methods and find creative ways to gather information from every wafer, integrating it into a model. This involves building a knowledge base that can help in detecting patterns and correlations, which humans alone might miss. The key is to leverage AI and machine learning to identify these correlations and make sense of them, especially as we push into the 5, 3, and 2nm spaces. This process is iterative and requires a holistic approach, encompassing various data points and correlating them to understand the physical boundaries and the impact on the final product.”

Metrology for advanced packaging: The integration of sophisticated components and novel materials in advanced packaging technologies presents significant metrology challenges. There is a need for rapid, in-situ measurements to verify interfaces, subsurface interconnects, and internal 3D structures. Current methods do not adequately address issues such as warpage, voids, substrate yield, and adhesion, which are critical for the reliability and performance of advanced packages.

Modeling and simulating semiconductor materials, designs, and components: Simulating semiconductor materials, device designs, and components requires advanced computational models and data analysis tools, and current capabilities fall short of what next-generation devices demand.

“Predictive analytics is particularly important,” says Chaffee. “They aim to determine the probability of any given die on a wafer being the best yielding or presenting issues. By integrating various data points and running different scenarios, they can identify and understand how specific equipment combinations, sequences and processes enhance yields.”

Modeling and simulating semiconductor processes: Current capabilities are limited in their ability to seamlessly integrate the entire semiconductor value chain, from materials inputs to system assembly. There is a need for standards and validation tools to support digital twins and other advanced simulation techniques that can optimize process development and control.

“Part of the problem comes from the back-end packaging and assembly process, but another part of the problem can originate from the quality of the wafer itself, which is determined during the front-end process,” says PDF’s Yu. “An effective ML model needs to incorporate both front-end and back-end information, including data from equipment sensors, metrology, and structured test information, to make accurate predictions and take proactive actions to correct the process.”

Standardizing new materials and processes: The development of future information and communication technologies hinges on the creation of new standards and validation methods. Current reference materials and calibration services do not meet the requirements for next-generation materials and processes, such as those used in advanced packaging and heterogeneous integration. This gap hampers the industry’s ability to innovate and maintain competitive production capabilities.

Metrology to enhance security and provenance of components and products: With the increasing complexity of the semiconductor supply chain, there is a need for metrology solutions that can ensure the security and provenance of components and products. This involves developing methods to trace materials and processes throughout the manufacturing lifecycle to prevent counterfeiting and ensure compliance with regulatory standards.

“The focus on security and sharing changes the supplier relationship into more of a partnership and less of a confrontation,” says Chaffee. “Historically, there’s always been a concern of data flowing across that boundary. People are very protective about their process, and other people are very protective about their product. But once you start pushing into the deep sub-micron space, those barriers have to come down. The die are too expensive for them not to communicate, but they can still do so while protecting their IP. Companies are starting to realize that by sharing parametric test information securely, they can achieve better yield management and process optimization without compromising their intellectual property.”

Conclusion

Advancements in metrology and testing are pivotal for the semiconductor industry’s continued growth and innovation. The integration of AI/ML, IoT, and big data analytics is transforming how manufacturers approach process control and yield improvement. As adoption of Industry 4.0 grows, the role of metrology will become even more critical in ensuring the efficiency, quality, and reliability of semiconductor devices. And by leveraging these advanced technologies, semiconductor manufacturers can achieve higher yields, reduce costs, and maintain the precision required in this competitive industry.

With continuous improvements and the integration of smart technologies, the semiconductor industry will keep pushing the boundaries of innovation, leading to more robust and capable electronic devices that define the future of technology. The journey toward a fully realized Industry 4.0 is ongoing, and its impact on semiconductor manufacturing undoubtedly will shape the future of the industry, ensuring it stays at the forefront of global technological advancements.

“Anytime you have new packaging technologies and process technologies that are evolving, you have a need for metrology,” says Perkins. “When you are ramping up new processes and need to make continuous improvements for yield, that is when you see the biggest need for new metrology solutions.”


Leveraging AI To Efficiently Test AI Chips

By Advantest

In the fast-paced world of technology, where innovation and efficiency are paramount, integrating artificial intelligence (AI) and machine learning (ML) into the semiconductor testing ecosystem has become critically important due to ongoing challenges with accuracy and reliability. AI and ML algorithms can identify patterns and anomalies that might not be discovered by human testers or traditional methods. By leveraging these technologies, companies can achieve higher accuracy in defect detection, ensuring that only the highest-quality semiconductors reach the market.

The industry is also clamoring for increased efficiency and speed. AI-driven testing can significantly accelerate the testing process, analyzing vast amounts of data at speeds unattainable by human testers. This enables quicker turnaround times from design to production, helping companies meet market demands more effectively and stay ahead of competitors.

Firms are also heavily invested in reducing costs. While the initial investment in AI/ML technology can be expensive, the long-term savings are irrefutable. By automating routine and complex testing processes, companies can reduce labor costs and minimize human error. Equally important, AI-enhanced testing can better predict potential failures before they occur, saving costs related to recalls and repairs.
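As a concrete illustration of that pattern-and-anomaly detection, the sketch below flags outlier devices from parametric test measurements using an isolation forest. It is a generic example on synthetic data, not Advantest’s production flow.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)
# Rows are devices; columns are parametric measurements
# (e.g., leakage current, Vth, ring-oscillator frequency).
typical = rng.normal([1.0, 0.45, 2.1], [0.05, 0.01, 0.08], size=(500, 3))
marginal = rng.normal([1.3, 0.48, 1.8], [0.05, 0.01, 0.08], size=(5, 3))
X = np.vstack([typical, marginal])

model = IsolationForest(contamination=0.01, random_state=0).fit(X)
flags = model.predict(X)  # -1 = anomaly, 1 = inlier
print(f"{(flags == -1).sum()} of {len(X)} devices flagged for review")
```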

The industry is now moving to chiplet-based modules, using a “Lego-like” approach to integrate CPU, GPU, cache, I/O, high-bandwidth memory (HBM), and other functions. In the rapidly evolving world of chiplets, the DUT is a complex multichip system with many devices integrated in a single 2.5D or 3D package. Consequently, the tester can only access a subset of individual device pins. Even so, at each test insertion, the tester must be able to extract valuable data that is then used to optimize the current test insertion as well as other design, manufacturing, and test steps. With limited pin access, the tester must infer what is happening on unobservable nodes. To best achieve this goal, it is important to extract the most value possible out of the data that can be directly collected across all manufacturing and test steps, including data from on-chip sensors.

The test flow in the chiplet world already includes PSV, wafer acceptance test (WAT), wafer sort (WS), final test (FT), burn-in, and SLT, and additional test insertions to account for the increased complexity of a package with multiple chiplets are not feasible from a cost perspective. Adding to the challenge, binning goes from performance-based to application-based. In this world, the tester must stay ahead of the system — the tester must be smarter than the complex system-under-test.

The ACS RTDI platform accelerates data analytics and AI/ML decision-making.

So, for these reasons and many more, the adoption of edge compute for ML test applications is well underway. Advantest’s ACS Real-Time Data Infrastructure (ACS RTDI) platform accelerates data analytics and AI/ML decision-making within a single integrated platform. It collects, analyzes, stores, and monitors semiconductor test data as well as data sources across the IC manufacturing supply chain while employing low-latency edge computing and analytics in a secure zero-trust environment. ACS RTDI minimizes the need for human intervention, streamlining overall data utilization across multiple insertions to boost quality, yield, and operational efficiencies. It includes Advantest’s ACS Edge HPC server, which works in conjunction with its V93000 and other ATE systems to handle computationally intensive workloads adjacent to the tester’s host controller.

A reliable, secure real-time data structure that integrates data sources across the IC manufacturing supply chain.

In this configuration, the ACS Edge provides low, consistent, and predictable latency compared with a data center-hosted alternative. It supports a user execution environment independent of the tester host controller to ease development and deployment. It also provides a reliable and secure real-time data infrastructure that integrates all data sources across the entire IC manufacturing supply chain, applying analytics models that enable real-time decision-making during production test.
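Advantest’s APIs are not shown here, but the edge-inference pattern the ACS Edge enables looks roughly like the following hypothetical loop: a pre-trained model scores each device during the insertion, and the binning decision is made locally with predictable latency rather than waiting on a data-center round trip. Every name and number below is invented for illustration.

```python
import time

def score_device(features):
    """Hypothetical stand-in for a pre-trained ML model hosted on an edge server."""
    return sum(features) / len(features)  # placeholder scoring logic

def test_insertion(device_features, pass_threshold=0.8):
    t0 = time.perf_counter()
    score = score_device(device_features)         # local inference, no cloud hop
    bin_id = 1 if score >= pass_threshold else 7  # application-based binning decision
    latency_ms = (time.perf_counter() - t0) * 1e3
    return bin_id, latency_ms

bin_id, ms = test_insertion([0.92, 0.88, 0.75])
print(f"assigned bin {bin_id}, decision latency {ms:.3f} ms")
```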


Keeping Up With New ADAS And IVI SoC Trends

By Hezi Saar

In the automotive industry, AI-enabled automotive devices and systems are dramatically transforming the way SoCs are designed, making high-quality and reliable die-to-die and chip-to-chip connectivity non-negotiable. This article explains how interface IP for die-to-die connectivity, display, and storage can support new developments in automotive SoCs for the most advanced innovations such as centralized zonal architecture and integrated ADAS and IVI applications.

AI-integrated ADAS SoCs

The automotive industry is adopting a new electronic/electrical (E/E) architecture in which a centralized compute module executes multiple applications such as ADAS and in-vehicle infotainment (IVI). With the advent of EVs and more advanced features in the car, the new centralized zonal architecture will help minimize complexity, maximize scalability, and facilitate faster decision-making. This new architecture demands a new set of SoCs on advanced process technologies with very high performance.

More traditional monolithic SoCs for single functions like ADAS are giving way to multi-die designs, where various dies are connected in a single package and placed in a system to perform a function in the car. While such multi-die designs are gaining adoption, semiconductor companies must remain cost-conscious, as these ADAS SoCs will be manufactured at high volumes for a myriad of safety levels.

One example is the automated driving central compute system. The system can include modules for the sensor interface, safety management, and memory control and interfaces, plus dies for the CPU, GPU, and AI accelerator, which are connected via a die-to-die interface such as Universal Chiplet Interconnect Express (UCIe). Figure 1 illustrates how semiconductor companies can develop SoCs for such systems using multi-die designs. For a base ADAS or IVI SoC, the requirement might be just the CPU die for level 2 functional safety. A GPU die can be added to the base CPU die for a base ADAS or premium IVI function at level 2+ driving automation. To allow more compute power for AI workloads, an NPU die can be added to the base CPU, or to the base CPU and GPU dies, for level 3/3+ functional safety. None of these scalable scenarios is possible without a solution for die-to-die connectivity.

Fig. 1: A simplified view of automotive systems using multi-die designs.

The adoption of UCIe for automotive SoCs

The industry has come together to define, develop, and deploy the UCIe standard, a universal interconnect at the package level. In a recent news release, the UCIe Consortium announced “updates to the standard with additional enhancements for automotive usages – such as predictive failure analysis and health monitoring – and enabling lower-cost packaging implementations.” Figure 2 shows three use cases for UCIe. The first is for low latency and coherency, where two networks-on-chip (NoCs) are connected via UCIe. This use case is mainly for applications requiring ADAS computing power. The second automotive use case is when memory and I/O are split into two separate dies, which are then connected to the compute die via CXL and UCIe streaming protocols. The third automotive use case is very similar to what is seen in HPC applications, where a companion AI accelerator die is connected to the main CPU die via UCIe.

Fig. 2: Examples of common and new use cases for UCIe in automotive applications.

To enable such automotive use cases, UCIe offers several advantages, all of which are supported by the Synopsys UCIe IP:

  • Latency-optimized architecture: The Flit-Aware Die-to-Die Interface (FDI) or Raw Die-to-Die Interface (RDI) operates with a local 2GHz system clock. Transmitter and receiver FIFOs accommodate phase mismatch between clock domains. There is no clock domain crossing (CDC) between the PHY and Adapter layers, for minimum latency. The reference clock has the same frequency for the two dies.
  • Power-optimized architecture: The transmitter provides the CMOS driver without source termination. It offers programmable drive strength without a Feed-Forward Equalizer (FFE). The receiver provides a continuous-time linear equalizer (CTLE) without VGA and decision feedback equalizer (DFE), clock forwarding without Clock and Data Recovery (CDR), and optional receiver termination.
  • Reliability and test: Signal integrity monitors track the performance of the interconnect through the chip’s lifecycle. This can monitor inaccessible paths in the multi-die package, test and repair the PHY, and execute real time reporting for preventative maintenance.

Synopsys UCIe IP is integrated with Synopsys 3DIC Compiler, a unified exploration-to-signoff platform. The combination eases package design and provides a complete set of IP deliverables, automated UCIe routing for better quality of results, and reference interposer design for faster integration.

Fig. 3: Synopsys 3DIC Compiler.

New automotive SoC design trends for IVI applications

OEMs are attracting consumers by providing the utmost in cockpit experience, with high-resolution, 4K, pillar-to-pillar displays. DisplayPort Multi-Stream Transport (MST) enables a daisy-chained display topology in which a single port (a single GPU, one DP TX controller, and a PHY) drives images on multiple screens in the car. This daisy-chained setup simplifies the display wiring in the car. Figure 4 illustrates how connectivity in the SoC can enable multi-display environments in the car:

  • Row 1: Multiple image sources from the application processor are fed into the daisy-chained display setup via the DisplayPort (DP) MST interface.
  • Row 2: Multiple image sources from the application processor are fed to the daisy-chained display setup, as well as to the left or right mirrors, all via the DP MST interface.
  • Row 3: The same setup as in row 2 can be implemented via the MIPI DSI or embedded DisplayPort (eDP) MST interfaces, depending on display size and power requirements.

An alternative use case is USB/DP. A single USB port can be used for silicon lifecycle management, sentry mode, test, debug, and firmware download. USB can be used to avoid the need for very large numbers of test pins, speed up test by exceeding GPIO test pin data rates, repeat manufacturing tests in-system and in-field, access PVT monitors, and debug.

Fig. 4: Examples of display connectivity in software-defined vehicles.

ISO/SAE 21434 automotive cybersecurity

ISO/SAE 21434 Automotive Cybersecurity is being adopted by industry leaders as mandated by the UNECE R155 regulation. Starting in July 2024, automotive OEMs must comply with the UNECE R155 automotive cybersecurity regulation for all new vehicles in Europe, Japan, and Korea.

Automotive suppliers must develop processes that meet the automotive cybersecurity requirements of ISO/SAE 21434, addressing the cybersecurity perspective in the engineering of electrical and electronic (E/E) systems. Adopting this methodology involves embracing a cybersecurity culture which includes developing security plans, setting security goals, conducting engineering reviews and implementing mitigation strategies.

The industry is expected to move towards enabling cybersecurity risk-managed products to mitigate the risks associated with advancement in connectivity for software-defined vehicles. As a result, automotive IP needs to be ready to support these advancements.

Synopsys ARC HS4xFS Processor IP has achieved ISO/SAE 21434 cybersecurity certification by SGS-TÜV Saar, meeting stringent automotive regulatory requirements designed to protect connected vehicles from malicious cyberattacks. In addition, Synopsys has achieved certification of its IP development process to the ISO/SAE 21434 standard to help ensure its IP products are developed with a security-first mindset through every phase of the product development lifecycle.

Conclusion

The transformation to software-defined vehicles marks a significant shift in the automotive industry, bringing together highly integrated systems and AI to create safer and more efficient vehicles while addressing sophisticated user needs and vendor serviceability. New trends in the automotive industry are presenting opportunities for innovations in ADAS and IVI SoC designs. Centralized zonal architecture, multi-die design, daisy-chained displays, and integration of ADAS/IVI functions in a single SoC are among the key trends the automotive industry is tracking. Synopsys is at the forefront of automotive SoC innovation with a portfolio of silicon-proven automotive IP for the highest levels of functional safety, security, quality, and reliability. The IP portfolio is developed and assessed specifically for ISO 26262 random hardware faults and ASIL D systematic compliance. To minimize cybersecurity risks, Synopsys is developing IP products per the ISO/SAE 21434 standard to provide automotive SoC developers a safe, reliable, and future-proof solution.

The post Keeping Up With New ADAS And IVI SoC Trends appeared first on Semiconductor Engineering.

Adoption of Chiplet Technology in the Automotive Industry

A technical paper titled “Chiplets on Wheels: Review Paper on Holistic Chiplet Solutions for Autonomous Vehicles” was published by researchers at the Indian Institute of Technology, Madras.

Abstract
“On the advent of the slow death of Moore’s law, the silicon industry is moving towards a new era of chiplets. The automotive industry is experiencing a profound transformation towards software-defined vehicles, fueled by the surging demand for automotive compute chips, expected to reach 20-22 billion by 2030. High-performance compute (HPC) chips become instrumental in meeting the soaring demand for computational power. Various strategies, including centralized electrical and electronic architecture and the innovative Chiplet Systems, are under exploration. The latter, breaking down System-on-Chips (SoCs) into functional units, offers unparalleled customization and integration possibilities. The research accentuates the crucial open Chiplet ecosystem, fostering collaboration and enhancing supply chain resilience. In this paper, we address the unique challenges that arise when attempting to leverage chiplet-based architecture to design a holistic silicon solution for the automotive industry. We propose a throughput-oriented micro-architecture for ADAS and infotainment systems alongside a novel methodology to evaluate chiplet architectures. Further, we develop in-house simulation tools leveraging the gem5 framework to simulate latency and throughput. Finally, we perform an extensive design of thermally-aware chiplet placement and develop a micro-fluids-based cooling design.”

Find the technical paper here. Published May 2024.

Narashiman, Swathi, Divyaratna Joshi, Deepak Sridhar, Harish Rajesh, Sanjay Sattva, and Varun Manjunath. “Chiplets on Wheels: Review Paper on Holistic Chiplet Solutions for Autonomous Vehicles.” arXiv preprint arXiv:2406.00182 (2024).

The post Adoption of Chiplet Technology in the Automotive Industry appeared first on Semiconductor Engineering.

Making Adaptive Test Work Better

One of the big challenges for IC test is making sense of mountains of data, a direct result of more features being packed onto a single die, or multiple chiplets being assembled into an advanced package. Collecting all that data through various agents and building models on the tester no longer makes sense for a couple of reasons — there is too much data, and there are multiple customers using the same equipment. Steve Zamek, director of product management at PDF Solutions, and Eli Roth, product manager at Teradyne, explain how to optimize testing around different data sources, how to partition that data between the edge and the cloud, and how to ensure it remains secure.

The post Making Adaptive Test Work Better appeared first on Semiconductor Engineering.

Chip Industry Week In Review

JEDEC and the Open Compute Project rolled out a new set of guidelines for standardizing chiplet characterization details, such as thermal properties, physical and mechanical requirements, and behavior specs. Those details have been a sticking point for commercial chiplets, because without them it’s not possible to choose the best chiplet for a particular application or workload. The guidelines are a prerequisite for a multi-vendor chiplet marketplace.

AMD, Broadcom, Cisco, Google, HPE, Intel, Meta, and Microsoft proposed a new high-speed, low-latency interconnect specification, Ultra Accelerator Link (UALink), between accelerators and switches in AI computing pods. The 1.0 specification will enable the connection of up to 1,024 accelerators within a pod and allow for direct loads and stores between the memory attached to accelerators.

Arm debuted a range of new CPUs, including the Cortex-X925 for on-device generative AI, and the Cortex-A725 with improved efficiency for AI and mobile gaming. It also announced the Immortalis-G925 GPU for flagship smartphones, and the Mali-G725/625 GPUs for consumer devices. Additionally, Arm announced Compute Subsystems (CSS) for Client to provide foundational computing elements for AI smartphone and PC SoCs, and it introduced KleidiAI, a set of compute kernels for developers of AI frameworks. The Armv9-A architecture also added support for the Scalable Matrix Extension to accelerate AI workloads.

TSMC said its 2nm process is on target to begin mass production in 2025. Meanwhile, Samsung is expected to release its 1nm plan next month, targeting mass production for 2026 — a year ahead of schedule, reports Business Korea.

CHIPs for America and NATCAST released a 2024 roadmap for the U.S. National Semiconductor Technology Center (NSTC), identifying priorities for facilities, research, workforce development, and membership.

China is investing CNY 344 billion (~$47.5 billion) into the third phase of its National Integrated Circuit Industry Investment Fund, also known as the Big Fund, to support its semiconductor sector and supply chain, according to numerous reports.

Malaysia plans to invest $5.3 billion in seed capital and support for semiconductor manufacturing in an effort to attract more than $100 billion in foreign investments, reports Reuters. Prime Minister Anwar Ibrahim announced the effort to create at least 10 companies focused on IC design, advanced packaging, and equipment manufacturing.

imec demonstrated a die-to-wafer hybrid bonding flow for Cu-Cu and SiCN-SiCN at pitches down to 2µm at the IEEE’s ECTC conference. This breakthrough could enable die and wafer-level optical interconnects.

The chip industry is racing to develop glass for advanced packaging, setting the stage for one of the biggest shifts in chip materials in decades — and one that will introduce a broad new set of challenges that will take years to fully resolve.

Quick links to more news:

In-Depth
Global
Product News
Markets and Money
Security
Research and Training
Quantum
Events and Further Reading


In-Depth

Semiconductor Engineering published its Systems & Design newsletter featuring these top stories:


Global

STMicroelectronics is building a fully integrated SiC facility in Catania, Italy.  The high-volume 200mm facility is projected to cost over $5 billion.

Siliconware Precision Industries Co., Ltd. (SPIL) broke ground on an RM 6 billion (~$1.3 billion) advanced packaging and testing facility in Malaysia. Also, Google will invest $2 billion in Malaysia for its first data center and a Google Cloud hub to meet growing demand for cloud services and AI literacy programs, reports AP.

In an SEC filing, Applied Materials disclosed that it received additional subpoenas from the U.S. Department of Commerce's (DoC) Bureau of Industry and Security related to shipments of advanced semiconductor equipment to China. This comes on the heels of similar subpoenas issued last year.

A Chinese contractor working for SK hynix was arrested in South Korea and is being charged with funneling more than 3,000 copies of a paper on solving process failure issues to Huawei, reports South Korea’s Union News.

VSORA, CEA-Grenoble, and Valeo were awarded $7 million from the French government to build low-latency, low-power AI inference co-processors for autonomous driving and other applications.

In the U.S., the National Highway Traffic Safety Administration (NHTSA) is investigating unexpected driving behaviors of vehicles equipped with Waymo‘s 5th Generation automated driving system (ADS), with details of nine new incidents on top of the first 22.


Product News

ASE introduced powerSIP, a power delivery platform designed to reduce signal and transmission loss while addressing current density challenges.

Infineon announced a roadmap for energy-efficient power supply units based on Si, SiC, and GaN to address the energy needs of AI data centers, featuring new 8 kW and 12 kW PSUs, in addition to the 3 kW and 3.3 kW units available today. The company also released its CoolSiC MOSFET 400 V family, specially developed for use in the AC/DC stage of AI servers, complementing the PSU roadmap.

Fig. 1: Infineon’s 8kW PSU. Source: Infineon

Infineon also introduced two new generations of high voltage (HV) and medium voltage (MV) CoolGaN devices, enabling customers to use GaN in voltage classes from 40 V to 700 V. The devices are built using Infineon's 8-inch foundry processes.

Ansys launched Ansys Access on Microsoft Azure to provide pre-configured simulation products optimized for HPC on Azure infrastructure.

Foxconn Industrial Internet used Keysight Technologies' Open RAN Studio solution to certify an outdoor Open Radio Unit (O-RU).

Andes Technology announced an SoC and development board for the development and porting of large RISC-V applications.

MediaTek uncorked a pair of mobile chipsets built on a 4nm process that use an octa-core CPU consisting of 4X Arm Cortex-A78 cores operating at up to 2.5GHz paired with 4X Arm Cortex-A55 cores.

The NVIDIA H200 platform is expected to begin shipping in Q3 of 2024 and will be available to data centers by Q4, according to TrendForce.

A room-temperature direct fusion hybrid bonding system from Be Semiconductor has shipped to the NHanced advanced packaging facility in North Carolina. The new system offers faster throughput for copper interconnects with submicron pad sizes, greater accuracy and reduced warpage.


Markets and Money

Frore Systems raised $80 million for its solid-state active cooling module, which removes heat from the top of a chip without fans. The device targets systems ranging from notebooks and network edge gateways to data centers.

Axus Technology received $12.5 million in capital equity funding to make its chemical mechanical planarization (CMP) equipment for semiconductor wafer polishing, thinning, and cleaning, including of silicon carbide (SiC) wafers.

Elon Musk’s xAI announced a series B funding round of $6 billion.

Micron was ordered to pay $445 million in damages to Netlist for patent infringement of the company’s DDR4 memory module technology between 2021 and 2024.

Global revenue from AI semiconductors is predicted to total $71 billion in 2024, up 33% from 2023, according to Gartner. In 2025, it is expected to jump to $91.9 billion. The value of AI accelerators used in servers is expected to total $21 billion in 2024 and reach $33 billion by 2028.

NAND flash revenue was $14.71 billion in Q1 2024, an increase of 28.1%, according to TrendForce.

The optical transceiver market dipped from $11 billion in 2022 to $10.9 billion in 2023, but it is predicted to reach $22.4 billion by 2029, driven by AI, 800G applications, and the transition to 200G/lane ecosystem technologies, reports Yole.

Yole also found that ultra-wideband technical choices and packaging types used by NXP, Apple, and Qorvo vary considerably, ranging from 7nm to 90nm, with both CMOS and finFET transistors.

The global market share of GenAI-capable smartphones increased to 6% in Q1 2024 from 1.3% in the previous quarter, reports Counterpoint. The premium segment accounted for over 70% of sales with Samsung on top and contributing 58%. Meanwhile, global foldable smartphone shipments were up 49% YoY in Q1 2024, led by Huawei, HONOR, and Motorola.


Security

The National Science Foundation awarded Worcester Polytechnic Institute researcher Shahin Tajik nearly $600,000 to develop new technologies to address hardware security vulnerabilities.

The Hyperform consortium was formed to develop European sovereignty in post-quantum cryptography, funded by the French government and EU credits. Members include IDEMIA Secure Transactions, CEA Leti, and the French cybersecurity agency (ANSSI).

In security research:

  • University of California Davis and University of Arizona researchers proposed a framework leveraging generative pre-trained transformer (GPT) models to automate the obfuscation process.
  • Columbia University and Intel researchers presented a secure digital low dropout regulator that integrates an attack detector and a detection-driven protection scheme to mitigate correlation power analysis.
  • Pohang University of Science and Technology (POSTECH) researchers analyzed threshold switch devices and their performance in hardware security.

The U.S. Defense Advanced Research Projects Agency (DARPA) seeks proposals for its AI Quantified program to develop technology to help deploy generative AI safely and effectively across the Department of Defense (DoD) and society.

Vanderbilt University and Oak Ridge National Laboratory (ORNL) partnered to develop dependable AI for national security applications.

The Cybersecurity and Infrastructure Security Agency (CISA) issued a number of alerts/advisories.


Research and Training

New York continues to amp up its semiconductor offerings. NY CREATES and Raytheon unveiled a semiconductor workforce training program. And Syracuse University is hosting a free virtual course focused on the semiconductor industry this summer.

In research news:

  • A team of researchers at MIT and other universities found that extreme temperatures up to 500°C did not significantly degrade GaN materials or contacts.
  • University of Cambridge researchers developed adaptive and eco-friendly sensors that can be directly and imperceptibly printed onto biological surfaces, such as a finger or flower petal.
  • Researchers at Rice University and Hanyang University developed an elastic material that moves like skin and can adjust its dielectric frequency to stabilize RF communications and counter disruptive frequency shifts that interfere with electronics when a substrate is twisted or stretched, with potential for stretchable wearable electronic devices.

The National Science Foundation (NSF) awarded $36 million to three projects chosen for their potential to revolutionize computing. The University of Texas at Austin-led project aims to create a next-gen open-source intelligent and adaptive OS. The Harvard University-led project targets sustainable computing. The University of Massachusetts Amherst-led project will develop computational decarbonization.


Quantum

Singapore will invest close to S$300 million (~$222 million) into its National Quantum Strategy to support the development and deployment of quantum technologies, including an initiative to design and build a quantum processor within the country.

Several quantum partnerships were announced:

  • Riverlane and Alice & Bob will integrate Riverlane’s quantum error correction stack within Alice & Bob’s larger quantum computing system based on cat qubit technology.
  • New York University and the University of Copenhagen will collaborate to explore the viability of hybrid superconductor-semiconductor quantum materials for the production of quantum chips and integration with CMOS processes.
  • NXP, eleQtron, and ParityQC showed off a full-stack, ion-trap based quantum computer demonstrator for Germany’s DLR Quantum Computing Initiative.
  • Photonic says it demonstrated distributed entanglement between quantum modules using optically-linked silicon spin qubits with a native telecom networking interface as part of a quantum internet effort with Microsoft.
  • Classiq and HPE say they developed a rapid method for solving large-scale combinatorial optimization problems by combining quantum and classical HPC approaches.

Events and Further Reading

Find upcoming chip industry events here, including:

Event | Date | Location
Hardwear.io Security Trainings and Conference USA 2024 | May 28 – Jun 1 | Santa Clara, CA
SWTest | Jun 3 – 5 | Carlsbad, CA
IITC2024: Interconnect Technology Conference | Jun 3 – 6 | San Jose, CA
VOICE Developer Conference | Jun 3 – 5 | La Jolla, CA
CHIPS R&D Standardization Readiness Level Workshop | Jun 4 – 5 | Online and Boulder, CO
SNUG Europe: Synopsys User Group | Jun 10 – 11 | Munich
IEEE RAS in Data Centers Summit: Reliability, Availability and Serviceability | Jun 11 – 12 | Santa Clara, CA
3D & Systems Summit | Jun 12 – 14 | Dresden, Germany
PCI-SIG Developers Conference | Jun 12 – 13 | Santa Clara, CA
AI Hardware and Edge AI Summit: Europe | Jun 18 – 19 | London, UK
DAC 2024 | Jun 23 – 27 | San Francisco
Find All Upcoming Events Here

Upcoming webinars are here, including an integrated SLM analytics solution, prototyping and validation of perception sensor systems, and improving PCB designs for performance and reliability.


Semiconductor Engineering’s latest newsletters:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials

The post Chip Industry Week In Review appeared first on Semiconductor Engineering.

Will Domain-Specific ICs Become Ubiquitous?

Questions are surfacing for all types of design, ranging from small microcontrollers to leading-edge chips, over whether domain-specific design will become ubiquitous, or whether it will fall into the historic pattern of customization first, followed by lower-cost, general-purpose components.

Custom hardware always has been a double-edged sword. It can provide a competitive edge for chipmakers, but often requires more time to design, verify, and manufacture a chip, which can sometimes cost a market window. In addition, it’s often too expensive for all but the most price-resilient applications. This is a well-understood equation at the leading edge of design, particularly where new technologies such as generative AI are involved.

But with planar scaling coming to an end, and with more features tailored to specific domains, the chip industry is struggling to figure out whether the business/technical equation is undergoing a fundamental and more permanent change. This is muddied further by the fact that some 30% to 35% of all design tools today are being sold to large systems companies for chips that will never be sold commercially. In those applications, the collective savings from improved performance per watt may dwarf the cost of designing, verifying, and manufacturing a highly optimized multi-chip/multi-chiplet package across a large data center, leaving the debate about custom vs. general-purpose more uncertain than ever.

“If you go high enough in the engineering organization, you’re going to find that what people really want to do is a software-defined whatever it is,” says Russell Klein, program director for high-level synthesis at Siemens EDA. “What they really want to do is buy off-the-shelf hardware, put some software on it, make that their value-add, and ship that. That paradigm is breaking down in a number of domains. It is breaking down where we need either extremely high performance, or we need extreme efficiency. If we need higher performance than we can get from that off-the-shelf system, or we need greater efficiency, we need the battery to last longer, or we just can’t burn as much power, then we’ve got to start customizing the hardware.”

Even the selection of processing units can make a solution custom. “Domain-specific computing is already ubiquitous,” says Dave Fick, CEO and cofounder of Mythic. “Modern computers, whether in a laptop, phone, security camera, or in farm equipment, consist of a mix of hardware blocks co-optimized with software. For instance, it is common for a computer to have video encode or decode hardware units to allow a system to connect to a camera efficiently. It is common to have accelerators for encryption so that we can safely communicate. Each of these is co-optimized with software algorithms to make commonly used functions highly efficient and flexible.”

Steve Roddy, chief marketing officer at Quadric, agrees. “Heterogeneous processing in SoCs has been de rigueur in the vast majority of consumer applications for the past two decades or more.  SoCs for mobile phones, tablets, televisions, and automotive applications have long been required to meet a grueling combination of high-performance plus low-cost requirements, which has led to the proliferation of function-specific processors found in those systems today.  Even low-cost SoCs for mobile phones today have CPUs for running Android, complex GPUs to paint the display screen, audio DSPs for offloading audio playback in a low-power mode, video DSPs paired with NPUs in the camera subsystem to improve image capture (stabilization, filters, enhancement), baseband DSPs — often with attached NPUs — for high speed communications channel processing in the Wi-Fi and 5G subsystems, sensor hub fusion DSPs, and even power-management processors that maximize battery life.”

It helps to separate what is considered general-purpose from what is application-specific. “There is so much benefit to be had from running your software on dedicated hardware, what we call bespoke silicon, because it gives you an advantage over your competitors,” says Marc Swinnen, director of product marketing in Ansys’ Semiconductor Division. “Your software runs faster, lower power, and is designed to run specifically what you want to run. It’s hard for a competitor with off-the-shelf hardware to compete with you. Silicon has become so central to the business value, the business model, of many companies that it has become important to have that optimized.”

There is a balance, however. “If there is any cost justification in terms of return on investment and deployment costs, power costs, thermal costs, cooling costs, then it always makes sense to build a custom ASIC,” says Sharad Chole, chief scientist and co-founder of Expedera. “We saw that for cryptocurrency, we see that right now for AI. We saw that for edge computing, which requires extremely ultra-low power sensors and ultra-low power processes. But there also has been a push for general-purpose computing hardware, because then you can easily make the applications more abstract and scalable.”

Part of the seeming conflict is due to the scope of specificity. “When you look at the architecture, it’s really the scope that determines the application specificity,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “Domain-specific computing is ubiquitous now. The important part is the constant moving up of the domain specificity to something more complex — from the original IP, to configurable IP, to subsystems that are configurable.”

In the past, it has been driven more by economics. “There’s an ebb and a flow to it,” says Paul Karazuba, vice president of marketing at Expedera. “There’s an ebb and a flow to putting everything into a processor. There’s an ebb and a flow to having co-processors, augmenting functions that are inside of that main processor. It’s a natural evolution of pretty much everything. It may not necessarily be cheaper to design your own silicon, but it may be more expensive in the long run to not design your own silicon.”

An attempt to formalize that ebb and flow was made by Tsugio Makimoto in the 1990s, while he was at Hitachi (he later served as CTO of Sony). He observed that electronics cycled between custom solutions and programmable ones approximately every 10 years. What’s changed is that most custom chips from the time of his observation contained highly programmable standard components.

Technology drivers
Today, it would appear that technical issues will decide this. “The industry has managed to work around power issues and push up the thermal envelope beyond points I personally thought were going to be reasonable, or feasible,” says Elad Alon, co-founder and CEO of Blue Cheetah. “We’re hitting that power limit, and when you hit the power limit it drives you toward customization wherever you can do it. But obviously, there is tension between flexibility, scalability, and applicability to the broadest market possible. This is seen in the fast pace of innovation in the AI software world, where tomorrow there could be an entirely different algorithm, and that throws out almost all the customizations one may have done.”

The slowing of Moore’s Law will have a fundamental influence on the balance point. “There have been a number of bespoke silicon companies in the past that were successful for a short period of time, but then failed,” says Ansys’ Swinnen. “They had made some kind of advance, be it architectural or addressing a new market need, but then the general-purpose chips caught up. That is because there’s so much investment in them, and there’s so many people using them, there’s an entire army of people advancing, versus your company, just your team, that’s advancing your bespoke solution. Inevitably, sooner or later, they bypass you and the general-purpose hardware just gets better than the specific one. Right now, the pendulum has swung toward custom solutions being the winner.”

However, general-purpose processors do not automatically advance if companies don’t keep up with adoption of the latest nodes, and that leads to even more opportunities. “When adding accelerators to a general-purpose processor starts to break down, because you want to go faster or become more efficient, you start to create truly customized implementations,” says Siemens’ Klein. “That’s where high-level synthesis starts to become really interesting, because you’ve got that software-defined implementation as your starting point. We can take it through high-level synthesis (HLS) and build an accelerator that’s going to do that one specific thing. We could leave a bunch of registers to define its behavior, or we can just hard code everything. The less general that system is, the more specific it is, usually the higher performance and the greater efficiency that we’re going to take away from it. And it almost always is going to be able to beat a general-purpose accelerator or certainly a general-purpose processor in terms of both performance and efficiency.”
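A rough software analogy of the specialization payoff Klein describes is the contrast below (not HLS output; the two-tap filter and its coefficients are arbitrary stand-ins for a real workload). Fixing parameters at build time removes run-time work, just as hard-coding registers in an HLS flow removes hardware:

```python
def fir_general(samples, coeffs):
    # General-purpose form: coefficients arrive at run time, like
    # programmable registers in a configurable accelerator.
    n = len(coeffs)
    return [sum(coeffs[j] * samples[i - j] for j in range(n))
            for i in range(n - 1, len(samples))]

def fir_specialized(samples):
    # Specialized form: coefficients [1, 2] are baked in, so the loop,
    # the indexing, and one multiply disappear entirely -- the software
    # equivalent of hard-coding behavior into the datapath.
    return [samples[i] + 2 * samples[i - 1] for i in range(1, len(samples))]

x = [3, 1, 4, 1, 5, 9]
assert fir_general(x, [1, 2]) == fir_specialized(x)
```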

At the same time, IP has become massively configurable. “There used to be IP as the building blocks,” says Arteris’ Schirrmeister. “Since then, the industry has produced much larger and more complex IP that takes on the role of sub-systems, and that’s where scope comes in. We have seen Arm with what they call the compute sub-systems (CSS), which are an integration and then hardened. People care about the chip as a whole, and then the chip and the system context with all that software. Application specificity has become ubiquitous in the IP space. You either build hard cores, you use a configurable core, or you use high-level synthesis. All of them are, by definition, application-specific, and the configurability plays in there.”

Put in perspective, there is more than one way to build a device, and an increasing number of options for getting it done. “There’s a really large market for specialized computing around some algorithm,” says Klein. “IP for that is going to be both in the form of discrete chips, as well as IP that could be built into something. Ultimately, that has to become silicon. It’s got to be hardened to some degree. They can set some parameters and bake it into somebody’s design. Consider an Arm processor. I can configure how many CPUs I want, I can configure how big I want the caches, and then I can go bake that into a specific implementation. That’s going to be the thing that I build, and it’s going to be more targeted. It will have better efficiency and a better cost profile and a better power profile for the thing that I’m doing. Somebody else can take it and configure it a little bit differently. And to the degree that the IP works, that’s a great solution. But there will always be algorithms that don’t have a big enough market for IP to address. And that’s where you go in and do the extreme customization.”

Chiplets
Some have questioned if the emerging chiplet industry will reverse this trend. “We will continue to see systems composed of many hardware accelerator blocks, and advanced silicon integration technologies (i.e., 3D stacking and chiplets) will make that even easier,” says Mythic’s Fick. “There are many companies working on open standards for chiplets, enabling communication bandwidth and energy efficiency that is an order of magnitude greater than what can be built on a PCB. Perhaps soon, the advanced system-in-package will overtake the PCB as the way systems are designed.”

Chiplets are not likely to be highly configurable. “Configuration in the chiplet world might become just a function of switching off things you don’t need,” says Schirrmeister. “Configuration really means that you do not use certain things. You don’t get your money back for those items. It’s all basically applying math and predicting what your volumes are going to be. If it’s an incremental cost that has one more block on it to support another interface, or making the block the Ethernet block with time triggered stuff in it for automotive, that gives you an incremental effort of X. Now, you have to basically estimate whether it also gives you a multiple of that incremental effort as incremental profit. It works out this way because chips just become very configurable. Chiplets are just going in the direction of finding the balance of more generic usage so that you can apply them in more chiplet designs.”

The chiplet market is far from certain today. “The promise of chiplets is that you use only the function that you want from the supplier that you want, in the right node, at the right location,” says Expedera’s Karazuba. “The idea of specialization and chiplets are at arm’s length. They’re actually together, but chiplets have a long way to go. There’s still not that universal agreement of the different things around a chiplet that have to be in order to make the product truly mass market.”

While chiplets have been proven to work, nearly all of the chiplets in use today are proprietary. “To build a viable [commercial] chiplet company, you have to be going after a broad enough market, large enough from a dollar perspective, then you can make all the investment, have success and get everything back accordingly,” says Blue Cheetah’s Alon. “There’s a similar tension where people would like to build a general-purpose chiplet that can be used anywhere, by anyone. That is the plug-and-play discussion, but you could finish up with something that becomes so general-purpose, with so much overhead, that it’s just not attractive in any particular market. In the chiplet case, for technical reasons, it might not actually really work that way at all. You might try to build it for general purpose, and it turns out later that it doesn’t plug into particular sockets that are of interest.”

The economics of chiplet viability have not yet been defined. “The thing about chiplets is they can be small,” says Klein. “Being small means that we don’t need as big a market for them as we would for a very large chip. We can also build them on different technologies. We can have some that are on older technologies, where transistors are cheaper, and we can combine those with other chiplets that might be leading-edge nodes where we could have general-purpose CPUs or NPU accelerators. There’s a mix-and-match, and we can do chiplets smaller than we can general-purpose chips. We can do smaller runs of them. We can take that IP and customize it for a particular market vertical and create some chiplets for that, change the configuration a bit, and do another run for something else. There’s a level of customization that can be deployed and supported by the market that’s a little bit more than we’ve seen in full-size chips, where the entire thing has to be built into one package.”

Conclusion
What it means for a design to be general-purpose or custom is changing. All designs will contain some of each. Some companies will develop novel architectures using general-purpose processors, and these will be better than a fully general-purpose solution. Others will create highly customized hardware for some functions that are known to be stable, and general purpose for things that are likely to change. One thing has never changed, however. A company is not likely to add more customization than necessary to satisfy the needs of the market they are targeting.

Further Reading
Challenges With Chiplets And Power Delivery
Benefits and challenges in heterogeneous integration.
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

The post Will Domain-Specific ICs Become Ubiquitous? appeared first on Semiconductor Engineering.

SRAM Security Concerns Grow

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with network-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felten, director of Princeton University’s Center for Information Technology Policy; J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan; and colleagues. The breakthrough in their research was based on the growing realization in the engineering research community that data does not vanish from memory the moment a device is turned off, which until then was a common assumption. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”
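To make those numbers concrete, a toy exponential-decay model shows how strongly the observed survival rate constrains the effective decay time constant. The model and the room-temperature constant below are illustrative assumptions, not measured device data:

```python
import math

def surviving_fraction(t_s, tau_s):
    # Toy model: each bit decays independently with time constant tau.
    return math.exp(-t_s / tau_s)

def implied_tau(survival, t_s):
    # Time constant implied by observing `survival` fraction intact at t_s.
    return -t_s / math.log(survival)

# The chilled-chip observation above: >99% of bits intact after 10 minutes.
print(f"implied cold time constant: {implied_tau(0.99, 600):,.0f} s")

# With a room-temperature constant on the order of seconds (say 5 s),
# essentially nothing survives even a one-minute attack window.
print(f"room-temp survival after 60 s: {surviving_fraction(60, 5):.1e}")
```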

Unfortunately, despite nearly 20 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. And because cold boot takes advantage of slowed memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST) described a way to undo the slowdown with an ultra-fast data sanitization method that works within 5 ns, using back bias to control the device parameters of CMOS.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether the data are reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, has inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”
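A minimal sketch of the enrollment-and-reconstruction idea Yun describes might look like the following, with a simulated power-up fingerprint, a few noisy cells, and majority voting standing in for the error-correcting helper data a production PUF would use. All values here are simulated:

```python
import hashlib
import random

random.seed(1)                 # fixed seed stands in for silicon randomness
N_CELLS = 256
fingerprint = [random.getrandbits(1) for _ in range(N_CELLS)]  # power-up bias
noisy = set(random.sample(range(N_CELLS), 12))                 # unstable cells

def power_up_read():
    # One simulated power-on: stable cells follow the device fingerprint,
    # noisy cells occasionally flip.
    return [b ^ 1 if (i in noisy and random.random() < 0.3) else b
            for i, b in enumerate(fingerprint)]

def derive_key(n_reads=9):
    # Majority-vote repeated power-up reads to cancel noise, then hash the
    # voted pattern into a key. Real products use fuzzy extractors with
    # helper data rather than simple voting.
    reads = [power_up_read() for _ in range(n_reads)]
    voted = [1 if sum(r[i] for r in reads) * 2 > n_reads else 0
             for i in range(N_CELLS)]
    return hashlib.sha256(bytes(voted)).hexdigest()

print(derive_key())  # stable across "boots" of the same simulated device
```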

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”

These precautions also apply when working with DRAM. In all situations, Seshadri said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as a second layer of protection against known and novel attacks, so an organization’s assets will still be protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the system’s case or hardware to prevent an attacker’s access,” said Jim Montgomery, market development director, semiconductor, at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that regardless of trying to dump the memory, or physically removing the memory, the encryption keys will remain secure.”

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria based upon the SEMI E187 and E188 standards to assist DMs and OEMs in implementing secure procedures for system security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem, in the post-quantum era, is that encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption (HE), which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.
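The core property is easy to demonstrate with textbook RSA, which happens to be multiplicatively homomorphic. The toy below is insecure as written (no padding, tiny primes) and merely illustrates computing on ciphertexts; practical HE schemes such as BGV or CKKS are far more capable:

```python
# Textbook (unpadded) RSA: Enc(a) * Enc(b) mod n decrypts to a * b.
p, q = 1009, 1013                   # toy primes; real keys are 2048+ bits
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

a, b = 12, 34
c = (enc(a) * enc(b)) % n           # compute on ciphertexts only...
assert dec(c) == a * b              # ...and decryption yields the product
print(dec(c))                       # 408
```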

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM.  However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, so HE solutions designed for cryptographers can be impractical. Solutions that require algorithmic and cryptographic expertise inhibit adoption by those who lack these skills.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted widespread use as soon as the next few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM. The reason is that it’s far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech, and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, the decision about how much security to include in a design is an economic one. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM”, Technical report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. Han, SJ., Han, JK., Yun, GJ. et al. Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack. Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

Overcoming Chiplet Integration Challenges With Adaptability

Chiplets are exploding in popularity due to key benefits such as lower cost, lower power, higher performance and greater flexibility to meet specific market requirements. More importantly, chiplets can reduce time-to-market, thus decreasing time-to-revenue! Heterogeneous and modular SoC design can accelerate innovation and adaptation for many companies. What’s not to like about chiplets? Well, as chiplets come to fruition, we are starting to realize many of the complications of chiplet designs.

Fig. 1: Expanded chiplet system view.

The interface challenge

The primary concept of chiplets is the integration of ICs (integrated circuits) from multiple companies. However, many of these ICs were not designed with interoperation with other ICs fully in mind, partially because of a lack of interconnect standards around chiplets. Moreover, ICs have their own computational and bandwidth requirements. This is further complicated as competing interface standards vie for adoption, as shown in Table 1.

Standard | Throughput | Density | Max Delay
Advanced Interface Bus (Intel, AIB) | 2 Gbps | 504 Gbps/mm | 5 ns
Bandwidth Engine | 10.3 Gbps | N/A | 2.4 ns
BoW (Bunch of Wires) | 16 Gbps | 1280 Gbps/mm | 5 ns
HBM3 (JEDEC) | 4.8 Gbps | N/A | N/A
Infinity Fabric (AMD) | 10.6 Gbps | N/A | 9 ns
LIPINCON (TSMC) | 2.8 Gbps | 536 Gbps/mm | 14 ns
Multi-Die I/O (Intel) | 5.4 Gbps | 1600 Gbps/mm | N/A
XSR/USR (Rambus) | 112 Gbps | N/A | N/A
UCIe | 32 Gbps | 1350 Gbps/mm | 2 ns

Table 1: Chiplet interconnect options.
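A rough way to compare the density figures in Table 1 is shoreline arithmetic: aggregate die-to-die bandwidth scales roughly with linear density times the die edge devoted to the interface. The snippet below applies the table's densities to a hypothetical 5 mm edge budget:

```python
EDGE_MM = 5.0  # hypothetical die-edge budget for the D2D interface

# Linear ("shoreline") densities in Gbps/mm, taken from Table 1.
options = {"BoW": 1280, "UCIe": 1350, "Multi-Die I/O (Intel)": 1600}

for name, density in options.items():
    total_gbps = density * EDGE_MM
    print(f"{name}: {total_gbps:,.0f} Gbps (~{total_gbps / 1000:.2f} Tbps)")
```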

Most chiplet interconnects are dominated by UCIe (Universal Chiplet Interconnect Express) and the unimaginatively named BoW (Bunch of Wires). UCIe introduced the 1.0 spec, and as with any first edition of a specification, it was inevitable that updates would follow. UCIe 1.1 fixes several holes and gaps in 1.0, addressing gray areas, missing definitions, ECNs, and more. It is very likely not the last update, as UCIe’s vision is to grow up the stack, adding additional protocol layers on top of the system layers.

Because of the newness and expected evolution of the UCIe and BoW protocols, designing them in is risky. Additionally, there will always be a place for multiple die-to-die interfaces beyond UCIe. Specific use cases and designs will inherently be matched to different metrics, leading many designs to fall back to proprietary interfaces.

As you can see, there are many choices, and many of them involve tradeoffs. Integrating a variety of these protocols into a chiplet design greatly benefits from data/protocol adaptation, which can easily be enabled with embedded programmable logic, or eFPGA. A lightweight protocol shim implemented in eFPGA IP can not only reformat data but also buffer it to maximize internal processing. Finally, consider that data between ICs in a chiplet can be globally asynchronous, another task easily resolved with FIFO synchronizers in eFPGA IP.
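As a behavioral illustration of such a shim (a dataflow model only, not RTL, and not tied to any particular protocol), the sketch below buffers narrow input beats in a FIFO and repacks them to a wider consumer word:

```python
from collections import deque

class WidthShim:
    # Behavioral model: a real shim would be RTL mapped into the eFPGA
    # fabric. This only models the buffering and width adaptation.
    def __init__(self, in_bits=32, out_bits=128):
        self.in_bits, self.out_bits = in_bits, out_bits
        self.fifo = deque()   # decouples producer and consumer rates
        self.acc = 0          # bit accumulator
        self.filled = 0       # bits currently in the accumulator

    def push(self, word):
        # Accept one narrow beat from the producer die.
        self.acc |= (word & ((1 << self.in_bits) - 1)) << self.filled
        self.filled += self.in_bits
        while self.filled >= self.out_bits:
            self.fifo.append(self.acc & ((1 << self.out_bits) - 1))
            self.acc >>= self.out_bits
            self.filled -= self.out_bits

    def pop(self):
        # Hand one wide word to the consumer die, if available.
        return self.fifo.popleft() if self.fifo else None

shim = WidthShim()
for beat in (0xAAAA0001, 0xBBBB0002, 0xCCCC0003, 0xDDDD0004):
    shim.push(beat)            # four 32-bit beats...
print(hex(shim.pop()))         # ...emerge as one 128-bit word
```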

The security challenge

Beyond the interfaces, security is another emerging challenge. A few aspects of chiplets must be carefully considered:

  • Varying ICs from unknown, and possibly disreputable, manufacturers
  • ICs that can contain internal IP from additional third-party sources
  • Each IC may receive and introduce external data into the system

Naturally, this begs for attestation and provenance to ensure vendor confidence. As such, root of trust generally starts with the supply chain and auditing all vendors. However, it only takes one failed component, the least secure component, to jeopardize the entire system.

Root of trust suddenly becomes an issue, and it raises another question: which IC, or ICs, in the chiplet manage the root of trust? As we’ve seen time and time again, security threats evolve at an alarming rate. But chiplets have an opportunity here. Again, embedded FPGAs have the flexibility to adapt, thwarting these evolving security threats. eFPGA IP can also physically disable unused interfaces, minimizing the attack surface.

Adaptable cryptography cores can perform a variety of tasks with high performance in eFPGA IP. These tasks include authentication/digital signing, key generation, encapsulation/decapsulation, random number generation, and much more. Further, post-quantum security cores that run very efficiently on eFPGA are becoming available. Figure 2 shows an ML-KEM (Kyber) encapsulation module from Xiphera that fits into only four Flex Logix EFLX tiles, efficiently packed at 98% utilization with a throughput of over 2 Gbps.

Fig. 2: ML-KEM IP core from Xiphera implemented on Flex Logix EFLX eFPGA IP.

Managing all data communication within a chiplet seems daunting; however, it is feasible. Designers have the choice of implementing eFPGA on every IC in the chiplet for adaptable data signing, or standalone on the interposer, where system designers can define a secure enclave in which all data is authenticated and encrypted by an independent IC with eFPGA. eFPGA can also process streaming data at a very high rate, and in most cases it can keep up with line rate, as seen with programmable data planes in SmartNICs.

eFPGA can add another critical security benefit. Every instance of eFPGA in the chiplet offers the ability to obfuscate critical algorithms, cryptography and protocols. This enables manufacturers to protect design secrets by not only programming these features in a controlled environment, but also adapting these as threats evolve.

The validation problem

Again, the absence of fully defined industry standards presents integration challenges. Conventional methods of qualification, testing, and validation become increasingly complex. Yet this becomes another opportunity for eFPGA IP. It can be configured as an in-system diagnostic tool that provides testing, debugging, and observability, not only during IC bring-up but also during run time, eliminating finger-pointing between independent companies.

The reconfigurability solution

While we’ve discussed a few different chiplet issues and solutions with adaptable eFPGA, it is important to realize that a single instance of this IP can perform all these functions in a chiplet, as eFPGA IP is completely reconfigurable. It can be time-sliced and configured differently during specific operational phases of the chiplet. As mentioned in the examples above, during IC bring-up it can provide insightful debug visibility into the system. During boot, it can enable secure boot and attested firmware updates to all ICs in the chiplet. During run time, it can perform cryptographic functions as well as independently manage a secure enclave environment. eFPGA is also perfect for any other software acceleration your applications need, as its heavily parallel and pipelined nature is well suited to complex signal processing tasks. Lastly, during an RMA process it can help investigate and determine system failures. This is just a short list of the features eFPGA IP can enable in a chiplet.

Customizable for the perfect solution

Flex Logix EFLX IP delivers excellent PPA (Power, Performance and Area) and is available on the most advanced nodes, including Intel 18A and TSMC 7nm and 5nm. Furthermore, Flex Logix eFPGA IP is scalable – enabling you to choose the best balance of programmable logic, embedded memory and signal processing resources.

Fig. 3: Scalable Flex Logix eFPGA IP.

Want to learn more about Flex Logix IP? Contact us at [email protected] or visit our website https://flex-logix.com.

The post Overcoming Chiplet Integration Challenges With Adaptability appeared first on Semiconductor Engineering.

UCIe And Automotive Electronics: Pioneering The Chiplet Revolution

The automotive industry stands at the brink of a profound transformation fueled by the relentless march of technological innovation. Gone are the days of the traditional, one-size-fits-all system-on-chip (SoC) design framework. Today, we are witnessing a paradigm shift towards a more modular approach that utilizes diverse chiplets, each optimized for specific functionalities. This evolution promises to enhance the automotive system’s flexibility and efficiency and revolutionize how vehicles are designed, built, and operated.

At the heart of this transformation lies the Universal Chiplet Interconnect Express (UCIe), a groundbreaking standard introduced in March 2022. UCIe is designed to drastically simplify the integration process across different chiplets from various manufacturers by standardizing die-to-die connections. This initiative caters to a critical need within the industry for a modular and scalable semiconductor architecture, thus setting the stage for unparalleled innovation in automotive electronics.

Understanding the core benefit of UCIe

UCIe isn’t merely about facilitating smoother communication between chiplets; it’s a visionary standard that ensures interoperability, reduces design complexity and costs, and, crucially, supports the seamless incorporation of chiplets into cohesive packages. Developed through the collaborative efforts of the UCIe consortium, this standard enables the use of chiplets from different vendors, thereby fostering a highly customizable and scalable solution. This mainly benefits the automotive sector, where the demand for high performance, reliability, and efficiency is paramount.

The pivotal role of UCIe in automotive electronics

Implementing UCIe within automotive electronics unlocks a plethora of advantages. Foremost among these is the ability to design more compact, powerful, and energy-efficient electronic systems. UCIe offers savings in energy per bit compared to other serial or parallel interfaces.

Given the increasing complexity and functionality of modern vehicles — which can be likened to data centers on wheels — the modular design principle of UCIe is invaluable. It facilitates the addition of new functionalities and ensures that automotive electronics can seamlessly adapt to and incorporate emerging technologies and standards.

Second, the adoption of UCIe marks a significant stride toward sustainable electronic design within the automotive industry. By promoting the reuse and integration of chiplets across different platforms and vehicle models, UCIe significantly mitigates electronic waste and streamlines the lifecycle management of electronic components. This benefits manufacturers in terms of cost and efficiency, and aligns with broader industry trends focused on sustainability and environmental stewardship.

Offerings and solutions

Cadence offers a range of Intellectual Property (IP), Electronic Design Automation (EDA) tools, and 3D Integrated Circuit (3D-IC) solutions tailored for chiplets.

Since the introduction of UCIe, Cadence has engaged with more than 100 customers and has worked within the UCIe working groups to enable automotive solutions. The automotive implementation features protect the mainband, and Cadence adds protection for the sideband as well, which is particularly important for the automotive industry. UCIe is being developed across multiple foundries, process nodes, and both standard and advanced packaging types.

Cadence has a long history of chiplet experience through an interface IP portfolio that includes die-to-die interconnects such as the UCIe IP and the proprietary 40G UltraLink D2D PHY, enabling SoC providers to deliver more customized solutions with higher performance and yields while also shortening development cycles and reducing costs through greater IP reuse. Cadence’s UCIe PHY solutions facilitate high-speed communication and interoperability among chiplets. Additionally, the Cadence Simulation VIP for UCIe ensures that systems using UCIe are reliable and performant, addressing one of the industry’s key concerns.

The Cadence UCIe PHY is a high-bandwidth, low-power, and low-latency die-to-die solution that enables multi-die system-in-package integration for high-performance compute, AI/ML, 5G, automotive, and networking applications. The UCIe physical layer includes link initialization, training, power management states, lane mapping, lane reversal, and scrambling. The UCIe controller includes the die-to-die adapter layer and the protocol layer. The adapter layer ensures reliable transfer through link state management and parameter negotiation of protocol and flit formats. The UCIe architecture supports multiple standard protocols, such as PCIe, CXL, and streaming raw mode.
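As a purely conceptual illustration of the adapter layer's parameter negotiation, the toy Python sketch below picks the highest-preference protocol common to two dies. The capability sets and preference order are illustrative assumptions and are not taken from the UCIe specification.

```python
# Toy model of UCIe-style link bring-up: the die-to-die adapter negotiates a
# protocol common to both dies. Capability sets and the preference order are
# illustrative only, not from the UCIe specification.
PROTOCOL_PREFERENCE = ["CXL", "PCIe", "streaming-raw"]

def negotiate(local_caps: set[str], remote_caps: set[str]) -> str:
    """Pick the highest-preference protocol both sides support."""
    common = local_caps & remote_caps
    for proto in PROTOCOL_PREFERENCE:
        if proto in common:
            return proto
    raise RuntimeError("no common protocol: link training fails")

die_a = {"PCIe", "CXL", "streaming-raw"}
die_b = {"PCIe", "streaming-raw"}
print(negotiate(die_a, die_b))  # -> PCIe
```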

For interface IP, we usually create test chips in various process technologies and measure the silicon in the lab to prove that our controller and PHY IPs fully comply with the standard. For UCIe, we designed a test chip with seven chiplets connected over different interconnect distances.

The Cadence Verification IP (VIP) for Universal Chiplet Interconnect Express (UCIe) is designed for easy integration in test benches at the IP, system-on-chip (SoC), and system level. The VIP for UCIe runs on all simulators and supports SystemVerilog and the widely adopted Universal Verification Methodology (UVM). This enables verification teams to reduce the time spent on environment development and redirect it to cover a larger verification space, accelerate verification closure, and ensure end-product quality. With a layered architecture and powerful callback mechanism, verification engineers can verify UCIe features at each functional layer (PHY, D2D, Protocol) and create highly targeted designs while using the latest design methodologies for random testing to cover a larger verification space. The VIP for UCIe can be used as a standalone stack or layered with PCIe VIP.

Challenges and future prospects

The transition to chiplet-based designs and the widespread adoption of the UCIe standard are not without their challenges. Critical concerns include functional safety, reliability, quality, security, thermal management, and mechanical stress. Hence, ensuring dependable high-speed interconnects between chiplets necessitates ongoing research and development. Successful implementation also hinges on industry-wide collaboration and establishing a robust chiplet ecosystem that supports the UCIe standard.

UCIe stands as a pioneering innovation poised to redefine automotive electronics. By offering a scalable and flexible framework, UCIe promises to enhance in-vehicle electronic systems’ performance, efficiency, and versatility and marks a significant milestone in the automotive industry’s evolution. As we look to the future, the impact of UCIe in the automotive sector is poised to be profound, turning vehicles into dynamic platforms that continuously adapt to technological advancements and user needs. The realization of modular, customizable designs underscored by efficiency, performance, and innovation heralds the dawn of a new era in semiconductor development, one where the possibilities are as boundless as the technological horizon.

The post UCIe And Automotive Electronics: Pioneering The Chiplet Revolution appeared first on Semiconductor Engineering.

SRAM Security Concerns Grow

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with software-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felten, director of Princeton University’s Center for Information Technology Policy, J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan, and colleagues. The breakthrough in their research was the realization that data does not vanish from memory the moment a device is turned off, as was commonly assumed until then. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”

Unfortunately, despite more than 15 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. Because cold boot depends on slowing memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST) described a countermeasure: an ultra-fast data sanitization method that erases SRAM contents within 5 ns, using back bias to control the CMOS device parameters.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data are reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, has inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”
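A rough sketch of the enrollment and reconstruction flow Yun describes (power-up state plus public helper data yielding a stable key) appears below. The repetition code, bit counts, and error pattern are simplifications chosen for readability; production PUFs use stronger error-correcting codes and entropy extraction.

```python
# Minimal SRAM-PUF fuzzy-extractor sketch using a 5x repetition code.
# Bit counts, the error pattern, and the code choice are illustrative only.
import hashlib
import random

random.seed(1)
N = 128  # key bits, each protected by a 5-bit repetition code

# Enrollment: snapshot the power-up state, then publish helper data
# helper = response XOR codeword(key). Helper data alone reveals no key bits.
enrolled = [random.getrandbits(1) for _ in range(N * 5)]
key_bits = [random.getrandbits(1) for _ in range(N)]
codeword = [b for b in key_bits for _ in range(5)]
helper = [p ^ c for p, c in zip(enrolled, codeword)]

# A later power-up: some cells flip due to noise. Here one bit flips in every
# fourth group, and a single error per 5-bit group is always correctable.
noisy = enrolled[:]
for g in range(0, N, 4):
    noisy[g * 5] ^= 1

# Reconstruction: XOR off the helper data, then majority-vote each group.
received = [n ^ h for n, h in zip(noisy, helper)]
recovered = [int(sum(received[i * 5:(i + 1) * 5]) >= 3) for i in range(N)]
assert recovered == key_bits  # noise corrected by the repetition code

# Hash the corrected bits (one int per bit here) into the final key.
key = hashlib.sha256(bytes(recovered)).digest()
print(key.hex())
```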

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.’”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”

These precautions also apply when working with DRAM. In all situations, Seshadri said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as a second layer of protection against known and novel attacks, so that an organization’s assets remain protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the system’s case or hardware to prevent an attacker’s access,” said Jim Montgomery, market development director, semiconductor, at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that, whether an attacker tries to dump the memory or physically remove it, the encryption keys will remain secure.”
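As a rough illustration of the software side of what Montgomery describes, the sketch below keeps a memory "page" only in encrypted form, using AES-GCM from the widely used Python cryptography package. The key-slot handling and the idea of binding ciphertext to its address are illustrative assumptions, not a specific product's design.

```python
# Sketch: a memory "page" held only as ciphertext, so a cold-boot dump of the
# buffer is useless without the key. Requires: pip install cryptography
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# In real hardware the key would live in a fused or derived key slot,
# never in DRAM; generating it here is a simulation shortcut.
key = AESGCM.generate_key(bit_length=256)
aead = AESGCM(key)

def store_page(plaintext: bytes, page_addr: int) -> bytes:
    nonce = os.urandom(12)
    # Bind the ciphertext to its address so encrypted pages can't be swapped.
    ct = aead.encrypt(nonce, plaintext, page_addr.to_bytes(8, "little"))
    return nonce + ct

def load_page(blob: bytes, page_addr: int) -> bytes:
    nonce, ct = blob[:12], blob[12:]
    return aead.decrypt(nonce, ct, page_addr.to_bytes(8, "little"))

page = store_page(b"secret model weights", 0x4000)
assert load_page(page, 0x4000) == b"secret model weights"
```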

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria based upon the SEMI E187 and E188 standards to assist DMs and OEMs in implementing secure procedures for systems security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem is that, in the post-quantum era, encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption (HE), which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM.  However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, so HE solutions that are designed for cryptographers can be impractical for them and inhibit adoption.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted, widespread use could follow within a few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”
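To make the homomorphic property concrete, the toy sketch below implements textbook Paillier encryption, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The parameters are deliberately tiny and insecure, and Paillier is a partially (not fully) homomorphic scheme; this illustrates the principle only, not any production HE library.

```python
# Toy Paillier cryptosystem: Enc(a) * Enc(b) mod n^2 decrypts to a + b,
# so a server can sum values it can never read. Tiny primes for readability;
# real deployments use ~2048-bit moduli and vetted libraries.
import math
import random

p, q = 61, 53
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)          # Python 3.9+
g = n + 1

def L(x: int) -> int:
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # modular inverse (Python 3.8+)

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 123, 456
c_sum = (encrypt(a) * encrypt(b)) % n2  # addition performed on ciphertexts
assert decrypt(c_sum) == a + b
print(decrypt(c_sum))  # 579, computed without decrypting a or b individually
```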

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM, because it is far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech, and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, the decision about how much security to include in a design is an economic one. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM,” Technical Report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. S.-J. Han, J.-K. Han, G.-J. Yun, et al., “Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack,” Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

Sensor Fusion Challenges In Automotive

The number of sensors in automobiles is growing rapidly alongside new safety features and increasing levels of autonomy. The challenge is integrating them in a way that makes sense, because these sensors are optimized for different types of data, sometimes with different resolution requirements even for the same type of data, and frequently with very different latency, power consumption, and reliability requirements. Pulin Desai, group director for product marketing, management and business development at Cadence, talks about challenges with sensor fusion, the growing importance of four-dimensional sensing, what’s needed to future-proof sensor designs, and the difficulty of integrating one or more software stacks with conflicting requirements.

The post Sensor Fusion Challenges In Automotive appeared first on Semiconductor Engineering.

Electromigration Concerns Grow In Advanced Packages

The incessant demand for more speed in chips requires forcing more energy through ever-smaller devices, increasing current density and threatening long-term chip reliability. While this problem is well understood, it’s becoming more difficult to contain in leading-edge designs.

Of particular concern is electromigration, which is becoming more troublesome in advanced packages with multiple chiplets, where various bonding and interconnect schemes create abrupt changes in materials and geometries. For example, electrons may travel from a copper trace to a solder bump of SAC (tin-silver-copper), then to an underbump metal based on nickel, and finally to an interposer copper pad. That, in turn, can cause atoms to shift, resulting in failures in solder joints or in copper redistribution layers in high-density fan-out packages.

“From an electromigration perspective, advanced packaging causes increased packaging density, reduced packaging size, and the dimensions of interconnects to shrink, so the current density is now in close proximity to the maximum current density limit per EM design rules,” said Dermott Lynch, director of technical product management in Synopsys‘ EDA Group.

Any additional stresses the package may be subjected to during assembly and use, whether mechanical or thermal, also can help induce or accelerate electromigration. “Electromigration, in general, gets worse due to temperature and stress, both of which advanced packaging increases,” said Lynch. “Electromigration is also cumulative, so essentially it integrates all the temperature highs and stress over the lifetime until an interconnect breaks down or shorts. Larger processing temperature and operation temperature will make it worse, but it also depends on time under that temperature.”

In fact, managing thermal pathways is perhaps the greatest challenge associated with the movement toward the ultimate package, a 3D-IC. “Electromigration is very temperature-sensitive,” said Marc Swinnen, director of product marketing in Ansys’ Semiconductor Division. “Depending on your thermal map, your power integrity will have to adapt to the local temperature profile that you have. So when you look at a chip, you can calculate how much power the chip is putting out, but you cannot tell how hot the chip will get because ‘it depends.’ Is it sitting on a cold plate or sitting in the sun in the Sahara? System concerns come in, and multi-physics modeling is important to understanding these co-dependent effects.”

Thermal engineering also means moving heat away from the most vulnerable points of failure, such as solder bumps. “Effective thermal management is essential for bump reliability,” said Curtis Zwenger, vice president of engineering and technical marketing at Amkor. “Engineers are incorporating thermal enhancement techniques, such as the use of thermal interface materials and advanced heat dissipation solutions, to ensure that bumps are not subjected to excessive temperature-related stresses.”

Zwenger noted that engineers are looking into new materials, while optimizing the use of existing materials to minimize the possibility of electromigration. “Semiconductor packaging engineers are implementing a range of measures to enhance bump reliability and maximize bump yield. These strategies include new materials for solder bumps and underbump metallization, optimizing bump size, pitch and shape for reliability, advanced process control methods to control variability and maximize yield, and simulating and modeling reliability.”

What is electromigration?
Electromigration is the mass transport of metal atoms caused by the electron wind from current flowing through a conductor, typically copper. When current density is high enough, metal will diffuse in the direction of current flow, creating tiny hillocks downstream and leaving behind vacancies or voids. With enough electromigration, failures occur due to severe line thinning, causing opens, or due to hillocks that bridge adjacent lines, causing short circuits.

Electromigration is a diffusion-controlled mechanism that can take three forms — bulk, grain boundary, or surface diffusion, depending on the metal. Aluminum migrates by grain boundary diffusion whereas copper migrates on the surface or at its grain boundaries.

For most of the semiconductor industry’s history, electromigration was primarily an on-chip concern, and today on-chip EM is largely under control by reliability engineers. With the scaling and rapid developments in advanced packaging, however — implementing TSVs, fan-out packaging with redistribution layers, and copper pillar bumps — electromigration has emerged as a major threat at the package level. Current flowing through a solder bump causes Joule heating, and heat from other parts of the package also may dissipate through the solder bumps. EM can become an issue for solder joint connections between chip and interposer, or chip and PCB, as well as in RDLs. Solder joint failures typically manifest as voids or cracks.

Fig. 1: Electromigration can create short circuits between two interconnects through the development of hillocks, or an open circuit through the creation of voids in interconnect. Source: Ansys

Electromigration progresses more quickly at higher temperatures, at higher currents, under greater mechanical stress, and in the presence of defects or impurities in the metal. Black’s equation describes an interconnect’s mean time-to-failure (MTTF) with respect to its temperature, current density, and the activation energy needed to dislodge a metal atom as:

MTTF = A · J^(−n) · exp(Ea / (k·T))

Here J is the current density, n is the current-density exponent, k is Boltzmann’s constant, T is temperature, Ea is the activation energy, and A is a scaling factor that depends on the metal’s properties. Black’s equation is useful because it easily shows how shorter, wider interconnects will tend to have longer MTTF. In addition, electromigration time-to-failure depends very strongly on the interconnect’s temperature. That temperature is primarily the result of the chip’s environmental temperature, self-heating of the conductor caused by current flow, the heat from neighboring interconnects or transistors, and the thermal conductivity of the surrounding material.
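To get a feel for how steep that temperature dependence is, the short calculation below evaluates relative MTTF ratios from Black's equation. The current-density exponent n = 2 and activation energy Ea = 0.9 eV are assumed here as representative textbook values for copper, not figures from any study cited in this article.

```python
# Relative MTTF from Black's equation: MTTF = A * J**(-n) * exp(Ea / (k*T)).
# n = 2 and Ea = 0.9 eV are assumed, representative values for copper.
import math

k = 8.617e-5  # Boltzmann constant in eV/K

def relative_mttf(J: float, T_celsius: float, n: float = 2.0,
                  Ea: float = 0.9) -> float:
    T = T_celsius + 273.15
    return J ** (-n) * math.exp(Ea / (k * T))  # prefactor A cancels in ratios

base = relative_mttf(J=1.0, T_celsius=105)
hot = relative_mttf(J=1.0, T_celsius=125)
print(f"105C -> 125C at constant current: lifetime x{hot / base:.2f}")
# ~x0.25: a 20 degree rise cuts expected lifetime to about a quarter

dense = relative_mttf(J=1.5, T_celsius=105)
print(f"1.5x current density at 105C: lifetime x{dense / base:.2f}")  # ~x0.44
```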

It is also important to note that electromigration is a runaway process. As current density and/or temperature rises, electromigration increases and thins the conductor; the reduced cross-section concentrates the same current in less metal, which raises local current density and temperature further, causing still more metal to migrate in a destructive feedback loop.

EM failure modes and allowable current density
In the case of copper redistribution layers in polyimide, heat accumulates in the conductor due to Joule heating as current flows through the RDL, which can degrade performance. As the required current density and Joule heating temperature increase in fine-line Cu RDL structures (<5µm lines and spaces), self-heating is considered a key factor in the reliability of high-density fan-out packages.

JiHye Kwon, senior manager of R&D at Amkor, recently used EM testing and Black’s equation to determine the electromigration failure mechanisms for a given RDL stack and high-density fan-out package with 2µm or 10µm wide RDL layers, 1,000µm long. [1]

High-density fan-out is an emerging technology, featuring more aggressive scaling than wafer-level fan-out packages. The three layers of copper RDL (3µm thick with Ta/Cu seed) were fabricated, followed by polyimide fill, copper pillar deposition, die attach, and overmold. Kwon’s team tested both the 2µm and 10µm RDLs at different current densities and temperatures until resistance increased by 100% (the EM failure criterion), while the maximum allowed current density corresponded to a 20% resistance increase. The failures occurred in two stages: first void nucleation and growth, and second copper reduction and oxidation. The study yielded activation energy and current density exponent values that can be useful in future RDL designs.
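Extracting such parameters reduces to a linear least-squares fit, because taking the logarithm of Black's equation makes it linear in ln A, n, and Ea. The sketch below demonstrates the procedure on synthetic measurements; the numbers are invented for illustration and are not data from the Amkor study.

```python
# Fitting Black-equation parameters from EM test data. Taking logs gives
# ln(MTTF) = ln(A) - n*ln(J) + Ea/(k*T), which is linear in the unknowns.
# All "measurements" below are synthetic.
import numpy as np

k = 8.617e-5  # eV/K
true_A, true_n, true_Ea = 1e-3, 1.8, 1.0

rng = np.random.default_rng(0)
J = np.array([5e3, 5e3, 1e4, 1e4, 2e4, 2e4])              # A/cm^2
T = np.array([423.0, 453.0, 423.0, 453.0, 433.0, 453.0])  # K
mttf = true_A * J ** (-true_n) * np.exp(true_Ea / (k * T))
mttf *= rng.lognormal(sigma=0.05, size=J.size)            # measurement scatter

# Columns multiply the unknowns [ln A, n, Ea] respectively.
X = np.column_stack([np.ones_like(J), -np.log(J), 1.0 / (k * T)])
(lnA, n_fit, Ea_fit), *_ = np.linalg.lstsq(X, np.log(mttf), rcond=None)
print(f"n = {n_fit:.2f} (true {true_n}), Ea = {Ea_fit:.2f} eV (true {true_Ea})")
```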

Meanwhile, a team of researchers from ASE recently demonstrated how susceptibility to electromigration is determined for copper pillar interconnects in flip-chip quad flat no-lead (FCQFN) packages for high-power automotive applications. The multi-layered copper pillar bumps with a Cu/Ni/Sn1.8Ag configuration were bonded to a silver-plated copper leadframe and tested under extreme EM conditions of 10 kA/cm² current density and temperatures of 150°C, 160°C, and 180°C, while taking in-situ resistance measurements. [2] The EM failures appeared as rapid rises in electrical resistance that coincided with the formation of intermetallic compounds and voids at the Cu/solder interfaces. The team built an EM prediction model of the interconnects based on a Black-type EM equation, following the JEDEC standard with five test conditions.

From statistical analysis of the sample lifetimes, the ASE team determined the activation energy of the Cu pillar interconnects in the FCQFN package to be 1.12 ± 0.03 eV. The maximum allowable current for the Cu pillar interconnects to last 10 years at a 105°C operating temperature with a 0.1% failure rate was greater than 2A. “The FCQFN package has great potential in terms of its excellent anti-EM performance for future high-power applications,” the article said.

Designing/manufacturing for EM resiliency
Building electromigration resilience into advanced devices begins with using only EM-compliant linewidths in circuit designs based on the current density and heat profile that the interconnects will experience during operation over the lifetime of the device. Electromigration mitigation also requires process and materials engineering to ensure durability, for instance, of copper pillar bumps under BGA packages. It also calls for an optimized assembly process window and tight process control to prevent tiny violations of design rules that can later precipitate as EM failures.

As the industry makes its way toward true 3D packages, and eventually 3D-ICs, it seems clear that modeling and simulation will play an increasing role in determining many of the guard rails for manufacturing and assembly before manufacturing and assembly even begins. “Reliability modeling and simulation tools are being used to better understand the reliability of bump structures. This proactive approach helps in identifying potential issues before they arise, enabling engineers to implement preventive measures,” said Zwenger.

Modeling and simulation at the system level also will be essential to understanding the complex interplay between reliability mechanisms with thermal and mechanical stress in multi-chiplet systems during operation.

“Electromigration for stacked die is challenging,” said Synopsys’ Lynch. “Localized, die-to-die workloads cause repetitive current flow in specific areas. This generates local heat, increasing EM resulting in wire degradation, while producing even more heat. Reducing the thermal issue becomes critical to ensuring EM reliability.”

As stated previously, solder bumps can become a site for EM reliability failure. “Engineers fine-tune bump design in terms of bump size, pitch, and shape to ensure uniformity and reliability across the entire package. This includes the adoption of innovative Cu bump structures for improved mechanical and electrical properties,” said Amkor’s Zwenger.

In flip-chip BGA and other flip-chip applications, underfill materials — typically thermoset epoxies — are used to reduce the thermal stresses on solder bumps. “Underfill materials play a critical role in providing mechanical support and thermal stability to the bumps,” Zwenger said. “Engineers are investing in the development of advanced underfill formulations with enhanced properties, such as improved adhesion, thermal conductivity, and stress relief.”

Conclusion
Because of its dependence on temperature, electromigration is a failure mechanism to watch and plan for as devices continue to scale and systems integrators continue to cram more and more chiplets of various functions into advanced packages.

“In advanced technologies, the current density is now in close proximity to the maximum density,” said Synopsys’ Lynch. “Anything that causes an increase in temperature poses a threat. Designers of multi-die systems need to understand the impact of temperature and design systems to remove the heat.”

References

  1. JiHye Kwon, “Electromigration Performance Of Fine-Line Cu Redistribution Layer (RDL) For HDFO Packaging,” Semiconductor Engineering, Jan. 18, 2024, https://semiengineering.com/electromigration-performance-of-fine-line-cu-redistribution-layer-rdl-for-hdfo-packaging/
  2. -Y. Tsai, et al., “An Electromigration Study of Cu Pillar Interconnects in Flip-chip QFN Packaging under Extreme Conditions for High-power Applications,” 2023 IEEE 25th Electronics Packaging Technology Conference (EPTC), Singapore, 2023, pp. 326-332, doi: 10.1109/EPTC59621.2023.10457564.

Related Reading
What Can Go Wrong In Heterogeneous Integration
Workflows and tools are disconnected, mechanical stress is ill-defined, and complete co-planarity is nearly impossible. But there are solutions on the horizon.
Thermal Integrity Challenges Grow In 2.5D
Work is underway to map heat flows in interposer-based designs, but there’s much more to be done.
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

The post Electromigration Concerns Grow In Advanced Packages appeared first on Semiconductor Engineering.

What Works Best For Chiplets

The semiconductor industry is preparing for the migration from proprietary chiplet-based systems to a more open chiplet ecosystem, in which chiplets fabricated by different companies, using various technologies and device nodes, can be integrated in a single package with acceptable yield.

To make this work as expected, the chip industry will have to solve a variety of well-documented technical and business issues, and it will have to rein in some of the grander visions of what’s possible — at least initially. The basic challenge is aligning domain-specific performance demands of end systems, which contain a growing number of chiplets, with the assembly and packaging capabilities and methodologies of IDMs, foundries, and OSATs. This includes the creation of assembly development kits (ADKs) that are roughly the equivalent of process development kits (PDKs), which today are codified with manufacturing specifications.

A PDK provides the appropriate level of detail needed to develop planar chips, marrying design tools with fab processes to achieve a predictable outcome. But making this work for an ADK with heterogeneous chiplets is many times more complex. Design and assembly teams need to manage thermal, mechanical, and electrical co-dependencies that cause electrical and mechanical stress, resulting in warpage, reduced yield, and reliability issues under real-world workloads. Layered on top of this are the business and legal issues related to packaging different devices from different manufacturers.

“Chiplets are a growing trend, especially in the HPC and networking segments, with potential to scale to other applications,” said Gabriela Pereira, technology and market analyst for semiconductor packaging at Yole Intelligence. “The industry has understood that high-end advanced packaging technologies are needed to connect them — but that’s much more complex than it seems. Connecting chiplets requires the design of high-bandwidth interconnections at the package level, which can take different forms — e.g., 2D, 2.5D or 3D — while ensuring that the thermal and power requirements are fulfilled.”

Commercial chiplet-based devices generally are domain-specific, and sometimes developed for a specific workload. So despite a big industry push to create a LEGO-like mix-and-match ecosystem for chiplets — which today includes multiple IP and EDA vendors, foundries, memory suppliers, OSATs, substrate suppliers, etc. — making this work as planned will require time and a massive amount of work.

Fig. 1: System assembly requires tighter coupling between chipmakers and OSATs. Source: ASE

In creating heterogeneous integrated designs, it’s essential to have much tighter collaboration between foundries, IDMs, OSATs, and PCB manufacturers. And because each chiplet-based system will be customized, the number of assembly processes will grow substantially. For example, one OSAT noted that among its ~5,000 customers, there are ~1,000 different assembly processes.

That diversity in products and processes makes it difficult to achieve predictable results by choosing chiplets from a large menu of options.

“We’ve already encountered a lot of limitations including not only the silicon, but also integration and the ecosystem,” said Lihong Cao, senior director at ASE Group, at MEPTEC’s Road to Chiplets forum. She stressed that customers continue to push for a low-cost chiplet assembly process, which is creating constructive tension between developing a sophisticated assembly process and the economic realities of different industry sectors. Computing devices for automotive have a higher cost sensitivity than for data centers, for example, but their chips operate in a harsher environment over a longer lifetime.

What’s needed is a defined set of assembly process recipes — basically, a highly limited menu of choices — that are specific to the end application (HPC, automotive, RF telecommunications) in order to lower the cost of chiplet-based systems. OSATs and foundries already are moving in that direction for high-performance computing. For example, at its 2024 Direct Connect event, Intel shared its six different package processes for chiplets. TSMC and Samsung also offer defined sets of chiplet processes. But the success of these assembly processes requires engineering teams to co-optimize the flows, processes, and materials to best match the system requirements.

Fig. 2: Integrated platform development requires tightly coupled architectural analysis that co-optimizes the system design to architecture to assembly process and packaging material selections. Source: Applied Materials

“Previously, when we designed a system we only had to be worried about the system requirements. Once we start segregating into dies and reassembling them, we have to start looking at other things. We have to worry about putting them together while considering signal integrity between dies, reliability, thermals, etc.,” said Itai Leshniak, director of AI systems solutions at Applied Materials, at the MEPTEC forum. “If we take the case of AI-based computer vision, we can break it down layer by layer — on the hardware side, determining which computer vision processors, sensors, and filters are needed to break it down into the architecture at each layer. Then we begin to go through how to package all these chiplets, and then which materials to use and how to take advantage of those materials.”

Materials and assembly processes
Conceptually, design engineers will use chiplets to design a system. However, the co-design and integration are far more complicated than assembling a set of LEGO blocks, because the chiplets, interposers, and package substrates come from different design houses and manufacturing facilities. The advanced packaging technologies used to connect chiplets vary, with an alphabet soup of names — FOWLP, FOPLP, CoWoS, etc. — each of which poses additional design and material choices along with certain process limitations.

Fig. 3: There are a multitude of choices in multi-die packaging from the high-level layout to substrates, materials, bonding methods, and cooling materials. Source: Synopsys

Currently engineering teams determine the tradeoffs among the different packaging options to select materials, derive a process recipe, and determine design rules.

Materials are a good starting point. “Materials are very important because they enable new products and packaging technologies,” said Tanja Braun, deputy group manager at the Fraunhofer Institute for Reliability and Microintegration IZM. “As you move into more advanced packaging, the process is getting much more complex because you are putting more things together. In the end, it’s a combination of equipment, materials, and process development.”

There are three thermal parameters that are critical in package assembly processes — the coefficient of thermal expansion (CTE), the glass transition temperature (Tg), and thermal conductivity. These factors affect how a material behaves from manufacturing through packaging processes, as well as how it behaves in the field.
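A first-order feel for why CTE matters comes from a two-line estimate: mismatch strain is the CTE difference multiplied by the temperature excursion. The sketch below uses common handbook values for silicon and copper; treat it as a scaling argument, not a substitute for full thermo-mechanical simulation.

```python
# First-order CTE-mismatch estimate between a silicon die and a copper-rich
# substrate across a reflow excursion. Handbook values, illustrative only.
cte_si = 2.6e-6    # 1/K, silicon
cte_cu = 16.7e-6   # 1/K, copper
delta_T = 250.0 - 25.0  # K, reflow peak down to room temperature

strain = (cte_cu - cte_si) * delta_T  # dimensionless mismatch strain
span_mm = 25.0  # die edge length, mm

print(f"mismatch strain: {strain:.2e}")  # ~3.2e-03
print(f"relative displacement over {span_mm} mm: "
      f"{strain * span_mm * 1e3:.0f} um")  # ~79 um
```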

“Challenges for our materials include temperature limitations of different die,” said Rama Puligadda, CTO at Brewer Science. “We have to ensure that the temperatures used for bonding materials don’t exceed the thermal limitations of any of the chips that are being integrated into the package. Additionally, there may be some subsequent processes like redistribution layer (RDL) formation or molding. Our materials have to survive those processes. They have to survive the chemicals they come in contact with throughout the packaging process scheme. Mechanical stresses in the package add additional challenges for bonding materials.”

Within a stack of chiplets on a substrate, with an optional interposer, the materials’ attributes also affect the thermo-mechanical stresses between neighboring materials. This directly impacts interconnect dimensional control over a large substrate area.

“If you go work the numbers, you will find that the level of tolerance and control required is frightening,” said Dick Otte, CEO of Promex Industries. “You’re talking about controlling dimensions equivalent to the width of a grass blade over the length of a football field, so that’s roughly 1 in 100,000.”

The goal is uniform heating of the structure in reflow in order to attain the best process results and to avoid cracking. “When you’re taking it through a 250 degrees centigrade temperature change, then you need to heat up slowly so that the top doesn’t get hot before the bottom does,” said Otte.

Multi-physics to comprehend co-optimization
Multi-physics modeling has become the go-to method for co-optimizing packaging design and assembly process development. That affects both permanent and temporary materials, as well as the placement of processors, memories, and other components.

“You’re always looking at what the customer needs electrically, because that’s going to help define the material set. The material set is broadly applicable to a bunch of speed ranges. As long as you don’t step outside of those electrical specifications, theoretically you should be okay,” said Mike Kelly, vice president of advanced package and technology integration at Amkor Technology.

To save many iterations of empirically based development, engineers can use physics-based simulations to understand the impact of a material set’s properties on the assembly process, power/thermals, and mechanical vibrations.

Consider that HPC chiplet products can consume ~1,000 watts at peak performance, so the power and thermal interactions need to be fully understood.
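
A back-of-the-envelope calculation shows why kilowatt-class packages force thermal co-design. Treating the package as a single lumped thermal resistance θja gives Tj = Ta + P·θja; the θja values below are assumed for illustration only:

```python
def junction_temp(power_w: float, ambient_c: float, theta_ja: float) -> float:
    """Lumped steady-state estimate: Tj = Ta + P * theta_ja (°C/W)."""
    return ambient_c + power_w * theta_ja

# Assumed junction-to-ambient resistances (°C/W); illustrative only.
for label, theta in [("high-end air cooling", 0.15), ("liquid cold plate", 0.05)]:
    tj = junction_temp(1000.0, 35.0, theta)
    print(f"{label}: Tj ~ {tj:.0f}°C")  # ~185°C vs. ~85°C: air cooling fails
```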

“We’ve struggled, as everybody has, with this blizzard of complexity in the different techniques. Not only do they vary across different vendors, but they’re also varying over time,” said Marc Swinnen, director of product marketing at Ansys. “Our approach has been to identify the essentials that need to be worked on. We work jointly with customers to develop a simulation flow that actually achieves what is needed now.”

Materials are just one piece of the puzzle. “Then there’s the assembly stresses that need to be modeled to know whether you can correctly assemble this device. The third one is mechanical vibration,” Swinnen said. “Can your device withstand those regular vibrations? Modeling these attributes ties directly into our mechanical analysis tools — acoustic, thermal, vibration, etc. In the end, you’re going to have to do physics simulation. We’re trying to make it accessible to people in many different forms. But the bedrock of our tool offerings is that we have the meshing simulation and analysis. It’s a question of getting the data in the right format in a way that’s practical and usable.”

Evolving assembly design kits
For conventional packages, OSATs provide design rules for each packaging technology. These need to consider electrical, mechanical, and thermal design requirements, as well as manufacturing process limitations. In effect, this is a multi-dimensional bounding box. Suppliers perform iterations with the customer to create a product-specific process recipe.

Rules cover the macro-level attributes. “At a minimum, what you see from design rules is maximum package size, maximum silicon size, and whether silicon can be [mounted] on both sides of the substrate, such that when you follow these constructions the final product will have a lifetime of 1,000 thermal cycles, for example,” said Fraunhofer’s Braun.

In addition, design rules need to describe routing constraints for the interposer and/or redistribution layer, such as RDL line widths and spaces, ball-grid/pillar/pad size and pitches, and the maximum number of interconnections.
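
In ADK form, macro-level rules like these reduce to machine-checkable limits. A minimal sketch with a hypothetical rule set and design attributes (real ADKs are macro libraries and databases tied to EDA tools, as described below):

```python
# Hypothetical macro-level assembly design rules; all limits illustrative.
RULES = {
    "max_package_mm": 55.0,
    "min_rdl_line_um": 2.0,
    "min_bump_pitch_um": 130.0,
}

def check_design(design: dict) -> list[str]:
    """Return a list of design-rule violations (empty list = clean)."""
    violations = []
    if design["package_mm"] > RULES["max_package_mm"]:
        violations.append("package exceeds maximum body size")
    if design["rdl_line_um"] < RULES["min_rdl_line_um"]:
        violations.append("RDL line width below minimum")
    if design["bump_pitch_um"] < RULES["min_bump_pitch_um"]:
        violations.append("bump pitch below minimum")
    return violations

print(check_design({"package_mm": 60.0, "rdl_line_um": 2.0, "bump_pitch_um": 150.0}))
```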

Breaking up a monolithic HPC device into multiple dies shifts some of the semiconductor design/process complexity into the packaging space. That makes things much more complicated. Consider that connecting 10 dies can require on the order of 100,000 traces within the interposer’s or substrate’s redistribution layers.
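
The order of magnitude follows from simple combinatorics. Assuming, purely for illustration, that all 10 dies are pairwise linked and each die-to-die interface carries about 2,000 signals:

```python
from math import comb

dies = 10
signals_per_link = 2000          # assumed width of one die-to-die interface
links = comb(dies, 2)            # fully connected topology: 45 links
print(links * signals_per_link)  # 90,000 traces -- order of 100,000
```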

To cope with the complexity at the chip level, the IC industry has long relied upon process design kits (PDKs) to capture design rules in an electronic file that can be imported into EDA tools. Their counterparts, assembly design kits (ADKs), are relatively immature.

“We call it Smart Package,” said Amkor’s Kelly. “It’s an ADK that we give to every customer who’s doing their own design. It is a set of macros, and a customization of a database tailored to a customer’s particular design. For chiplets, it is a high-density fan-out package technology. And it’s cognizant of the limitations for metal density and metal spacing, etc. This makes it easier for us to do design rule checks (DRCs).”

But right now, with the level of customization still required, how an ADK is derived and what it entails is in flux. Partnerships between EDA tool vendors, OSATs, and semiconductor device providers are required.

“We come from the IC world where everything is very rigid,” said Kenneth Larsen, director of 3D-IC product management in Synopsys‘ EDA Group. “On the OSAT side, and maybe this is because it’s so custom, design rules seem like a data sheet. Then you build and optimize the products over time or in collaboration with the OSAT. It’s not an electronic exchange. In the IC world, this would be totally unheard of. While it is possible to tweak a few things, you have a qualification process. And it seems like that’s not there yet for packaging.”

Materials and associated assembly recipes ultimately drive what’s possible for a chiplet-substrate stack in terms of pillar pitch, RDL line widths and spaces, bonding processes, and chiplet placement tolerances. But within a handful of ADKs, there are many possible interactions to consider.

The current focus is on co-optimizing the system design with the chiplet assembly process, leading to an assembly process development flow (see figure 4). This flow accounts for the customization an assembly process needs, and it creates the necessary design rules to be used by package designers.

Fig. 4: Chip-package hybrid flow. Source: ASE

“First you need to define your structure using chiplets. Are you using substrate RDL, 2.5D RDL, or a bridge? After that you need to consider your structure’s materials. What kind of material do you choose to fulfill your electrical performance and mechanical stress requirements?” said Cao. “After that, you do pre-analysis to ensure all the structures and materials you use are workable in terms of electrical performance, warpage, and mechanical stress.”

The design planning flow also includes evaluation of the die-to-die interconnects through the documents used for co-design sign-off.

Conclusion
Before chiplet-based designs can be enabled outside the IDM model, the industry needs to complete the ecosystem that bridges the manufacturing and design complexity, because co-optimizing the system architecture around materials, process, and integration capabilities is essential. While this would be easier with a set of well-defined products for the chiplet ecosystem to drive toward, that has not happened yet.

Engineering teams across the design and manufacturing stack will need to collaborate to choose the appropriate materials, architectures, processes, etc., to develop a final chiplet-based product that is designable. As ASE Group’s Cao noted, “An integrated design and manufacturing ecosystem is important. It is very critical to have collaboration among IDMs, vendors, and materials suppliers. Everyone needs to work together to really enable integration for the real applications.”

Related stories
Fan-Out Packaging Gets Competitive
Manufacturability reaches sufficient level to compete with flip-chip BGA and 2.5D.

Inside Panel-Level Fan-Out Technology
Fraunhofer’s panel experts dig into why this approach is needed and where the challenges are to making it work.

Next Steps For Panel-Level Packaging
Where it’s working, and what challenges remain for even broader adoption.

Mini-Consortia Forming Around Chiplets
Commercial chiplet marketplaces are still on the distant horizon, but companies are getting an early start with more limited partnerships.

What Can Go Wrong In Heterogeneous Integration
Workflows and tools are disconnected, mechanical stress is ill-defined, and complete co-planarity is nearly impossible. But there are solutions on the horizon.

Mechanical Challenges Rise With Heterogeneous Integration
But gaps in tools make it difficult to address warpage, structural issues, and new materials in multi-die/multi-chiplet designs.

Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

Powering The Automotive Revolution: Advanced Packaging For Next-Generation Vehicle Computing

Automotive processors are rapidly adopting advanced process nodes. NXP announced the development of 5 nm automotive processors in 2020 [1], Mobileye announced EyeQ Ultra using 5 nm technology during CES 2022 [2], and TSMC announced its “Auto Early” 3 nm processes in 2023 [3]. In the past, the automotive industry was slow to adopt the latest semiconductor technologies due to reliability concerns and lack of a compelling need. Not anymore.

The use of advanced processes necessitates the use of advanced packaging as seen in high performance computing (HPC) and mobile applications because [4][5]:

  1. While transistor density has skyrocketed, I/O density has not increased proportionally and is holding back chip size reductions.
  2. Processors have heterogeneous, specialized blocks to support today’s workloads.
  3. Maximum chip sizes are limited by the slowdown of transistor scaling, photo reticle limits and lower yields.
  4. Cost per transistor improvements have slowed down with advanced nodes.
  5. Off-package dynamic random-access memory (DRAM) throttles memory bandwidth.

These have been drivers for the use of advanced packages like fan-out in mobile and 2.5D/3D in HPC. The same drivers are slowly but surely showing up in compute units across a variety of automotive architectures as well (see figure 1).

Fig. 1: Vehicle E/E architectures. (Image courtesy of Amkor Technology)

Vehicle electrical/electronic (E/E) architectures have evolved from 100+ distributed electronic control units (ECUs) to 10+ domain control units (DCUs) [6]. The most recent architecture introduces zonal or zone ECUs that are clustered in physical locations in cars and connect to powerful central computing units for processing. These newer architectures improve scalability, cost, and reliability of software-defined vehicles (SDVs) [7]. The processors in each of these architectures are more complex than those in the previous generation.

Multiple cameras, radar, lidar, ultrasonic sensors, and more feed data into the compute units. Processing and inferencing this data require specialized functional blocks on the processor. For example, the Tesla Full Self-Driving (FSD) HW 3.0 system on chip (SoC) has central processing units (CPUs), graphics processing units (GPUs), neural network processing units, Low-Power Double Data Rate 4 (LPDDR4) controllers and other functional blocks – all integrated on a single piece of silicon [8]. Similarly, Mobileye EyeQ6 has functional blocks of CPU clusters, accelerator clusters, GPUs and an LPDDR5 interface [9]. As more functional blocks are introduced, the chip size and complexity will continue to increase. Instead of a single, monolithic silicon chip, a chiplet approach with separate functional blocks allows intellectual property (IP) reuse along with optimal process nodes for each functional block [10]. Additionally, large, monolithic pieces of silicon built on advanced processes tend to have yield challenges, which can also be overcome using chiplets.

Current advanced driver-assistance systems (ADAS) applications require a DRAM bandwidth of less than 60GB/s, which can be supported with standard double data rate (DDR) and LPDDR solutions. However, ADAS Level 4 and Level 5 will need up to 1024 GB/s memory bandwidth, which will require the use of solutions such as Graphics DDR (GDDR) or High Bandwidth Memory (HBM) [11][12].
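
The gap can be sized from bus width times per-pin data rate. The configurations below are nominal per-device figures close to published specs for each memory type, but should be read as illustrative approximations:

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth = bus width x per-pin data rate, in GB/s."""
    return bus_bits * gbps_per_pin / 8

# Nominal per-device configurations; treat as illustrative approximations.
print(bandwidth_gb_s(32, 6.4))    # LPDDR5 x32 @ 6.4 Gb/s  -> ~25.6 GB/s
print(bandwidth_gb_s(32, 16.0))   # GDDR6 x32 @ 16 Gb/s    -> 64 GB/s
print(bandwidth_gb_s(1024, 6.4))  # HBM3 stack @ 6.4 Gb/s  -> ~819 GB/s
# Reaching ~1,024 GB/s therefore takes one to two HBM stacks,
# but dozens of LPDDR5 channels.
```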

Fig. 2: Automotive compute package roadmap. (Image courtesy of Amkor Technology)

Automotive processors have been using Flip Chip BGA (FCBGA) packages since 2010. FCBGA has become the mainstay of several automotive SoCs, such as EyeQ from Mobileye, Tesla FSD and NVIDIA Drive. Consumer applications of FCBGA packaging started around 1995 [13], so it took more than 15 years for this package to be adopted by the automotive industry. Computing units in the form of multichip modules (MCMs) or System-in-Package (SiP) have also been in automotive use since the early 2010s for infotainment processors. The use of MCMs is likely to increase in automotive compute to enable components like the SoC, DRAM and power management integrated circuit (PMIC) to communicate with each other without sending signals off-package.

As cars move to a central computing architecture, the SoCs will become more complex and run into size and cost challenges. Splitting these SoCs into chiplets becomes a logical solution, and packaging these chiplets using fan-out or 2.5D packages becomes necessary. Just as FCBGA and MCMs transitioned into automotive from non-automotive applications, so will fan-out and 2.5D packaging for automotive compute processors (see figure 2). The automotive industry is cautious, but the abovementioned architecture changes are pushing faster adoption of advanced packages. Materials, processes, and factory controls are key considerations for successful qualification of these packages in automotive compute applications.

In summary, the automotive industry is adopting advanced semiconductor technologies, such as 5 nm and 3 nm processes, which require the use of advanced packaging due to limitations in I/O density, chip size reductions, and memory bandwidth. Processors in the latest vehicle E/E architectures are more complex and require specialized functional blocks to process data from multiple sensors. As cars move to the central computing architecture, the SoCs will become more complex and run into size and cost challenges. Splitting these SoCs into chiplets becomes a logical solution and packaging these chiplets using fan-out or 2.5D technology becomes necessary.

Sources

  1. NXP. “NXP Selects TSMC 5nm Process for Next-Generation High-Performance Automotive Platform.” NXP, https://www.nxp.com/company/about-nxp/nxp-selects-tsmc-5nm-process-for-next-generation-high-performance-automotive-platform:NW-TSMC-5NM-HIGH-PERFORMANCE.
  2. Mobileye. “Mobileye at CES 2022.” Mobileye, https://www.mobileye.com/news/mobileye-ces-2022-tech-news/.
  3. Business Wire. “TSMC Showcases New Technology Developments at 2023 Technology Symposium.” Business Wire, https://www.businesswire.com/news/home/20230426005359/en/TSMC-Showcases-New-Technology-Developments-at-2023-Technology-Symposium.
  4. Swaminathan, Raja. “Advanced Packaging: Enabling Moore’s Law’s Next Frontier Through Heterogeneous Integration.” HotChips33, https://hc33.hotchips.org/assets/program/tutorials/2021%20Hot%20Chips%20AMD%20Advanced%20Packaging%20Swaminathan%20Final%20%2020210820.pdf
  5. SemiAnalysis. “Advanced Packaging Part 1.” SemiAnalysis, https://www.semianalysis.com/p/advanced-packaging-part-1-pad-limited.
  6. McKinsey & Company. “Getting Ready for Next-Generation EE Architecture with Zonal Compute.” McKinsey & Company, https://www.mckinsey.com/industries/semiconductors/our-insights/getting-ready-for-next-generation-ee-architecture-with-zonal-compute.
  7. NXP. “How Zonal E/E Architectures with Ethernet are Enabling Software-Defined Vehicles.” NXP, https://www.nxp.com/company/blog/how-zonal-e-e-architectures-with-ethernet-are-enabling-software-defined-vehicles:BL-HOW-ZONAL-EE-ARCHITECTURES.
  8. WikiChip. “Tesla (Car Company)/FSD Chip.” WikiChip, https://en.wikichip.org/wiki/tesla_(car_company)/fsd_chip.
  9. Mobileye. “EyeQ Chip.” Mobileye, https://www.mobileye.com/technology/eyeq-chip/.
  10. Ziadeh, Bassam. “Driving Adoption of Advanced IC Packaging in Automotive Applications.” Presentation at IMAPS DPC, March 2023. General Motors, Fountain Hills AZ, March 16, 2023.
  11. Matthias Jung and Norbert Wehn. “Driving Against the Memory Wall: The Role of Memory for Autonomous Driving.” Fraunhofer IESE, Kaiserslautern, Germany, and Microelectronic Systems Design Research Group, University of Kaiserslautern, Kaiserslautern, Germany. Kluedo, https://kluedo.ub.rptu.de/frontdoor/deliver/index/docId/5286/file/_memory.pdf.
  12. Micron. “Cinco de Play: Memory – Is That Critical to Autonomous Driving?” Micron, https://www.micron.com/about/blog/2017/october/cinco-play-memory-is-that-critical-to-autonomous-driving.
  13. McKinsey & Company. “Advanced Chip Packaging: How Manufacturers Can Play to Win.” McKinsey & Company, https://www.mckinsey.com/industries/semiconductors/our-insights/advanced-chip-packaging-how-manufacturers-can-play-to-win.

Advanced Packaging Design For Heterogeneous Integration

By: CP Hung

As device scaling slows down, a key system functional integration technology is emerging: heterogeneous integration (HI). It leverages advanced packaging technology to achieve higher functional density and lower cost per function. With the continuous development of major semiconductor applications such as AI HPC, edge AI, and autonomous electric vehicles, traditional chips are transforming into smaller, well-partitioned chiplets that require chip-to-chip interconnections to be denser, faster, and more reliable. This boosts the demand for heterogeneous integration and, with it, for innovative advanced packaging technologies.

HI uses advanced packaging to integrate chiplets with heterogeneous designs and process nodes into a single package. This allows enterprises to choose optimum process nodes for specific system demands, such as 3nm for computing chiplets and 7nm for radio frequency chiplets, or to quickly produce super chips with specific functions in a cost-effective manner. HI not only aims for higher interconnection density, but also integrates the various functional components needed to complete the whole system, such as logic chips, sensors, and memory, in one package. Overall energy efficiency and performance are greatly improved, while package size can be significantly reduced.

Advanced packaging solutions for AI HPC

The typical high-density advanced package size for AI cloud computing processors is 55mm x 55mm or more, and contains a 5-2-5 (top 5 layers, middle 2 layers, bottom 5 layers) advanced substrate, or even up to 11-2-11 wiring layers. Chiplets can be interconnected by fan-out technology with a silicon bridge, or by 2.5D technology with a Si interposer as the integration platform. Through these techniques, the industry aims to gain more computing power within the same space.

ASE provides high-density packaging solutions, including Flip Chip Ball Grid Array (FCBGA), Fan Out Chip-on-Substrate (FOCoS), FOCoS-Bridge, and 2.5D. The chip-to-chip interconnections in FCBGA are accomplished through the BGA substrate, and its minimum L/S (line width/line spacing) is only about 10μm/10μm. The very popular and in-demand CoWoS (Chip on Wafer on Substrate) is a 2.5D packaging technology that uses an RDL (redistribution layer) on a Si interposer to connect chiplets, and its L/S can be significantly reduced to 0.5μm/0.5μm.
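
The practical impact of L/S on routing density can be approximated as one wire per line-plus-space pitch. A quick comparison using the figures above:

```python
def wires_per_mm(line_um: float, space_um: float) -> float:
    """Routable wires per mm of edge per routing layer at a given L/S."""
    return 1000.0 / (line_um + space_um)

print(wires_per_mm(10, 10))    # BGA substrate:         50 wires/mm/layer
print(wires_per_mm(2, 2))      # FOCoS fan-out RDL:    250 wires/mm/layer
print(wires_per_mm(0.5, 0.5))  # CoWoS Si interposer: 1000 wires/mm/layer
```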

In the Si interposer of a 2.5D package, all the chiplets are connected in a side-by-side arrangement. As the required number of chiplets increases, the interposer area grows, and fewer Si interposers can be made from each 12-inch wafer (generally fewer than 50), which significantly increases the manufacturing cost of 2.5D packaging. However, not all applications require 0.5μm/0.5μm L/S, so ASE developed FOCoS, which uses fan-out technology’s RDL to integrate different chiplets at an L/S of 2μm/2μm. This gives the market lower-cost alternatives. In addition, ASE’s FOCoS-Bridge technology uses a silicon bridge to provide high-density routing for interconnecting different chips (such as logic chips and memory) in areas that require high-speed transmission, and uses fan-out RDL for integration in other areas. As such, it delivers flexibility to mix 0.5μm/0.5μm and 2μm/2μm L/S in one design, while achieving a significant increase in packaging density and bandwidth.
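
The economics follow from how few large interposers fit on a 300mm (12-inch) wafer. A standard gross-die-per-wafer approximation, with an assumed interposer size chosen for illustration:

```python
import math

def gross_die_per_wafer(wafer_mm: float, die_area_mm2: float) -> int:
    """Common gross-die approximation: usable area minus an edge-loss term."""
    d = wafer_mm
    return int(math.pi * (d / 2) ** 2 / die_area_mm2
               - math.pi * d / math.sqrt(2 * die_area_mm2))

# Assumed ~35mm x 35mm interposer for a multi-chiplet 2.5D package.
print(gross_die_per_wafer(300, 35 * 35))  # ~38 interposers per wafer
```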

High performance chip-package-system co-design

To achieve the aforementioned high bandwidth, the chip, package, and entire system must be designed together for holistic optimization, instead of just considering the individual parts. When using electronic design automation (EDA) for design optimization, consideration must be given to overall signal behavior along the entire transmission path, including Cu pillars, RDL fine lines, TSVs, μbumps, etc. Eye diagrams can then be used to analyze the SerDes link’s electrical performance. When designing differential pairs for high-speed signals, it is necessary to reduce return and insertion loss, especially in the operating frequency band. From chip to package to the entire system, Taiwan’s manufacturing advantage lies in the ability to accomplish the turnkey design process from beginning to end.
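
Such co-design often starts with a crude insertion-loss budget across the full path before any field solver or eye-diagram run. A minimal sketch, where every per-segment loss figure and the budget itself are assumed placeholders for extracted or measured data:

```python
# Assumed insertion loss per segment at the Nyquist frequency (dB);
# placeholders for solver-extracted or measured values.
SEGMENT_LOSS_DB = {
    "die Cu pillar + ubump": 0.2,
    "package RDL fine line": 1.1,
    "TSV + interposer pad":  0.4,
    "substrate + BGA ball":  1.8,
}

BUDGET_DB = 5.0  # assumed total budget for a short-reach SerDes channel

total = sum(SEGMENT_LOSS_DB.values())
print(f"total {total:.1f} dB of {BUDGET_DB} dB budget "
      f"-> margin {BUDGET_DB - total:.1f} dB")
```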

Providing more computing power with less energy

The industry is currently focused on optimizing energy efficiency. One of the key questions being asked is whether the power regulation and decoupling components, which were previously located on the system board, can be moved closer to the package or processor chip. There is even talk of redesigning the on-chip power delivery network (PDN), including supplying power directly from the backside of the chip (Backside PDN).

Power integrity design for power delivery network (PDN)

Optimizing power integrity and minimizing noise can be achieved by strategically positioning the capacitor. Ideally, the capacitor should be placed as close to the chip as possible, but this is dependent on the capacitor’s size and the manufacturing process, both of which can impact cost and performance. Traditional surface-mount technology (SMT) capacitors are relatively large, but chip-level silicon capacitors (Si-Cap) are now available that offer decent capacitance values.
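
Capacitor placement is typically driven by a target impedance: the PDN must present an impedance below the allowed ripple voltage divided by the worst-case transient current. A minimal sketch of that calculation, with all electrical values assumed for illustration:

```python
def target_impedance(vdd: float, ripple_pct: float, i_transient_a: float) -> float:
    """Z_target = allowed ripple voltage / worst-case transient current."""
    return vdd * (ripple_pct / 100.0) / i_transient_a

# Assumed rail: 0.75 V core supply, 5% ripple allowance, 400 A load step.
z = target_impedance(0.75, 5.0, 400.0)
print(f"Z_target ~ {z * 1e6:.0f} micro-ohms")  # ~94 uOhm: why capacitors
# must sit as close to the die as possible (hence chip-level Si-Caps).
```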

UCIe (Universal Chiplet Interconnect Express) Consortium

Traditionally, there are many standard communication protocols (such as Block-to-Block, Memory Bus, or Interconnection Interface Protocols) at the chip level and the board level for system designers. Industry protocols that specify package-level integration are growing, especially given the need for a universal interface for chiplet integration using 2.5D and FOCoS packaging technologies.

In March 2022, Intel invited upstream and downstream manufacturers in the semiconductor industry chain to form the UCIe Consortium, and a standardized data transmission architecture for chiplet integration was introduced to reduce the cost of advanced packaging design. ASE is proud to be a founding member (Promoter member).

ASE offers a diverse range of advanced packaging types. We have developed packaging design specifications that can be integrated with foundry solutions specifications as well as the system requirements of original equipment manufacturers (OEMs) and cloud service providers to create a comprehensive UCIe package standard. The standard can assist in realizing ubiquitous chiplet heterogeneous integration for HPC applications using various advanced packaging technology architectures, such as 2.5D, 3D, FOCoS, Fan-out, EMIB, CoWoS, etc. Headquartered in Taiwan, ASE is enthusiastically participating in the formulation of international standards and relentlessly providing integrated solutions to the global industry.

Heterogeneous integration has been in development for many years. It can be used to integrate not only homogeneous and heterogeneous chiplets, but also other passive and active components, including connectors, into a single package. Achieving this requires not only advanced packaging technologies but also design and testing coordination. ASE offers a comprehensive one-stop service solution that includes system design, packaging, and testing to help customers shorten chip design cycles and accelerate product innovation.

Electromigration Concerns Grow In Advanced Packages

The incessant demand for more speed in chips requires forcing more energy through ever-smaller devices, increasing current density and threatening long-term chip reliability. While this problem is well understood, it’s becoming more difficult to contain in leading-edge designs.

Of particular concern is electromigration, which is becoming more troublesome in advanced packages with multiple chiplets, where various bonding and interconnect schemes create abrupt changes in materials and geometries. For example, electrons may travel from a copper trace to a solder bump of SAC (tin-silver-copper), then to an underbump metal based on nickel, and finally to an interposer copper pad. That, in turn, can cause atoms to shift, resulting in failures in solder joints or in copper redistribution layers in high-density fan-out packages.

“From an electromigration perspective, advanced packaging causes increased packaging density, reduced packaging size, and the dimensions of interconnects to shrink, so the current density is now in close proximity to the maximum current density limit per EM design rules,” said Dermott Lynch, director of technical product management in Synopsys‘ EDA Group.

Any additional stresses the package may be subjected to during assembly and use, whether mechanical or thermal, also can help induce or accelerate electromigration. “Electromigration, in general, gets worse due to temperature and stress, both of which advanced packaging increases,” said Lynch. “Electromigration is also cumulative, so essentially it integrates all the temperature highs and stress over the lifetime until an interconnect breaks down or shorts. Larger processing temperature and operation temperature will make it worse, but it also depends on time under that temperature.”

In fact, managing thermal pathways is perhaps the greatest challenge associated with the movement toward the ultimate package, a 3D-IC. “Electromigration is very temperature-sensitive,” said Marc Swinnen, director of product marketing in Ansys’ Semiconductor Division. “Depending on your thermal map, your power integrity will have to adapt to the local temperature profile that you have. So when you look at a chip, you can calculate how much power the chip is putting out, but you cannot tell how hot the chip will get because ‘it depends.’ Is it sitting on a cold plate or sitting in the sun in the Sahara? System concerns come in, and multi-physics modeling is important to understanding these co-dependent effects.”

Thermal engineering also means moving heat away from the most vulnerable points of failure, such as solder bumps. “Effective thermal management is essential for bump reliability,” said Curtis Zwenger, vice president of engineering and technical marketing at Amkor. “Engineers are incorporating thermal enhancement techniques, such as the use of thermal interface materials and advanced heat dissipation solutions, to ensure that bumps are not subjected to excessive temperature-related stresses.”

Zwenger noted that engineers are looking into new materials, while optimizing the use of existing materials to minimize the possibility of electromigration. “Semiconductor packaging engineers are implementing a range of measures to enhance bump reliability and maximize bump yield. These strategies include new materials for solder bumps and underbump metallization, optimizing bump size, pitch and shape for reliability, advanced process control methods to control variability and maximize yield, and simulating and modeling reliability.”

What is electromigration?
Electromigration is the mass transport of metal atoms caused by the electron wind from current flowing through a conductor, typically copper. When current density is high enough, metal will diffuse in the direction of current flow, creating tiny hillocks downstream and leaving behind vacancies or voids. With enough electromigration, failures occur due to severe line thinning, causing opens, or due to hillocks that bridge adjacent lines, causing short circuits.

Electromigration is a diffusion-controlled mechanism that can take three forms — bulk, grain boundary, or surface diffusion, depending on the metal. Aluminum migrates by grain boundary diffusion whereas copper migrates on the surface or at its grain boundaries.

For most of the semiconductor industry’s history, electromigration was primarily an on-chip concern, and on-chip EM is now largely kept under control by reliability engineers. But with the scaling and rapid developments in advanced packaging — implementing TSVs, fan-out packaging with redistribution layers, and copper pillar bumps — electromigration has emerged as a major threat at the package level. Current flowing through a solder bump causes Joule heating, and heat from other parts of the package also may dissipate through the solder bumps. EM can become an issue for solder joint connections between chip and interposer, or chip and PCB, as well as in RDLs. Solder joint failures typically manifest as voids or cracks.

Fig. 1: Electromigration can create short circuits between two interconnects through the development of hillocks, or an open circuit through the creation of voids in interconnect. Source: Ansys

Electromigration progresses more quickly at higher temperatures, at higher currents, under greater mechanical stress and in the presence of defects or impurities in the metal. Black’s equation describes an interconnect’s mean time-to-failure with respect to its temperature, current density and the activation energy needed to dislodge a metal atom as:

$$\mathrm{MTTF} = \frac{A}{J^{N}}\,\exp\!\left(\frac{E_a}{kT}\right)$$

Here A is a proportionality constant that depends on the conductor’s geometry and material, J is the current density, k is Boltzmann’s constant, T is temperature, Ea is the activation energy, and N is the current density exponent, which depends on the metal’s properties. Black’s equation is useful because it easily shows how shorter, wider interconnects will tend to have longer MTTF. In addition, electromigration time-to-failure very strongly depends on the interconnect’s temperature. That temperature is primarily the result of the chip’s environmental temperature, self-heating of the conductor caused by current flow, the heat from neighboring interconnects or transistors, and the thermal conductivity of the surrounding material.
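
A small numeric example illustrates how steep that temperature dependence is. The Ea value here is an arbitrary illustrative input, and A and N are left at placeholder values, since the lifetime ratio at a fixed current density depends only on the Arrhenius term:

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def black_mttf(j_a_cm2, temp_c, ea_ev, n=2.0, a=1.0):
    """Black's equation: MTTF = (A / J^N) * exp(Ea / kT). Units follow A."""
    t_k = temp_c + 273.15
    return (a / j_a_cm2 ** n) * math.exp(ea_ev / (K_BOLTZMANN_EV * t_k))

# Relative MTTF at two temperatures, same current density (illustrative Ea).
ratio = black_mttf(1e5, 105, ea_ev=0.9) / black_mttf(1e5, 125, ea_ev=0.9)
print(f"Cooling 125°C -> 105°C extends lifetime ~{ratio:.1f}x")  # ~4x
```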

It is also important to note that electromigration is a runaway process. Void growth thins the conductor, which raises the local current density and temperature; higher current density and temperature in turn accelerate electromigration, causing still more metal to migrate in a destructive feedback loop.

EM failure modes and allowable current density
In the case of copper redistribution layers in polyimide material, as current flows through the RDL, heat accumulates in the conductor due to Joule heating, which can degrade performance. As the required current density and Joule heating temperature increase in fine-line Cu RDL structures (<5µm lines and spaces), self-heating is considered a key factor in the reliability of high-density fan-out packages.

JiHye Kwon, senior manager of R&D at Amkor, recently used EM testing and Black’s equation to determine the electromigration failure mechanisms for a given RDL stack in a high-density fan-out package, with 2µm- and 10µm-wide RDL lines, 1,000µm long. [1]

High-density fan-out is an emerging technology that features more aggressive scaling than wafer-level fan-out packages. The three layers of copper RDL (3µm thick, with a Ta/Cu seed) were fabricated, followed by polyimide fill, copper pillar deposition, die attach, and overmold. Kwon’s team tested both 2µm and 10µm RDL at different current densities and temperatures until resistance increased by 100% (EM failure), although the maximum allowed current density corresponded with a 20% resistance increase. The failure modes occurred in two stages, first by void nucleation and growth, and second by copper reduction and oxidation. The study yielded Ea and current density exponent values that can be useful in future designs of RDLs.

Meanwhile, a team of researchers from ASE recently demonstrated how susceptibility to electromigration is determined for copper pillar interconnects in flip-chip quad flat no-lead (FCQFN) packages for high-power automotive applications. The multi-layered copper pillar bumps, with a Cu/Ni/Sn1.8Ag configuration, were bonded to a silver-plated copper leadframe and tested under extreme EM conditions of 10 kA/cm² current density and temperatures of 150°C, 160°C, and 180°C, while taking in-situ resistance measurements. [2] The EM failures appeared as rapid rises in electrical resistance that coincided with the formation of intermetallic compounds and voids at the Cu/solder interfaces. The team built an EM prediction model of the interconnects based on a Black-type EM equation, following the JEDEC standard with five test conditions.

After statistical analysis of the sample lifetimes, the ASE team determined the activation energy of the Cu pillar interconnects in the FCQFN package to be 1.12 ± 0.03 eV. The maximum allowable current for the Cu pillar interconnects to last 10 years at a 105°C operating temperature with a 0.1% failure rate was greater than 2A. “The FCQFN package has great potential in terms of its excellent anti-EM performance for future high-power applications,” the article said.
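
Using the reported 1.12 eV activation energy, the thermal acceleration factor from a stress condition down to the 105°C use condition follows from the Arrhenius term of the same Black-type model. A sketch of that standard calculation (current-density acceleration is ignored here):

```python
import math

K_EV = 8.617e-5  # Boltzmann constant, eV/K

def thermal_af(ea_ev: float, t_use_c: float, t_test_c: float) -> float:
    """Arrhenius acceleration factor between use and stress temperatures."""
    t_use, t_test = t_use_c + 273.15, t_test_c + 273.15
    return math.exp(ea_ev / K_EV * (1 / t_use - 1 / t_test))

# Ea = 1.12 eV from the FCQFN study; 180°C stress vs. 105°C operation.
print(f"AF ~ {thermal_af(1.12, 105.0, 180.0):.0f}x")  # roughly 300x
```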

Designing/manufacturing for EM resiliency
Building electromigration resilience into advanced devices begins with using only EM-compliant linewidths in circuit designs based on the current density and heat profile that the interconnects will experience during operation over the lifetime of the device. Electromigration mitigation also requires process and materials engineering to ensure durability, for instance, of copper pillar bumps under BGA packages. It also calls for an optimized assembly process window and tight process control to prevent tiny violations of design rules that can later precipitate as EM failures.

As the industry makes its way toward true 3D packages, and eventually 3D-ICs, it seems clear that modeling and simulation will play an increasing role in determining many of the guard rails for manufacturing and assembly before manufacturing and assembly even begins. “Reliability modeling and simulation tools are being used to better understand the reliability of bump structures. This proactive approach helps in identifying potential issues before they arise, enabling engineers to implement preventive measures,” said Zwenger.

Modeling and simulation at the system level also will be essential to understanding the complex interplay between reliability mechanisms and the thermal and mechanical stress in multi-chiplet systems during operation.

“Electromigration for stacked die is challenging,” said Synopsys’ Lynch. “Localized, die-to-die workloads cause repetitive current flow in specific areas. This generates local heat, increasing EM resulting in wire degradation, while producing even more heat. Reducing the thermal issue becomes critical to ensuring EM reliability.”

As stated previously, solder bumps can become a site for EM reliability failure. “Engineers fine-tune bump design in terms of bump size, pitch, and shape to ensure uniformity and reliability across the entire package. This includes the adoption of innovative Cu bump structures for improved mechanical and electrical properties,” said Amkor’s Zwenger.

In flip-chip BGA and other flip-chip applications, underfill materials — typically thermoset epoxies — are used to reduce the thermal stresses on solder bumps. “Underfill materials play a critical role in providing mechanical support and thermal stability to the bumps,” Zwenger said. “Engineers are investing in the development of advanced underfill formulations with enhanced properties, such as improved adhesion, thermal conductivity, and stress relief.”

Conclusion
Because of its dependence on temperature, electromigration is a failure mechanism to watch and plan for as devices continue to scale and systems integrators continue to cram more and more chiplets of various functions into advanced packages.

“In advanced technologies, the current density is now in close proximity to the maximum density,” said Synopsys’ Lynch. “Anything that causes an increase in temperature poses a threat. Designers of multi-die systems need to understand the impact of temperature and design systems to remove the heat.”

References

  1. JiHye Kwon, “Electromigration Performance Of Fine-Line Cu Redistribution Layer (RDL) For HDFO Packaging,” Semiconductor Engineering, Jan. 18, 2024, https://semiengineering.com/electromigration-performance-of-fine-line-cu-redistribution-layer-rdl-for-hdfo-packaging/
  2. -Y. Tsai, et al., “An Electromigration Study of Cu Pillar Interconnects in Flip-chip QFN Packaging under Extreme Conditions for High-power Applications,” 2023 IEEE 25th Electronics Packaging Technology Conference (EPTC), Singapore, 2023, pp. 326-332, doi: 10.1109/EPTC59621.2023.10457564.

Related Reading
What Can Go Wrong In Heterogeneous Integration
Workflows and tools are disconnected, mechanical stress is ill-defined, and complete co-planarity is nearly impossible. But there are solutions on the horizon.
Thermal Integrity Challenges Grow In 2.5D
Work is underway to map heat flows in interposer-based designs, but there’s much more to be done.
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

What Works Best For Chiplets

The semiconductor industry is preparing for the migration from proprietary chiplet-based systems to a more open chiplet ecosystem, in which chiplets fabricated by different companies of various technologies and device nodes can be integrated in a single package with acceptable yield.

To make this work as expected, the chip industry will have to solve a variety of well-documented technical and business issues, and it will have to rein in some of the grander visions of what’s possible — at least initially. The basic challenge is aligning domain-specific performance demands of end systems, which contain a growing number of chiplets, with the assembly and packaging capabilities and methodologies of IDMs, foundries, and OSATs. This includes the creation of assembly development kits (ADKs) that are roughly the equivalent of process development kits (PDKs), which today are codified with manufacturing specifications.

A PDK provides the appropriate level of detail needed to develop planar chips, marrying design tools with fab processes to achieve a predictable outcome. But making this work for an ADK with heterogeneous chiplets is many times more complex. Design and assembly teams need to manage thermal, mechanical, and electrical co-dependencies that cause electrical and mechanical stress, resulting in warpage, reduced yield, and reliability issues under real-world workloads. Layered on top of this the business and legal issues related to packaging of different devices from different manufacturers.

“Chiplets are a growing trend, especially in the HPC and networking segments, with potential to scale to other applications,” said Gabriela Pereira, technology and market analyst for semiconductor packaging at Yole Intelligence. “The industry has understood that high-end advanced packaging technologies are needed to connect them — but that’s much more complex than it seems. Connecting chiplets requires the design of high-bandwidth interconnections at the package level, which can take different forms — e.g., 2D, 2.5D or 3D — while ensuring that the thermal and power requirements are fulfilled.”

Commercial chiplet-based devices generally are domain-specific, and sometimes developed for a specific workload. So despite a big industry push to create a LEGO-like mix-and-match ecosystem for chiplets — which today includes multiple IP and EDA vendors, foundries, memory suppliers, OSATs, substrate suppliers, etc. — making this work as planned will require time and a massive amount of work.

Fig. 1: System assembly requires tighter coupling between chipmakers and OSATs. Source: ASE

Fig. 1: System assembly requires tighter coupling between chipmakers and OSATs. Source: ASE

In creating heterogeneous integrated designs, it’s essential to have much tighter collaboration between foundries, IDMs, OSATs, and PCB manufacturers. And because each chiplet-based system will be customized, the number of assembly processes will grow substantially. For example, one OSAT noted that among its ~5,000 customers, there are ~1,000 different assembly processes.

That diversity in products and processes makes it difficult to achieve predictable results by choosing chiplets from a large menu of options.

“We’ve already encountered a lot of limitations including not only the silicon, but also integration and the ecosystem,” said Lihong Cao, senior director at ASE Group, at MEPTEC’s Road to Chiplets forum. She stressed that customers continue to push for a low-cost chiplet assembly process, which is creating constructive tension between developing a sophisticated assembly process and the economic realities of different industry sectors. Computing devices for automotive have a higher cost sensitivity than for data centers, for example, but their chips operate in a harsher environment over a longer lifetime.

What’s needed is a defined set of assembly process recipes — basically, a highly limited menu of choices — that are specific to the end application (HPC, automotive, RF telecommunications) in order to lower the cost of chiplet-based systems. OSATs and foundries already are moving in that direction for high-performance computing. For example, at its 2024 Direct Connect event, Intel shared its six different package processes for chiplets. TSMC and Samsung also offer defined sets of chiplet processes. But the success of these assembly processes requires engineering teams to co-optimize the flows, processes, and materials to best match the system requirements.

Fig. 2: Integrated platform development requires tightly coupled architectural analysis that co-optimizes the system design to architecture to assembly process and packaging material selections. Source: Applied Materials

Fig. 2: Integrated platform development requires tightly coupled architectural analysis that co-optimizes the system design to architecture to assembly process and packaging material selections. Source: Applied Materials

“Previously, when we designed a system we only had to be worried about the system requirements. Once we start segregating into dies and reassembling them, we have to start looking at other things. We have to worry about putting them together while considering signal integrity between dies, reliability, thermals, etc.,” said Itai Leshniak, director of AI systems solutions at Applied Materials, at the MEPTEC forum. “If we take the case of AI-based computer vision, we can break it down layer by layer — on the hardware side, determining which computer vision processors, sensors, filters are needed to break it down into the architecture at layer. Then we begin to go through how to package all these chiplets, and then which materials to use and how to take advantage of those materials.”

Materials and assembly processes
Conceptually, design engineers will use chiplets to design a system. However, the co-design and integration is far more complicated than assembling a set of LEGO blocks, because the chiplets, interposers, and package substrates come from different design houses and manufacturing facilities. The advanced packaging technologies used to connect chiplets vary with an alphabet soup of names — FOWLP, FOPLP, CoWoS, etc., each of which poses additional design and material choices along with certain process limitations.

Fig. 3: There are a multitude of choices in multi-die packaging from the high-level layout to substrates, materials, bonding methods, and cooling materials. Source: Synopsys

Fig. 3: There are a multitude of choices in multi-die packaging from the high-level layout to substrates, materials, bonding methods, and cooling materials. Source: Synopsys

Currently engineering teams determine the tradeoffs among the different packaging options to select materials, derive a process recipe, and determine design rules.

Materials are a good starting point. “Materials are very important because they enable new products and packaging technologies,” Tanja Braun, deputy group manager at the Fraunhofer Institute for Reliability and Microintegration IZM. “As you move into more advanced packaging, process is getting much more complex because you are putting more things together. In the end, it’s a combination of equipment, materials, and process development.”

There are three thermal parameters that are critical in package assembly processes — coefficients of thermal expansion (CTE), glass transition temperature (Tg), and thermal conductivity. These factors affect how a material behaves in manufacturing to packaging processes, as well as how it behaves in the field.

“Challenges for our materials include temperature limitations of different die,” said Rama Puligadda, CTO at Brewer Science. “We have to ensure that the temperatures used for bonding materials don’t exceed the thermal limitations of any of the chips that are being integrated into the package. Additionally, there may be some subsequent processes like redistribution layer (RDL) formation or molding. Our materials have to survive those processes. They have to survive the chemicals they come in contact with throughout the packaging process scheme. Mechanical stresses in the package add additional challenges for bonding materials.”

Within a stack of chiplets-on-substrate with an optional interposer, their material attributes affect the thermal-mechanical stresses between neighboring materials, as well. This directly impacts interconnect dimensional control over a large area substrate area.

“If you go work the numbers, you will find that the level of tolerance and control required is frightening,” said Dick Otte, CEO of Promex Industries. “You’re talking about controlling dimensions equivalent to the width of a grass blade over the length of a football field, so that’s roughly 1 in 100,000.”

The goal is uniform heating of the structure in reflow in order to attain the best process results and to avoid cracking. “When you’re taking it through a 250 degrees centigrade temperature change, then you need to heat up slowly so that the top doesn’t get hot before the bottom does,” said Otte.

Multi-physics to comprehend co-optimization
Multi-physics modeling has become the go-to method for co-optimizing packaging design and assembly process development. That affects both permanent and temporary materials, as well the placement of processors, memories, and other components.

“You always looking to what the customer needs electrically, because that’s going to help define the material set. The material set is broadly applicable to a bunch of speed ranges. As long as you don’t step outside of those electrical specifications, theoretically you should be okay,” said Mike Kelly, vice president of advanced package and technology integration at Amkor Technology.

To save many iterations of empirically based development, engineers can use physics-based simulations to understand the impact of a material set’s properties impact on the assembly process, power/thermals, and mechanical vibrations.

Consider that HPC chiplet products can consume ~1,000 watts at peak performance so the power and thermal interactions need to be fully understood.

We’ve struggled, as everybody has, with this blizzard of complexity in the different techniques. Not only do they vary across different vendors, but they’re also varying over time,” said Marc Swinnen, director of product marketing at Ansys. “Our approach has been to identify the essentials that need to be worked on. We work jointly with customers to develop a simulation flow that actually achieves what is needed now.”

Materials are just one piece of the puzzle. “Then there’s the assembly stresses that need to be modeled to know whether you can correctly assemble this device. The third one is mechanical vibration,” Swinnen said. “Can your device withstand those regular vibrations? Modeling these attributes ties directly into our mechanical analysis tools — acoustic, thermal, vibration, etc. In the end, you’re going to have to do physics simulation. We’re trying to make it accessible to people in many different forms. But the bedrock of our tool offerings is that we have the meshing simulation and analysis. It’s a question of getting the data in the right format in a way that’s practical and usable.”

Evolving assembly design kits
For conventional packages, OSATs provide design rules for each packaging technology. These need to consider electrical, mechanical and thermal design requirements and manufacturing process limitations. In effect this is a multi-dimensional bounding box. Suppliers perform iterations with the customer to create a product specific process recipe.

Rules cover the macro-level attributes. “At a minimum, what you see from design rules is maximum package size, maximum silicon size, and whether silicon can be [mounted] on both sides of the substrate, such that when you follow these constructions the final product will have a lifetime of 1,000 thermal cycles, for example,” said Fraunhofer’s Braun.

In addition, design rules need to describe routing constraints for the interposer and/or redistribution layer, such as RDL line widths and spaces, ball-grid/pillar/pad size and pitches, and the maximum number of interconnections.

Breaking up a monolithic HPC device into multiple dies shifts some of the semiconductor design/process complexity into the packaging space. That makes things much more complicated. Consider that to connect 10 dies requires on order of 100,000 traces within the interposer’s or substrate’s redistribution layer.

To cope with the complexity at the chip level, the IC industry has long relied upon process design kits (PDKs) to capture design rules in an electronic file that can be imported into EDA tools. Their counterparts, assembly design kits (ADKs), are relatively immature.

“We call it Smart Package,” said Amkor’s Kelly. “It’s an ADK that we give to every customer who’s doing their own design. It is a set of macros, and a customization of a database tailored to a customer’s particular design. For chiplets, it is a high-density fan-out package technology. And it’s cognizant of the limitations for metal density and metal spacing, etc. This makes it easier for us to do design rule checks (DRCs).”

But right now, with the level of customization still required, how an ADK is derived and what it entails is in flux. Partnerships between EDA tool vendors, OSATs, and semiconductor device providers are required.

“We come from the IC world where everything is very rigid,” said Kenneth Larsen, director of 3D-IC product management in Synopsys‘ EDA Group. “On the OSAT side, and maybe this is because it’s so custom, design rules seem like a data sheet. Then you build and optimize the products over time or in collaboration with the OSAT. It’s not an electronic exchange. In the IC world, this would be totally unheard of. While it is possible to tweak a few things, you have a qualification process. And it seems like that’s not there yet for packaging.”

Materials and associated assembly recipes ultimately drive what’s possible for a chiplet-substrate stack in terms of pillar pitch, RDL line widths and spaces, bonding processes, and chiplet placement tolerances. But within a handful of ADKs, there are many possible interactions to consider.

The current focus is on co-optimizing the system design with the chiplet assembly process, leading to an assembly process development flow (see figure 4). This flow considers the needs of customization of an assembly process, and it creates the necessary design rules to be used by package designers.

Fig. 4: Chip-package hybrid flow. Source: ASE

Fig. 4: Chip-package hybrid flow. Source: ASE

“First you need to define your structure using chiplets. Are you using substrate RDL, 2.5D RDL, or a bridge? After that you need to consider your structure’s materials. What kind of material do you choose to fulfill your electrical performance and the mechanical stress requirements,” said Cao. “After that, you do pre-analysis to ensure all the structures and materials you use are workable in terms of electrical, warpage and mechanical stress.”

The design planning flow also includes the evaluation of die-to-die interconnects through the documents for co-design sign-off.

Conclusion
Before chiplet-based designs can be enabled outside the IDM model, the industry needs to complete the ecosystem that bridges the manufacturing and design complexity. This is because the need to co-optimize the system architecture based on materials, process, and integration capabilities is essential. While this would be easier with a set of well-defined products for the chiplet ecosystem to drive forward on, that has not happened yet.

Engineering teams across the design and manufacturing stack will need to collaborate to choose the appropriate materials, architectures, processes, etc., to develop a final chiplet-based product that is designable. As ASE group’s Cao noted, “An integrated design and manufacturing ecosystem is important. It is very critical to have collaboration among IDM, vendors, materials suppliers. Everyone needs to work together to really enable integration for the real applications.”

Related stories
Fan-Out Packaging Gets Competitive
Manufacturability reaches sufficient level to compete with flip-chip BGA and 2.5D.

Inside Panel-Level Fan-Out Technology
Fraunhofer’s panel experts dig into why this approach is needed and where the challenges are to making it work.

Next Steps For Panel-Level Packaging
Where it’s working, and what challenges remain for even broader adoption.

Mini-Consortia Forming Around Chiplets
Commercial chiplet marketplaces are still on the distant horizon, but companies are getting an early start with more limited partnerships.

What Can Go Wrong In Heterogeneous Integration
Workflows and tools are disconnected, mechanical stress is ill-defined, and complete co-planarity is nearly impossible. But there are solutions on the horizon.

Mechanical Challenges Rise With Heterogeneous Integration
But gaps in tools make it difficult to address warpage, structural issues, and new materials in multi-die/multi-chiplet designs.

The post What Works Best For Chiplets appeared first on Semiconductor Engineering.

Powering The Automotive Revolution: Advanced Packaging For Next-Generation Vehicle Computing

Automotive processors are rapidly adopting advanced process nodes. NXP announced the development of 5 nm automotive processors in 2020 [1], Mobileye announced EyeQ Ultra using 5 nm technology during CES 2022 [2], and TSMC announced its “Auto Early” 3 nm processes in 2023 [3]. In the past, the automotive industry was slow to adopt the latest semiconductor technologies due to reliability concerns and lack of a compelling need. Not anymore.

The use of advanced processes necessitates the use of advanced packaging as seen in high performance computing (HPC) and mobile applications because [4][5]:

  1. While transistor density has skyrocketed, I/O density has not increased proportionally and is holding back chip size reductions.
  2. Processors have heterogeneous, specialized blocks to support today’s workloads.
  3. Maximum chip sizes are limited by the slowdown of transistor scaling, photo reticle limits and lower yields.
  4. Cost per transistor improvements have slowed down with advanced nodes.
  5. Off-package dynamic random-access memory (DRAM) throttles memory bandwidth.

These have been drivers for the use of advanced packages like fan-out in mobile and 2.5D/3D in HPC. These same drivers are slowly but surely showing up in the compute units of a variety of automotive architectures as well (see figure 1).

Fig. 1: Vehicle E/E architectures. (Image courtesy of Amkor Technology)

Vehicle electrical/electronic (E/E) architectures have evolved from 100+ distributed electronic control units (ECUs) to 10+ domain control units (DCUs) [6]. The most recent architecture introduces zonal or zone ECUs that are clustered in physical locations in cars and connect to powerful central computing units for processing. These newer architectures improve scalability, cost, and reliability of software-defined vehicles (SDVs) [7]. The processors in each of these architectures are more complex than those in the previous generation.

Multiple cameras, radar, lidar and ultrasonic sensors and more feed data into the compute units. Processing and inferencing this data require specialized functional blocks on the processor. For example, the Tesla Full Self-Driving (FSD) HW 3.0 system on chip (SoC) has central processing units (CPUs), graphic processing units (GPUs), neural network processing units, Low-Power Double Data Rate 4 (LPDDR4) controllers and other functional blocks – all integrated on a single piece of silicon [8]. Similarly, Mobileye EyeQ6 has functional blocks of CPU clusters, accelerator clusters, GPUs and an LPDDR5 interface [9]. As more functional blocks are introduced, the chip size and complexity will continue to increase. Instead of a single, monolithic silicon chip, a chiplet approach with separate functional blocks allows intellectual property (IP) reuse along with optimal process nodes for each functional block [10]. Additionally, large, monolithic pieces of silicon built on advanced processes tend to have yield challenges, which can also be overcome using chiplets.

Current advanced driver-assistance systems (ADAS) applications require a DRAM bandwidth of less than 60GB/s, which can be supported with standard double data rate (DDR) and LPDDR solutions. However, ADAS Level 4 and Level 5 will need up to 1024 GB/s memory bandwidth, which will require the use of solutions such as Graphic DDR (GDDR) or High Bandwidth Memory (HBM) [11][12].
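
For a rough sense of why standard DRAM runs out of headroom here, peak bandwidth is just total data pins times per-pin data rate. The sketch below compares typical published configurations; the pin counts and rates are general industry figures assumed for illustration, not values from this article.

```python
# Back-of-the-envelope peak-bandwidth comparison for automotive memory options.
# Pin counts and per-pin data rates are typical published values (assumptions).

def peak_bandwidth_gbs(pins: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: total pins * per-pin rate (Gb/s) / 8 bits per byte."""
    return pins * gbps_per_pin / 8

configs = {
    "LPDDR5 (2 x 32-bit @ 6.4 Gb/s)":        (64, 6.4),   # ~51 GB/s, fits today's <60 GB/s ADAS
    "GDDR6 (8 devices, 256-bit @ 16 Gb/s)":  (256, 16.0), # ~512 GB/s
    "HBM2E (2 stacks, 2048-bit @ 3.2 Gb/s)": (2048, 3.2), # ~819 GB/s
}

for name, (pins, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(pins, rate):.0f} GB/s")
# Only GDDR/HBM-class solutions approach the ~1024 GB/s cited for L4/L5 ADAS.
```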

Fig. 2: Automotive compute package roadmap. (Image courtesy of Amkor Technology)

Automotive processors have been using Flip Chip BGA (FCBGA) packages since 2010. FCBGA has become the mainstay of several automotive SoCs, such as EyeQ from Mobileye, Tesla FSD and NVIDIA Drive. Consumer applications of FCBGA packaging started around 1995 [13], so it took more than 15 years for this package to be adopted by the automotive industry. Computing units in the form of multichip modules (MCMs) or System-in-Package (SiP) have also been in automotive use since the early 2010s for infotainment processors. The use of MCMs is likely to increase in automotive compute to enable components like the SoC, DRAM and power management integrated circuit (PMIC) to communicate with each other without sending signals off-package.

As cars move to a central computing architecture, the SoCs will become more complex and run into size and cost challenges. Splitting these SoCs into chiplets becomes a logical solution and packaging these chiplets using fan-out or 2.5D packages becomes necessary. Just as FCBGA and MCMs transitioned into automotive from non-automotive applications, so will fan-out and 2.5D packaging for automotive compute processors (see figure 2). The automotive industry is cautious but the abovementioned architecture changes are pushing faster adoption of advanced packages. Materials, processes, and factory controls are key considerations for successful qualification of these packages in automotive compute applications.

In summary, the automotive industry is adopting advanced semiconductor technologies, such as 5 nm and 3 nm processes, which require the use of advanced packaging due to limitations in I/O density, chip size reductions, and memory bandwidth. Processors in the latest vehicle E/E architectures are more complex and require specialized functional blocks to process data from multiple sensors. As cars move to the central computing architecture, the SoCs will become more complex and run into size and cost challenges. Splitting these SoCs into chiplets becomes a logical solution and packaging these chiplets using fan-out or 2.5D technology becomes necessary.

Sources

  1. NXP. “NXP Selects TSMC 5nm Process for Next-Generation High-Performance Automotive Platform.” NXP, https://www.nxp.com/company/about-nxp/nxp-selects-tsmc-5nm-process-for-next-generation-high-performance-automotive-platform:NW-TSMC-5NM-HIGH-PERFORMANCE.
  2. Mobileye. “Mobileye at CES 2022.” Mobileye, https://www.mobileye.com/news/mobileye-ces-2022-tech-news/.
  3. Business Wire. “TSMC Showcases New Technology Developments at 2023 Technology Symposium.” Business Wire, https://www.businesswire.com/news/home/20230426005359/en/TSMC-Showcases-New-Technology-Developments-at-2023-Technology-Symposium.
  4. Swaminathan, Raja. “Advanced Packaging: Enabling Moore’s Law’s Next Frontier Through Heterogeneous Integration.” HotChips33, https://hc33.hotchips.org/assets/program/tutorials/2021%20Hot%20Chips%20AMD%20Advanced%20Packaging%20Swaminathan%20Final%20%2020210820.pdf
  5. SemiAnalysis. “Advanced Packaging Part 1” SemiAnalysis, https://www.semianalysis.com/p/advanced-packaging-part-1-pad-limited?utm_source=%2Fsearch%2Fadvanced%2520packaging&utm_medium=reader2.
  6. McKinsey & Company. “Getting Ready for Next-Generation EE Architecture with Zonal Compute.” McKinsey & Company, https://www.mckinsey.com/industries/semiconductors/our-insights/getting-ready-for-next-generation-ee-architecture-with-zonal-compute.
  7. NXP. “How Zonal E/E Architectures with Ethernet are Enabling Software-Defined Vehicles.” NXP, https://www.nxp.com/company/blog/how-zonal-e-e-architectures-with-ethernet-are-enabling-software-defined-vehicles:BL-HOW-ZONAL-EE-ARCHITECTURES.
  8. WikiChip. “Tesla (Car Company)/FSD Chip.” WikiChip, https://en.wikichip.org/wiki/tesla_(car_company)/fsd_chip.
  9. Mobileye. “EyeQ Chip.” Mobileye, https://www.mobileye.com/technology/eyeq-chip/.
  10. Ziadeh, Bassam. “Driving Adoption of Advanced IC Packaging in Automotive Applications.” Presentation at IMAPS DPC, March 2023. General Motors, Fountain Hills AZ, March 16, 2023.
  11. Jung, Matthias, and Norbert Wehn. “Driving Against the Memory Wall: The Role of Memory for Autonomous Driving.” Fraunhofer IESE and Microelectronic Systems Design Research Group, University of Kaiserslautern, Kaiserslautern, Germany. Kluedo, https://kluedo.ub.rptu.de/frontdoor/deliver/index/docId/5286/file/_memory.pdf.
  12. Micron. “Cinco de Play: Memory – Is That Critical to Autonomous Driving?” Micron, https://www.micron.com/about/blog/2017/october/cinco-play-memory-is-that-critical-to-autonomous-driving.
  13. McKinsey & Company. “Advanced Chip Packaging: How Manufacturers Can Play to Win.” McKinsey & Company, https://www.mckinsey.com/industries/semiconductors/our-insights/advanced-chip-packaging-how-manufacturers-can-play-to-win.

Advanced Packaging Design For Heterogeneous Integration

By: CP Hung

As device scaling slows down, a key system functional integration technology is emerging: heterogeneous integration (HI). It leverages advanced packaging technology to achieve higher functional density and lower cost per function. With the continuous development of major semiconductor applications such as AI HPC, edge AI, and autonomous electric vehicles, traditional chips are transforming into smaller, well-partitioned chiplets that require chip-to-chip interconnections to be denser, faster, and more reliable. This boosts demand for heterogeneous integration and, with it, for innovative advanced packaging technologies.

HI uses advanced packaging to integrate chiplets with heterogeneous designs and process nodes into a single package. This allows enterprises to choose optimum process nodes for specific system demands, such as 3nm for computing chiplets or 7nm for radio frequency chiplets, and to quickly produce super chips with specific functions in a cost-effective manner. HI not only aims for higher interconnection density, but also integrates the various functional components — logic chips, sensors, memory, and others — needed to complete the whole system in one package. Overall energy efficiency and performance are greatly improved, while package size can be significantly reduced.

Advanced packaging solutions for AI HPC

The typical high-density advanced package size for AI cloud computing processors is 55mm x 55mm or more, and contains a 5-2-5 (top 5 layers, middle 2 layers, bottom 5 layers) advanced substrate, or even up to 11-2-11 wiring layers. Chiplets can be interconnected by fan-out technology with a silicon bridge, or by 2.5D with a Si interposer as the integration platform. Through these techniques, the industry aims to gain more computing power within the same space.

ASE provides high-density packaging solutions, including Flip Chip Ball Grid Array (FCBGA), Fan Out Chip-on-Substrate (FOCoS), FOCoS-Bridge, and 2.5D. The chip-to-chip interconnections in FCBGA are accomplished through the BGA substrate, and its minimum L/S (line width/line spacing) is only about 10μm/10μm. The very popular and in-demand CoWoS (Chip on Wafer on Substrate) is a 2.5D packaging technology that uses an RDL (redistribution layer) on a Si interposer to connect chiplets, and its L/S can be significantly reduced to 0.5μm/0.5μm.
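
Those L/S numbers translate directly into routing density: each routed wire consumes one line width plus one space, so wires per millimeter is simply the reciprocal of the pitch. A minimal sketch using the figures quoted above:

```python
# Wires per mm of routing channel for the L/S (line/space) figures quoted above.
# Each routed wire consumes one line width plus one space: density = 1 / (L + S).

technologies = {
    "FCBGA substrate (10/10 um)":          (10.0, 10.0),
    "FOCoS fan-out RDL (2/2 um)":          (2.0, 2.0),
    "2.5D Si interposer RDL (0.5/0.5 um)": (0.5, 0.5),
}

for name, (line_um, space_um) in technologies.items():
    wires_per_mm = 1000.0 / (line_um + space_um)  # 1000 um per mm
    print(f"{name}: {wires_per_mm:.0f} wires/mm per layer")
# -> 50, 250, and 1000 wires/mm: a 20x density gap between substrate and interposer.
```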

In the Si interposer of a 2.5D package, all the chiplets are connected side by side, and as the required number of chiplets increases, the interposer area grows, so fewer interposers can be made from each 12-inch wafer (generally less than 50). This significantly increases the manufacturing cost of 2.5D packaging. However, not all applications require 0.5μm/0.5μm L/S, so ASE came up with FOCoS, which uses fan-out technology’s RDL to integrate different chiplets at an L/S of 2μm/2μm. This gives the market an alternative solution at lower cost. In addition, ASE’s FOCoS-Bridge technology uses a silicon bridge to provide high-density routing for interconnecting different chips (such as logic chips and memory) in areas that require high-speed transmission, and uses fan-out RDL to integrate the other areas. As such, it offers both 0.5μm/0.5μm and 2μm/2μm flexibility in L/S design, while achieving a significant increase in packaging density and bandwidth.
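
The cost pressure follows from simple geometry. A standard die-per-wafer approximation makes the “generally less than 50” figure concrete; the interposer sizes below are illustrative assumptions, not values from the article.

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Common approximation: gross-area term minus an edge-loss term."""
    area_term = math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(area_term - edge_term)

# Illustrative interposer sizes (assumptions, not from the article):
for side_mm in (30, 45, 60):
    count = dies_per_wafer(300, side_mm ** 2)
    print(f"{side_mm} x {side_mm} mm interposer: ~{count} per 300 mm wafer")
# ~56, ~20, and ~8 -- large interposers quickly fall below 50 per wafer.
```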

High performance chip-package-system co-design

To achieve the aforementioned high bandwidth, the chip, package, and entire system must be designed together for holistic optimization, rather than considering each part in isolation. When using electronic design automation (EDA) for design optimization, consideration must be given to signal behavior along the entire transmission path, including Cu pillars, fine-line RDL, TSVs, and μbumps. Eye diagrams can then be used to analyze the SerDes link’s electrical performance. When designing differential pairs for high-speed signals, it is necessary to reduce return and insertion loss, especially in the operating frequency band. From chip to package to the entire system, Taiwan’s manufacturing advantage lies in the ability to accomplish the turnkey design process from beginning to end.
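
As a small unit check on those link-budget terms, S-parameter magnitudes convert to decibels as 20·log10|S|. The channel values in this sketch are illustrative assumptions, not measured data.

```python
import math

# Converting S-parameter magnitudes to dB for a die-to-die link budget.
# |S21| is the transmitted fraction (insertion loss); |S11| is the reflected
# fraction (return loss). Channel values below are illustrative assumptions.

def to_db(magnitude: float) -> float:
    return 20.0 * math.log10(magnitude)

s21 = 0.89  # ~89% of the signal survives the Cu pillar + RDL + ubump path
s11 = 0.10  # ~10% reflected at an impedance discontinuity

print(f"Insertion loss: {to_db(s21):.1f} dB")  # about -1.0 dB
print(f"Return loss:    {to_db(s11):.1f} dB")  # -20.0 dB
# Keeping insertion loss shallow and return loss deeply negative across the
# operating band is what preserves an open eye on the SerDes link.
```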

Providing more computing power with less energy

The industry is currently focused on optimizing energy efficiency. One of the key questions being asked is whether the power regulation and decoupling components, which were previously located on the system board, can be moved closer to the package or processor chip. There is even talk of redesigning the on-chip power delivery network (PDN), including supplying power directly from the backside of the chip (Backside PDN).

Power integrity design for power delivery network (PDN)

Optimizing power integrity and minimizing noise can be achieved by strategically positioning the capacitor. Ideally, the capacitor should be placed as close to the chip as possible, but this is dependent on the capacitor’s size and the manufacturing process, both of which can impact cost and performance. Traditional surface-mount technology (SMT) capacitors are relatively large, but chip-level silicon capacitors (Si-Cap) are now available that offer decent capacitance values.
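
A common first-order way to frame this placement question is the target-impedance rule: the PDN must stay below Z = V_supply × allowed ripple / transient current across the band of interest, and a capacitor's loop inductance caps the frequency up to which it helps. A minimal sketch with assumed numbers (none are from the article):

```python
import math

# First-order PDN sizing: the network must stay below
#   Z_target = V_supply * ripple_fraction / I_transient
# across the band of interest. All values here are illustrative assumptions.

V_SUPPLY = 0.75      # core rail (V)
RIPPLE = 0.05        # 5% allowed ripple
I_TRANSIENT = 150.0  # worst-case current step (A)

z_target = V_SUPPLY * RIPPLE / I_TRANSIENT
print(f"Target impedance: {z_target * 1e3:.2f} mOhm")  # 0.25 mOhm

# A capacitor stops helping once its loop inductance dominates: Z_L = 2*pi*f*L.
# Closer placement means lower loop inductance and wider frequency coverage,
# which is the argument for in-package Si-Caps over board-level SMT parts.
for name, loop_henries in [("board-level SMT cap", 1e-9), ("in-package Si-Cap", 10e-12)]:
    f_limit = z_target / (2 * math.pi * loop_henries)
    print(f"{name}: holds target up to ~{f_limit / 1e3:.0f} kHz")
```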

UCIe (Universal Chiplet Interconnect Express) Consortium

Traditionally, there are many standard communication protocols (such as Block-to-Block, Memory Bus, or Interconnection Interface Protocols) at the chip level and the board level for system designers. Industry protocols that specify package-level integration are growing, especially given the need for a universal interface for chiplet integration using 2.5D and FOCoS packaging technologies.

In March 2022, Intel invited upstream and downstream manufacturers in the semiconductor industry chain to form the UCIe Consortium, which introduced a standardized data-transmission architecture for chiplet integration to reduce the cost of advanced packaging design. ASE is proud to be a founding member (Promoter member).

ASE offers a diverse range of advanced packaging types. We have developed packaging design specifications that can be integrated with foundry solutions specifications as well as the system requirements of original equipment manufacturers (OEMs) and cloud service providers to create a comprehensive UCIe package standard. The standard can assist in realizing ubiquitous chiplet heterogeneous integration for HPC applications using various advanced packaging technology architectures, such as 2.5D, 3D, FOCoS, Fan-out, EMIB, CoWoS, etc. Headquartered in Taiwan, ASE is enthusiastically participating in the formulation of international standards and relentlessly providing integrated solutions to the global industry.

Heterogeneous integration has been in development for many years. It can be used to integrate not only homogeneous and heterogeneous chiplets but also other passive and active components including connectors, into a single package. Achieving this requires not only advanced packaging technologies but also design and testing coordination. ASE offers a comprehensive one-stop service solution that includes system design, packaging, and testing to help customers shorten chip design cycles and accelerate product innovation.

Commercial Chiplet Ecosystem May Be A Decade Away

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make it reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplets should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to pick and choose which IP, see what’s been taped out before, and how successful it’s been, so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood because that’s the way this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust, traceability, and provenance tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, and how you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller, cheaper ones. So there are now a lot of thoughts about how to do almost unit tests on individual chiplets first, and then some form of system test as you go along. That’s definitely something we need to think about. On the business front, the question is who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then it definitely makes sense to think about using chiplets and breaking the design up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together? The ones that have been successful so far — big companies like Intel and AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.

Bhatnagar: From a business perspective, what is really important is standardization. Inside the chiplet is fine, but how it impacts other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing, which adds extra cost. To really justify a business case for any chiplet, or any sort of IP for a chiplet, standardization is key for the electrical interconnect, packaging, and all other aspects of the system.

Fig. 1: A chiplet design. Source: Cadence.

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

Accellera Preps New Standard For Clock-Domain Crossing

Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long it will take before users see any benefit.

At the register transfer level (RTL), when a data signal passes between two flip-flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving at those adjacent flops. That makes timing sign-off difficult, but at least the clocks are still synchronous.

But if the clocks come from different sources, are at different frequencies, or a design boundary exists between the flip flops — which would happen with the integration of IP blocks — it’s impossible to guarantee that no clock edges will arrive when the data is unstable. That can cause the output to become unknown for a period of time. This phenomenon, known as metastability, cannot be eliminated, and the verification of those boundaries is known as clock-domain crossing (CDC) analysis.
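
The severity of that risk is usually quantified with the standard synchronizer MTBF formula, MTBF = e^(t_met/τ) / (T_w · f_clk · f_data), where t_met is the settling time allowed before the signal is used. The sketch below uses illustrative device constants (τ and T_w are assumptions, not figures from this article) to show why giving a signal one extra clock period to resolve — the effect of a second synchronizer flop — improves MTBF by many orders of magnitude.

```python
import math

# Metastability MTBF for a CDC synchronizer:
#   MTBF = exp(t_met / tau) / (T_w * f_clk * f_data)
# tau (settling time constant) and T_w (metastability window) are illustrative
# assumed device constants, not values from the article.

TAU_S = 20e-12   # settling time constant (assumed)
TW_S = 100e-12   # metastability aperture (assumed)
F_CLK = 1e9      # destination clock, 1 GHz
F_DATA = 100e6   # toggle rate of the crossing signal

def mtbf_seconds(t_met_s: float) -> float:
    return math.exp(t_met_s / TAU_S) / (TW_S * F_CLK * F_DATA)

# Using the flop output immediately (no settling time) vs. giving it one full
# clock period to resolve (the effect of a second synchronizer stage):
for label, t_met in [("no settling time", 0.0), ("one clock period", 1 / F_CLK)]:
    print(f"{label}: MTBF = {mtbf_seconds(t_met):.3g} s")
# ~1e-7 s vs ~5e14 s (millions of years): why two-flop synchronizers are standard.
```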

Special care is required on those boundaries. “You have to compensate for metastability by ensuring that the CDC crossings follow a specific set of logic design principles,” says Prakash Narain, president and CEO of Real Intent. “The general process in use today follows a hierarchical approach and requires that the clock-domain crossing internal to an IP is protected and safe. At the interface of the IP, where the system connects with the IP, two different teams share the problem. An IP provider may recommend an integration methodology, which often is captured in an abstraction model. That abstraction model enables the integration boundary to be verified while the internals of it will not be checked for CDC. That has already been verified.”

In the past, those abstract models differentiated the CDC solutions from various vendors. That’s no longer the case. Every IP and tool vendor has different formats, making it costly for everyone. “I don’t know that there’s really anything new or differentiating coming down the pipe for hierarchical modeling,” says Kevin Campbell, technical product manager at Siemens Digital Industries Software. “The creation of the standard will basically deliver much faster results with no loss of quality. I don’t know how much more you can differentiate in that space other than just with performance increases.”

While this has been a problem for the whole industry for quite some time, Intel decided it was time for a solution. The company pushed Accellera to take up the issue, and helped facilitate the creation of the standard by chairing the committee. “I’m going to describe three methods of building a product,” says Iredamola “Dammy” Olopade, chair of the Accellera working group, and a principal engineer at Intel. “Method number one is where you build everything in a monolithic fashion. You own every line of code, you know the architecture, you use the tool of your choice. That is a thing of the past. The second method uses some IP. It leverages reuse and enables the quick turnaround of new SoCs. There used to be a time when all IPs came from the same source, and those were integrated into a product. You could agree upon the tools. We are quickly moving to a world where I need to source IPs wherever I can get them. They don’t use the same tools as I do. In that world, common standards are critical to integrating quickly.”

In some cases, there is a hierarchy of IP. “Clock-domain crossings are a central part of our business,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “A network-on-chip (NoC) can be considered as ‘CDC central’ because most blocks connected to the NoC have different clocks. Also, our SoC integration tools see all of the blocks to be integrated, and those touch various clock domains and therefore need to deal with the CDC code that is inserted.”

This whole thing can become very messy. “While every solution supports hierarchical modeling, every tool has its own model solution and its own model representation,” says Siemens’ Campbell. “Vendors, or users, are stuck with a CDC solution, because the models were created within a certain solution. There’s no real transportability between any of the hierarchical modeling solutions unless they want to go regenerate models for another solution.”

That creates a lot of extra work. “Today, when dealing with customer CDC issues, we have to consider the customer’s specific environment, and for CDC, a potential mix of in-house flows and commercial tools from various vendors,” says Arteris’ Schirrmeister. “The compatibility matrix becomes very complex, very fast. If adopted, the new Accellera CDC standard bears the potential to make it easier for IP vendors, like us, to ensure compatibility and reduce the effort required to validate IP across multiple customer toolsets. The intent, as specified in the requirements is that ‘every IP provider can run its tool of choice to verify and produce collateral and generate the standard format for SoCs that use a different tool.'”

Everyone benefits. “IP providers will not need to provide extra documentation of clock domains for the SoC integrator to use in their CDC analysis,” says Ahmed Nasr, digital design manager at Mixel. “The standard CDC attributes generated by the EDA tool will be self-contained.”

The use model is relatively simple. “An IP developer signs off on CDC and then exports the abstract model,” says Real Intent’s Narain. “It is likely they will write this out in both the Accellera format and the native format to provide backward compatibility. At the next level of hierarchy, you read in the abstract model instead of reading in the full view of the design. They have various views of the IP, including the CDC view of the IP, which today is on the basis of whatever tool they use for CDC sign-off.”

The potential is significant. “If done right and adopted, the industry may arrive at a common language to describe CDC aspects that can streamline the validation process across various tools and environments used by different users,” says Schirrmeister. “As a result, companies will be able to integrate and validate IP more efficiently than before, accelerating development cycles and reducing the complexity associated with SoC integration.”

The standard
Intel’s Olopade describes the approach that was taken during the creation of the standard. “You take the most complex situations you are likely to find, you box them, and you co-design them in order to reduce the risk of bugs,” he said. “The boundaries you create are supposed to be simple boundaries. We took that concept, and we brought it into our definition to say the following: ‘We will look at all kinds of crossings, we will figure out the simple common uses, and we will cover that first.’ That is expected to cover 95% to 98% of the community. We are not trying to handle 700 different exceptions. It is common. It is simple. It is what guarantees production quality, not just from a CDC standpoint, but just from a divide-and-conquer standpoint.”

That was the starting point. “Then we added elements to our design document that says, ‘This is how we will evaluate complexity, and this is how we’ll determine what we cover first,'” he says. “We broke things down into three steps. Step one is clock-domain crossing. Everyone suffers from this problem. Step two is reset-domain crossing (RDC). As low power is getting into more designs, there are a lot more reset domains, and there is risk between these reset domains. Some companies care, but many companies don’t because they are not in a power-aware environment. It became a secondary consideration. Beyond the basic CDC in phase one, and RDC in phase two, all other interesting, small usage complexities will be handled in phase three as extensions to the standard. We are not going to get bogged down supporting everything under the sun.”

Within the standards group there are two sub-groups — a mapping team and a format team. Common standards, such as AMBA, UCIe, and PCIe, have been examined to make sure they are fully covered by the standard, which means the concepts should remain useful for future markets.

“The concepts contained in the standard are extensible to hardened chiplets,” says Mixel’s Nasr. “By providing an accurate standard CDC view for the chiplet, it will enable integration with other chiplets.”

Some of those issues have yet to be fully explored. “The standard’s current documentation primarily focuses on clock-domain crossing within an SoC itself,” says Schirrmeister. “Its direct applicability to the area of chiplets would depend on further developments. The interfaces between fully hardened IP blocks on chiplets would communicate through standard interfaces like UCIe, BoW, or XSR, so the synchronization issues between chiplets on substrates would appear to be elevated to the protocol levels.”

Reset-domain crossings have yet to appear in the standard. “The genesis of CDC is asynchronous clocks,” says Narain. “But the genesis for reset-domain crossing is asynchronous resets. While the destination is due to the clock, the source of the problem is somewhere else. And as a result, the nature of the problem, the methodology that people use to manage that problem, are very different. The kind of information that you need to retain, and the kind of information that you can throw away, is different for every problem. Hence, abstractions are actually very customized for the application.”

Does the standard cover enough ground? That is part of the purpose of the review period that was used to collect information. “I can see some room for future improvement — for example, making some attributes mandatory like logic, associated_clocks, clock_period for clock ports,” says Nasr. “Another proposed improvement is adding reconvergence information, to be able to detect reconverging outputs of parallel synchronizers.”
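
To make the notion of standard CDC attributes concrete, the sketch below shows what an abstract-model record for one IP boundary port might carry, reusing the attribute names Nasr mentions (logic, associated_clocks, clock_period). The record layout itself is hypothetical, not the actual Accellera format.

```python
# Hypothetical sketch of abstract CDC model entries for IP boundary ports.
# Attribute names follow those mentioned in the article (logic, associated_clocks,
# clock_period); the record layout is illustrative, not the Accellera format.

clock_port = {
    "port": "i_axi_aclk",
    "direction": "input",
    "kind": "clock",
    "clock_period": "2.0ns",  # one of the attributes Nasr proposes making mandatory
}

data_port = {
    "port": "o_irq",
    "direction": "output",
    "kind": "data",
    "associated_clocks": ["i_axi_aclk"],  # domain the port is synchronous to
    "logic": "synchronized_2ff",          # crossing already protected inside the IP
    "reconvergence_group": None,          # proposed: flag parallel-synchronizer fan-ins
}

# An SoC-level CDC tool would read records like these instead of the IP's full RTL,
# checking only the integration boundary -- the internals are already signed off.
```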

The impact of all of this, if realized, is enormous. “If you truly run a collaborative, inclusive, development cycle, two things will happen,” says Olopade. “One, you are going to be able to find multiple ways to solve each problem. You need to understand the pros and cons against the real problems you are trying to solve and agree on the best way we should do it together. For each of those, we record the options, the pros and cons, and the reason one was selected. In a public review, those that couldn’t be part of that discussion get to weigh in. We weigh it against what they are suggesting versus why did we choose it. In the cases where it is part of what we addressed, and we justified it, we just respond, and we do not make a change. If you’re truly inclusive, you do allow that feedback to cause you to change your mind. We received feedback on about three items that we had debated, where the feedback challenged the decisions and got us to rehash things.”

The big challenge
Still, the creation of a standard is just the first step. Unless a standard is fully adopted, its value becomes diminished. “It’s a commendable objective and a worthy endeavor,” says Schirrmeister. “It will make interoperability easier and eventually allow us, and the whole industry, to reduce the compatibility matrix we maintain to deal with vendor tools individually. It all will depend on adoption by the vendors, though.”

It is off to a good start. “As with any standard, good intentions sometimes get severed by reality,” says Campbell. “There has been significant collaboration and agreements on how the standard is being pushed forward. We did not see self-submarining, or some parties playing nice just to see what’s going on but not really supporting it. This does seem like good collaboration and good decision making across the board.”

Implementation is another hurdle. “Will it actually provide the benefit that it is supposed to provide?” asks Narain. “That will depend upon how completely and how quickly EDA tool vendors provide support for the standard. From our perception, the engineering challenge for implementing this is not that large. When this is standardized, we will provide support for it as soon as we can.”

Even then, adoption isn’t a slam dunk. “There are short- and long-term problems,” warns Campbell. “IP vendors already have to support multiple formats, but now you have to add Accellera on top of that. There’s going to be some pain both for the IP vendors and for EDA vendors. We are going to have to be backward-compatible and some programs go on for decades. There’s a chance that some of these models will be around for a very long time. That’s the short-term pain. But the biggest hurdle to overcome for a third-party IP vendor, and EDA vendor, is quality assurance. The whole point of a hierarchical development methodology is faster CDC closure with no loss in quality. The QA load here is going to be big, because no customer is going to want to take the risk if they’ve got a solution that is already working well.”

Some of those issues and fears are expected to be addressed at the upcoming DVCon conference. “We will be providing a tutorial on CDC,” says Olopade. “The first 30 minutes covers the basics of CDC for those who haven’t been doing this for the last 10 years. The next hour will talk about the Accellera solution. It will concentrate on those topics which were hotly debated, and we need to help people understand, or carry people along with what we recommend. Then it may become more acceptable and more adoptive.”

Related Reading
Design And Verification Methodologies Breaking Down
As chips become more complex, existing tools and methodologies are stretched to the breaking point.

Intel, And Others, Inside

Intel this week made a strong case for how it will regain global process technology leadership, unfurling an aggressive technology and business roadmap that includes everything from several more process node shrinks that ultimately could scale into the single-digit angstrom range to a broad shift in how it approaches the market. Both will be essential for processing the huge amount of data for AI everywhere, and to win back some of the market share that NVIDIA currently wields.

Intel’s strategy is many layers thick. It spans a long list of innovations, from backside power delivery, which significantly reduces congestion and noise in more than a dozen metal layers, to 2.5D and 3D-ICs using a standard architecture for internally developed and third-party chiplets and IP. And it encompasses everything from a broad and open ecosystem — a sharp departure from its past own-it-all strategy, which once prompted anti-competitive charges — to a willingness to work more closely with customers, competitors, and governments to achieve its goals.

Whether Intel delivers on that promise remains to be seen. But the breadth of its vision, particularly for Intel Foundry, and the ambitious delivery schedule represent a sharp change in direction and culture for the 56-year-old chipmaker.

Of particular note is how the company’s internal silos are being deconstructed. Intel CEO Pat Gelsinger said there will be a clean line between Intel products and Intel Foundry, but noted the foundry will offer developments and innovations from the product teams. “The à la carte menu is wide open for the industry,” he said during a Q&A session. “Clearwater Forest, which I showed today, is a construct that was innovated by my Xeon team — how to do hybrid bonding, Intel 3 base dies, 18A top die, being able to solve a lot of the CoWoS/Foveros problems using EMIB and hybrid bonding. That will become a set of collaterals that will benefit the foundry. They’re going to sell that constructional opportunity as a better way to build AI chips. So clearly, I’m taking product group intellectual property and leveraging it on the foundry side.”

This may sound like business as usual for a foundry, but Intel for years waffled over its commitment to a foundry model. In fact, it was viewed as an IDM that only began selling wafers to customers as the cost of maintaining its own fabs began to skyrocket. Creating separation between the two is a fundamental shift, and it’s an essential one for building trust in a complex world of sometimes partners, sometimes competitors. In the past, the company closely guarded its IP, confining it to its own processors rather than unbundling it and selling it to potential competitors. With this new division, Intel potentially can generate profits both from customized chips that leverage technologies it develops internally for other applications, and from its own chip business.

Gelsinger is adamant about this being a leading-edge technology company, not a supplier of all components required in systems. “I’m not going to solve 200mm supply chain issues,” he said. “And, by the way, there are not going to be more 200mm factories built, for the most part, outside of specialty like SiC. There are some crazy discussions, like when some Europeans say, ‘I don’t need a leading-edge factory in Europe. Give me a 40nm node.’ What a stupid statement. It takes 5 years to build a new 40nm node, which is already 20 years old, and to make it economically viable, we’re going to have to run it 30 more years. Move the designs to more modern nodes as opposed to expecting to build old factories that are already out of date.”

Put in perspective, Intel is basically taking the Apple approach to chipmaking. How it fares against the other two leading-edge foundries isn’t clear at this point. Until 22nm, which was 16/14nm for Samsung and TSMC, Intel was the front runner in process technology. Several nodes later, it was trailing both Samsung and TSMC. The company is on a mission to regain its leadership.

“Great industries have two or three strong players,” said Gelsinger. “TSMC is a great company, and we’re going to build a great foundry, as well. And we’re going to challenge each other to further greatness. They are the best company in terms of customer support, bar none, in the industry, and they do not have a legacy of leadership technologies. They implement technologies with great customer support. We have a deep legacy of leadership technologies across domains that we created…At our innovation conference in September, I showed three companies participating in chiplet standardization — Synopsys, Intel, and TSMC — with our test chips. The world wants chiplets across a range of suppliers. I think we’ll be doing some of that. I think they’re going to be doing some of that. And I want to make sure that our mutual customers have great choice and technology benefits.”

If successful, Intel Foundry may not be the lowest-cost chipmaker, as TSMC founder Morris Chang has intimated. But if it can win back some key customers, and attract a lot of new ones with better performance, higher energy efficiency, and more customization options, that may not matter. What does matter is execution on the roadmap, and the number of companies rallying around Intel appears to indicate that significant changes are afoot, and that the competitive landscape could be a lot more fluid than it was several years ago.

Why Chiplets Are So Critical In Automotive

By: John Koon

Chiplets are gaining renewed attention in the automotive market, where increasing electrification and intense competition are forcing companies to accelerate their design and production schedules.

Electrification has lit a fire under some of the biggest and best-known carmakers, which are struggling to remain competitive in the face of very short market windows and constantly changing requirements. Unlike in the past, when carmakers typically ran on five- to seven-year design cycles, the latest technology in vehicles today may well be considered dated within several years. And if they cannot keep up, there is a whole new crop of startups producing cheap vehicles with the ability to update or change out features as quickly as a software update.

But software has speed, security, and reliability limitations, and being able to customize the hardware is where many automakers are now putting their efforts. This is where chiplets fit in, and the focus now is on how to build enough interoperability across large ecosystems to make this a plug-and-play market. The key factors to enable automotive chiplet interoperability include standardization, interconnect technologies, communication protocols, power and thermal management, security, testing, and ecosystem collaboration.

Similar to non-automotive applications at the board level, many design efforts are focusing on a die-to-die approach, which is driving a number of novel design considerations and tradeoffs. At the chip level, the interconnects between various processors, chips, memory, and I/O are becoming more complex due to increased design performance requirements, spurring a flurry of standards activities. Different interconnect and interface types have been proposed to serve varying purposes, while emerging chiplet technologies for dedicated functions — processors, memories, and I/Os, to name a few — are changing the approach to chip design.

“There is a realization by automotive OEMs that to control their own destiny, they’re going to have to control their own SoCs,” said David Fritz, vice president of virtual and hybrid systems at Siemens EDA. “However, they don’t understand how far along EDA has come since they were in college in 1982. Also, they believe they need to go to the latest process node, where a mask set is going to cost $100 million. They can’t afford that. They also don’t have access to talent because the talent pool is fairly small. With all that together comes the realization by the OEMs that to control their destiny, they need a technology that’s developed by others, but which can be combined however needed to have a unique differentiated product they are confident is future-proof for at least a few model years. Then it becomes economically viable. The only thing that fits the bill is chiplets.”

Chiplets can be optimized for specific functions, which can help automakers meet reliability, safety, and security requirements with technology that has been proven across multiple vehicle designs. In addition, they can shorten time to market and ultimately reduce the cost of different features and functions.

Demand for chips has been on the rise for the past decade. According to Allied Market Research, global automotive chip demand will grow from $49.8 billion in 2021 to $121.3 billion by 2031. That growth will attract even more automotive chip innovation and investment, and chiplets are expected to be a big beneficiary.

But the marketplace for chiplets will take time to mature, and it will likely roll out in phases. Initially, a vendor will provide different flavors of proprietary dies. Then, partners will work together to supply chiplets to support each other, as has already happened with some vendors. The final stage will be universally interoperable chiplets, supported by UCIe or some other interconnect scheme.

Getting to the final stage will be the hardest, and it will require significant changes. To ensure interoperability, large enough portions of the automotive ecosystem and supply chain must come together, including hardware and software developers, foundries, OSATs, and material and equipment suppliers.

Momentum is building
On the plus side, not all of this is starting from scratch. At the board level, modules and sub-systems always have used onboard chip-to-chip interfaces, and they will continue to do so. Various chip and IP providers, including Cadence, Diodes, Microchip, NXP, Renesas, Rambus, Infineon, Arm, and Synopsys, offer off-the-shelf interface chips or IP to create the interface silicon.

The Universal Chiplet Interconnect Express (UCIe) Consortium is the driving force behind the open die-to-die interconnect standard. The group released its latest UCIe 1.1 specification in August 2023. Board members include Alibaba, AMD, Arm, ASE, Google Cloud, Intel, Meta, Microsoft, NVIDIA, Qualcomm, Samsung, and others, and industry partners are showing widespread support. Alternative die-to-die interfaces, such as AIB and Bunch of Wires (BoW), also have been proposed. In addition, Arm just released its own Chiplet System Architecture, along with an updated AMBA specification to standardize protocols for chiplets.

“Chiplets are already here, driven by necessity,” said Arif Khan, senior product marketing group director for design IP at Cadence. “The growing processor and SoC sizes are hitting the reticle limit and the diseconomies of scale. Incremental gains from process technology advances are lower than rising cost per transistor and design. The advances in packaging technology (2.5D/3D) and interface standardization at a die-to-die level, such as UCIe, will facilitate chiplet development.”

Nearly all of the chiplets used today are developed in-house by big chipmakers such as Intel, AMD, and Marvell, because they can tightly control the characteristics and behavior of those chiplets. But there is work underway at every level to open this market to more players. When that happens, smaller companies can begin capitalizing on what the high-profile trailblazers have accomplished so far, and innovating around those developments.

“Many of us believe the dream of having an off-the-shelf, interoperable chiplet portfolio will likely take years before becoming a reality,” said Guillaume Boillet, senior director of strategic marketing at Arteris, adding that interoperability will emerge from groups of partners who are addressing the risk of incomplete specifications.

This also is raising the attractiveness of FPGAs and eFPGAs, which can provide a level of customization and updates for hardware in the field. “Chiplets are a real thing,” said Geoff Tate, CEO of Flex Logix. “Right now, a company building two or more chiplets can operate much more economically than a company building near-reticle-size die with almost no yield. Chiplet standardization still appears to be far away. Even UCIe is not a fixed standard yet. Not all agree on UCIe, bare die testing, and who owns the problem when the integrated package doesn’t work, etc. We do have some customers who use or are evaluating eFPGA for interfaces where standards are in flux like UCIe. They can implement silicon now and use the eFPGA to conform to standards changes later.”
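Tate’s yield point is easy to quantify with a classic Poisson defect model, in which die yield falls exponentially with area. The Python sketch below is a back-of-the-envelope illustration only; the defect density and die areas are hypothetical, not figures from any foundry or from the companies quoted here.

```python
import math

def poisson_yield(area_mm2: float, d0_per_mm2: float) -> float:
    """Classic Poisson yield model: Y = exp(-A * D0)."""
    return math.exp(-area_mm2 * d0_per_mm2)

D0 = 0.002  # hypothetical defect density (defects per mm^2)

# One near-reticle-size monolithic die vs. a chiplet of a quarter the area.
monolithic = poisson_yield(800, D0)   # ~20% of die are good
chiplet = poisson_yield(200, D0)      # ~67% of die are good

print(f"800 mm^2 monolithic yield: {monolithic:.1%}")
print(f"200 mm^2 chiplet yield:    {chiplet:.1%}")

# With known-good-die testing, a defect scraps one small chiplet
# instead of an entire reticle-sized die.
print(f"Good silicon per wafer, chiplets vs. monolithic: "
      f"{chiplet / monolithic:.1f}x")
```

The exponential term is what drives the “almost no yield” problem Tate describes: doubling die area more than doubles the fraction of die lost to defects.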

There are other efforts supporting chiplets, as well, although for somewhat different reasons — notably, the rising cost of device scaling and the need to incorporate more features into chips, which are reticle-constrained at the most advanced nodes. But those efforts also pave the way for chiplets in automotive, and there is strong industry backing to make this all work. For example, under the sponsorship of SEMI, ASME, and three IEEE societies, the new Heterogeneous Integration Roadmap (HIR) examines microelectronics design, materials, and packaging issues to come up with a roadmap for the semiconductor industry. Its current focus includes 2.5D, 3D-ICs, wafer-level packaging, integrated photonics, MEMS and sensors, and system-in-package (SiP), as well as aerospace, automotive, and more.

At the recent Heterogeneous Integration Global Summit 2023, representatives from AMD, Applied Materials, ASE, Lam Research, MediaTek, Micron, Onto Innovation, TSMC, and others demonstrated strong support for chiplets. Another group that supports chiplets is the Chiplet Design Exchange (CDX) working group, which is part of the Open Domain Specific Architecture (ODSA) and the Open Compute Project Foundation (OCP). The CDX charter covers the electrical, mechanical, and thermal design-exchange standards for 2.5D stacked and 3D integrated circuits (3D-ICs). Its representatives include Ansys, Applied Materials, Arm, Ayar Labs, Broadcom, Cadence, Intel, Macom, Marvell, Microsemi, NXP, Siemens EDA, Synopsys, and others.

“The things that automotive companies want in terms of what each chiplet does functionally are still in upheaval mode,” Siemens’ Fritz noted. “One extreme has these problems, the other extreme has those problems. This is the sweet spot. This is what’s needed. And these are the types of companies that can go off and do that sort of work, and then you could put them together. Then this interoperability thing is not a big deal. The OEM can make it too complex by saying, ‘I have to handle that whole spectrum of possibilities.’ The alternative is that they could say, ‘It’s just like high-speed PCIe. If I want to communicate from one to the other, I already know how to do that. I’ve got drivers that are running my operating system.’ That would solve an awful lot of problems, and that’s where I believe it’s going to end up.”

One path to universal chiplet development?

Moving forward, chiplets are a focal point for both the automotive and chip industries, and that will involve everything from chiplet IP to memory interconnects and customization options and limitations.

For example, Renesas Electronics announced plans in November 2023 for its next-generation SoCs and MCUs. The company is targeting all major applications across the automotive digital domain, and previewed its fifth-generation R-Car SoC for high-performance applications, which features advanced in-package chiplet integration technology meant to give automotive engineers greater flexibility to customize their designs.

Renesas noted that if more AI performance is required in Advanced Driver Assistance Systems (ADAS), engineers will have the capability to integrate AI accelerators into a single chip. The company said this roadmap comes after years of collaboration and discussions with Tier 1 and OEM customers, which have been clamoring for a way to accelerate development without compromising quality, including designing and verifying the software even before the hardware is available.

“Due to the ever-increasing need for compute on demand, and the increasing need for higher levels of autonomy in the cars of tomorrow, we see challenges in monolithic solutions scaling and providing the performance needs of the market in the upcoming years,” said Vasanth Waran, senior director for SoC Business & Strategies at Renesas. “Chiplets allow compute solutions to scale above and beyond the needs of the market.”

Renesas announced plans to create a chiplet-based product family specifically targeted at the automotive market starting in 2025.

Standard interfaces allow for SoC customization
It is not entirely clear how much overlap there will be between standard processors, which is where most chiplets are used today, and chiplets developed for automotive applications. But the underlying technologies and developments certainly will build off each other as this technology shifts into new markets.

“Whether it is an AI accelerator or ADAS automotive application, customers need standard interface IP blocks,” noted David Ridgeway, senior product manager, IP accelerated solutions group at Synopsys. “It is important to provide fully verified IP subsystems around their IP customization requirements to support the subsystem components used in the customers’ SoCs. When I say customization, you might not realize how customizable IP has become over the course of the last 10 to 20 years, on the PHY side as well as the controller side. For example, PCI Express has gone from PCIe Gen 3 to Gen 4 to Gen 5 and now Gen 6. The controller can be configured to support multiple bifurcation modes of smaller link widths, including one x16, two x8, or four x4. Our subsystem IP team works with customers to ensure all the customization requirements are met. For AI applications, signal and power integrity is extremely important to meet their performance requirements. Almost all our customers are seeking to push the envelope to achieve the highest memory bandwidth speeds possible so that their TPU can process many more transactions per second. Whenever the applications are cloud computing or artificial intelligence, customers want the fastest response rate possible.”
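To make the bifurcation example concrete, here is a minimal sketch of the lane math involved. It is purely illustrative; the mode names and the 16-lane budget come from the quote above, but the code is not Synopsys IP or any real controller API.

```python
# Hypothetical illustration of PCIe link bifurcation planning.
# Not a real controller API -- it only checks that a requested
# configuration fills the controller's 16-lane budget.

BIFURCATION_MODES = {
    "1x16": [16],          # one x16 link
    "2x8":  [8, 8],        # two x8 links
    "4x4":  [4, 4, 4, 4],  # four x4 links
}

def validate(mode: str, total_lanes: int = 16) -> None:
    widths = BIFURCATION_MODES[mode]
    assert sum(widths) == total_lanes, f"{mode} does not fill {total_lanes} lanes"
    print(f"{mode}: {len(widths)} link(s), widths {widths}")

for m in BIFURCATION_MODES:
    validate(m)
```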

Fig 1: IP blocks including processor, digital, PHY, and verification help developers implement the entire SoC. Source: Synopsys

Optimizing power, performance, and area (PPA) serves the ultimate goal of increasing efficiency, and this makes chiplets particularly attractive in automotive applications. As UCIe matures, it is expected to improve overall performance dramatically. For example, UCIe can deliver a shoreline bandwidth density of 28 to 224 GB/s/mm in a standard package, and 165 to 1,317 GB/s/mm in an advanced package, a 20- to 100-fold improvement. Bringing latency down from 20ns to 2ns represents a 10-fold improvement. Roughly 10 times greater power efficiency, at 0.5 pJ/b (standard package) and 0.25 pJ/b (advanced package), is another plus. The key is shortening the interface distance whenever possible.
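Those figures translate directly into link power budgets. As a back-of-the-envelope check, using only the numbers quoted above and no assumptions about any specific design, energy per bit multiplied by bits per second gives watts per millimeter of die-edge shoreline:

```python
# Back-of-the-envelope UCIe link power from the figures quoted above.

def link_power_watts(bandwidth_gb_s: float, energy_pj_per_bit: float) -> float:
    bits_per_s = bandwidth_gb_s * 1e9 * 8          # GB/s -> bits/s
    return bits_per_s * energy_pj_per_bit * 1e-12  # pJ/b -> W

# Upper-bound shoreline bandwidth for 1 mm of die edge.
std = link_power_watts(224, 0.5)      # standard package
adv = link_power_watts(1317, 0.25)    # advanced package

print(f"Standard package: 224 GB/s/mm at 0.5 pJ/b  -> {std:.2f} W/mm")   # ~0.90 W/mm
print(f"Advanced package: 1317 GB/s/mm at 0.25 pJ/b -> {adv:.2f} W/mm")  # ~2.63 W/mm
```

The advanced package moves almost six times the data for under three times the power, which is the efficiency gain behind the consortium’s pJ/b figures.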

To optimize chiplet designs, the UCIe Consortium provides some suggestions:

  • Careful planning of architectural cut-lines (i.e., chiplet boundaries), optimizing for power, latency, silicon area, and IP reuse (see the sketch after this list). For example, customizing one chiplet that needs a leading-edge process node while re-using other chiplets on older nodes can have a major impact on cost and schedule.
  • Thermal and mechanical packaging constraints need to be planned out, including package thermal envelopes, hot spots, chiplet placement, and I/O routing and breakouts.
  • Process nodes need to be carefully selected, particularly in the context of the associated power delivery scheme.
  • A test strategy for chiplets and packaged/assembled parts needs to be developed up front, so that silicon issues are caught during chiplet-level testing rather than after assembly into a package.
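To make the cut-line trade-off in the first bullet concrete, the sketch below scores a few candidate partitions by weighing leading-edge area saved against the power cost of the new die-to-die traffic. All block names and numbers are hypothetical, chosen only to show the shape of the analysis; the 0.5 pJ/b figure is the standard-package UCIe number cited earlier.

```python
# Toy cut-line scorer: moving a block off the leading-edge die saves
# expensive silicon area but adds die-to-die traffic, which costs power
# across the interface. All candidate blocks and numbers are hypothetical.

D2D_ENERGY_PJ_PER_BIT = 0.5  # standard-package UCIe figure cited above

def crossing_power_w(traffic_gb_s: float) -> float:
    """Power burned moving traffic across the die-to-die interface."""
    return traffic_gb_s * 1e9 * 8 * D2D_ENERGY_PJ_PER_BIT * 1e-12

# (block, leading-edge area saved in mm^2, die-to-die traffic in GB/s)
candidates = [
    ("I/O + SerDes",      40,  50),
    ("AI accelerator",   120, 400),
    ("Safety island MCU", 15,   5),
]

for name, area_saved, traffic in candidates:
    print(f"{name:>17}: saves {area_saved:>3} mm^2 of leading-edge silicon, "
          f"adds {crossing_power_w(traffic):.2f} W of interface power")
```

Blocks with heavy cross-boundary traffic, like the hypothetical AI accelerator here, are the ones most likely to stay on the leading-edge die, which is exactly the judgment call the first bullet describes.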

Conclusion
The idea of standardizing die-to-die interfaces is catching on quickly, but the path to get there will take time, effort, and a lot of collaboration among companies that rarely talk with each other. Building a vehicle takes one determined carmaker. Building a vehicle with chiplets requires an entire ecosystem of developers, foundries, OSATs, and material and equipment suppliers working together.

Automotive OEMs are experts at putting systems together and at finding innovative ways to cut costs. But it remains to be seen how quickly and effectively they can build and leverage an ecosystem of interoperable chiplets to shrink design cycles, improve customization, and adapt to a world in which leading-edge technology may be outdated by the time it is fully designed, tested, and available to consumers.

— Ann Mutschler contributed to this report.

Related Reading
Automotive Relationships Shifting With Chiplets
As the automotive ecosystem balances the best approaches for designing in increasingly advanced features, how companies interact is still evolving.

The post Why Chiplets Are So Critical In Automotive appeared first on Semiconductor Engineering.
