Blog Review: Aug. 21

Cadence’s Reela Samuel explores the critical role of PCIe 6.0 equalization in maintaining signal integrity and solutions to mitigate verification challenges, such as creating checkers to verify all symbols of TS0, ensuring the correct functioning of scrambling, and monitoring phase and LTSSM state transitions.

Siemens’ John McMillan introduces an advanced packaging flow for Intel’s Embedded Multi-die Interconnect Bridge (EMIB) technology, including technical challenges, design methodologies, and the integration of EMIBs in system-level package designs.

Synopsys’ Dustin Todd checks out what’s next for the U.S. CHIPS and Science Act, including the establishment of the National Semiconductor Technology Center and the allocation of $13 billion for research and development efforts.

Keysight’s Roberto Piacentini Filho explores the challenges of managing the large design files and massive volumes of data involved in a modern chip design project, which can take up as much as a terabyte of disk space and involve hundreds of thousands of files.

Arm’s Sandeep Mistry shows how ML models developed for mobile computer vision applications and requiring tens to hundreds of millions of multiply-accumulate (MACs) operations per inference can be deployed to a modern microcontroller.

Ansys’ Aliyah Mallak explores an effort to manufacture biotech products in microgravity and how simulation helps ensure payloads containing delicate, temperature-sensitive spore samples and bioreactors make it safely to the International Space Station or low Earth orbit.

Micron Technology’s Amit Srivastava, ULVAC’s Brian Coppa, and SEMI’s Mark da Silva suggest tackling corporate sustainability goals with a bottom-up approach that leverages various sensing technologies at the cleanroom, sub-fab, and facilities levels, for both greenfield and brownfield device-making facilities, to enable predictive analytics.

And don’t miss the blogs featured in the latest Manufacturing, Packaging & Materials newsletter:

Amkor’s JeongMin Ju shows how to prevent critical failures in copper RDLs caused by overcurrent-induced fusing.

Synopsys’ Al Blais discusses curvilinear checking and fracture requirements for the MULTIGON era.

Lam Research’s Dempsey Deng compares the parasitic capacitance of a 6F2 honeycomb DRAM device to a 4F2 VCAT DRAM structure.

Brewer Science’s Jessica Albright covers debonding methods and thermal, topography, adhesion, and thickness variation considerations.

SEMI’s John Cooney reviews a fireside chat between the President of SEMI Americas and the U.S. Under Secretary of State for Economic Growth, Energy, and the Environment on securing supply chains.

The post Blog Review: Aug. 21 appeared first on Semiconductor Engineering.

AI/ML’s Role In Design And Test Expands

The role of AI and ML in test keeps growing, providing significant time and money savings that often exceed initial expectations. But it doesn’t work in all cases, sometimes even disrupting well-tested process flows with questionable return on investment.

One of the big attractions of AI is its ability to apply analytics to data sets too large for humans to handle effectively. In the critical design-to-test realm, AI can address problems such as tool incompatibilities between the design set-up, simulation, and ATE test program, which typically slow debugging and development efforts. Indeed, some of the most time-consuming and costly aspects of design-to-test arise from incompatibilities between tools.

“During device bring-up and debug, complex software/hardware interactions can expose the need for domain knowledge from multiple teams or stakeholders, who may not be familiar with each other’s tools,” said Richard Fanning, lead software engineer at Teradyne. “Any time spent doing conversions or debugging differences in these set-ups is time wasted. Our toolset targets this exact problem by allowing all set-ups to use the same set of source files so everyone can be sure they are running the same thing.”

ML/AI can help keep design teams on track, as well. “As we drive down this technology curve, the analytics and the compute infrastructure that we have to bring to bear becomes increasingly more complex and you want to be able to make the right decision with a minimal amount of overkill,” said Ken Butler, senior director of business development in the ACS data analytics platform group at Advantest. “In some cases, we are customizing the test solution on a die-by-die type of basis.”

But despite the hype, not all tools work well in every circumstance. “AI has some great capabilities, but it’s really just a tool,” said Ron Press, senior director of technology enablement at Siemens Digital Industries Software, in a recent presentation at a MEPTEC event. “We still need engineering innovation. So sometimes people write about how AI is going to take away everybody’s job. I don’t see that at all. We have more complex designs and scaling in our designs. We need to get the same work done even faster by using AI as a tool to get us there.”

Speeding design to characterization to first silicon
In the face of ever-shrinking process windows and the lowest allowable defectivity rates, chipmakers are continually improving design-to-test processes to ensure maximum efficiency during device bring-up and into high-volume manufacturing. “Analytics in test operations is not a new thing. This industry has a history of analyzing test data and making product decisions for more than 30 years,” said Advantest’s Butler. “What is different now is that we’re moving to increasingly smaller geometries, advanced packaging technologies and chiplet-based designs. And that’s driving us to change the nature of the type of analytics that we do, both in terms of the software and the hardware infrastructure. But from a production test viewpoint, we’re still kind of in the early days of our journey with AI and test.”

Nonetheless, early adopters are building out the infrastructure needed for in-line compute and AI/ML modeling to support real-time inferencing in test cells. And because no one company has all the expertise needed in-house, partnerships and libraries of applications are being developed with tool-to-tool compatibility in mind.

“Protocol libraries provide out-of-the-box solutions for communicating common protocols. This reduces the development and debug effort for device communication,” said Teradyne’s Fanning. “We have seen situations where a test engineer has been tasked with talking to a new protocol interface, and saved significant time using this feature.”

In fact, data compatibility is a consistent theme, from design all the way through to the latest developments in ATE hardware and software. “Using the same test sequences between characterization and production has become key as device complexity has increased exponentially,” explained Teradyne’s Fanning. “Partnerships with EDA tool and IP vendors are also key. We have worked extensively with industry leaders to ensure that the libraries and test files they output are formats our system can utilize directly. These tools also have device knowledge that our toolset does not. This is why the remote connect feature is key, because our partners can provide context-specific tools that are powerful during production debug. Being able to use these tools in real time, without having to reproduce a setup or use case in a different environment, has been a game changer.”

Serial scan test
But if it seems as if all the configuration changes are happening on the test side, it’s important to take stock of substantial changes in the approach to multi-core design for test.

Tradeoffs during the iterative process of design for test (DFT) have become so substantial in the case of multi-core products that a new approach has become necessary.

“If we look at the way a design is typically put together today, you have multiple cores that are going to be produced at different times,” said Siemens’ Press. “You need to have an idea of how many I/O pins you need to get your scan channels, the deep serial memory from the tester that’s going to be feeding through your I/O pins to this core. So I have a bunch of variables I need to trade off. I have the number of pins going to the core, the pattern size, and the complexity of the core. Then I’ll try to figure out what’s the best combination of cores to test together in what is called hierarchical DFT. But as these designs get more complex, with upwards of 2,500 cores, that’s a lot of tradeoffs to figure out.”

Press noted that applying AI within the same architecture can provide 20% to 30% higher efficiency, but an improved methodology based on packetized scan test (see figure 1) actually makes more sense.


Fig. 1: Advantages to the serial scan network (SSN) approach. Source: Siemens

“Instead of having tester channels feeding into the scan channels that go to each core, you have a packetized bus and packets of data that feed through all the cores. Then you instruct the cores when their packet information is going to be available. By doing this, you don’t have as many variables you need to trade off,” he said. At the core level, each core can be optimized for any number of scan channels and patterns, and the I/O pin count is no longer a variable in the calculation. “Then, when you put it into this final chip, it delivers from the packets the amount of data you need for that core, and that can work with any size serial bus, in what is called a serial scan network (SSN).”
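The packetized idea can be sketched in a few lines of Python. This is a toy illustration of the concept only, not the SSN specification; the slot widths, packet size, and assignment scheme are invented. Each core declares how many payload bits it consumes per packet cycle, and the network assigns it a fixed slot in the packet stream, independent of the chip-level I/O pin count.

```python
# Toy sketch of packetized scan delivery: each core gets a repeating bit slot
# in a packet stream carried over a fixed-width serial bus. Illustrative only.

from dataclasses import dataclass
from math import ceil

@dataclass
class Core:
    name: str
    scan_channels: int  # payload bits this core consumes per packet cycle

def assign_slots(cores, bus_width_bits=32):
    """Give each core a (start, end) bit slot; return slots and packet count."""
    slots, offset = {}, 0
    for core in cores:
        slots[core.name] = (offset, offset + core.scan_channels)
        offset += core.scan_channels
    return slots, ceil(offset / bus_width_bits)

cores = [Core("cpu", 12), Core("gpu", 20), Core("dsp", 6)]
slots, packets = assign_slots(cores)
print(slots)    # {'cpu': (0, 12), 'gpu': (12, 32), 'dsp': (32, 38)}
print(packets)  # 2 bus packets deliver one payload slice to every core
```

Because the slot assignment, rather than the chip’s pin count, determines how much data each core receives, each core’s scan configuration can be chosen independently, which is the property Press highlights.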

Some of the results reported by Siemens EDA customers (see figure 2) highlight both supervised and unsupervised machine learning implementation for improvements in diagnosis resolution and failure analysis. DFT productivity was boosted by 5 to 10X using the serial scan network methodology.


Fig. 2: Realized benefits using machine learning and the serial scan network approach. Source: Siemens

What slows down AI implementation in HVM?
In the transition from design to testing of a device, the application of machine learning algorithms can enable a number of advantages, from better pairing of chiplet performance for use in an advanced package to test time reduction. For example, only a subset of high-performing devices may require burn-in.

“You can identify scratches on wafers, and then bin out the dies surrounding those scratches automatically within wafer sort,” said Michael Schuldenfrei, fellow at NI/Emerson Test & Measurement. “So AI and ML all sounds like a really great idea, and there are many applications where it makes sense to use AI. The big question is, why isn’t it really happening frequently and at-scale? The answer to that goes into the complexity of building and deploying these solutions.”

Schuldenfrei summarized four key steps in ML’s lifecycle, each with its own challenges. In the first phase, training, engineering teams use data to understand a particular issue and then build a model that can predict an outcome associated with that issue. Once the model is validated and the team wants to deploy it in the production environment, it must be integrated with the existing equipment, such as a tester or manufacturing execution system (MES). Because models mature and evolve over time, the data going into the model must be validated frequently, and the model checked to confirm it is functioning as expected. Finally, models must adapt, requiring redeployment, in a continuous cycle of learning, acting, validating, and adapting.
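A compact, self-contained sketch of those four phases is shown below. The data, model, and drift threshold are synthetic; a real deployment would integrate with the tester or MES rather than print to the console.

```python
# Sketch of the ML lifecycle described above: train -> deploy -> monitor -> adapt.

import numpy as np

rng = np.random.default_rng(1)

def make_lot(shift=0.0, n=500):
    """Toy lot: final-test value y predicted from an earlier-insertion value x."""
    x = rng.normal(0.0, 1.0, n)
    y = 2.0 * x + shift + rng.normal(0.0, 0.1, n)
    return x, y

x, y = make_lot()
coeffs = np.polyfit(x, y, 1)                 # 1. train on historical data...
assert abs(coeffs[0] - 2.0) < 0.1            #    ...and validate before deploying

for lot, shift in enumerate([0.0, 0.05, 0.4]):
    x, y = make_lot(shift)                   # 2./3. deployed model monitors lots
    residuals = y - np.polyval(coeffs, x)
    print(f"lot {lot}: mean residual {residuals.mean():+.3f}")
    if abs(residuals.mean()) > 0.2:          # 4. drift detected -> adapt:
        coeffs = np.polyfit(x, y, 1)         #    retrain, re-validate, redeploy
        print(f"lot {lot}: model retrained")
```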

“That eats up a lot of time for the data scientists who are charged with deploying all these new AI-based solutions in their organizations. Time is also wasted in the beginning when they are trying to access the right data, organizing it, connecting it all together, making sense of it, and extracting features from it that actually make sense,” said Schuldenfrei.

Further difficulties are introduced in a distributed semiconductor manufacturing environment in which many different test houses are situated in various locations around the globe. “By the time you finish implementing the ML solution, your model is stale and your product is probably no longer bleeding edge so it has lost its actionability, when the model needs to make a decision that actually impacts either the binning or the processing of that particular device,” said Schuldenfrei. “So actually deploying ML-based solutions in a production environment with high-volume semiconductor test is very far from trivial.”

He cited a 2015 Google paper showing that the ML code development part of the process is both the smallest and easiest part of the whole exercise [1], whereas the various aspects of building infrastructure, data collection, feature extraction, data verification, and managing model deployments are the most challenging parts.

Changes from design through test ripple through the ecosystem. “People who work in EDA put lots of effort into design rule checking (DRC), meaning we’re checking that the work we’ve done and the design structure are safe to move forward because we didn’t mess anything up in the process,” said Siemens’ Press. “That’s really important with AI — what we call verifiability. If we have some type of AI running and giving us a result, we have to make sure that result is safe. This really affects the people doing the design, the DFT group and the people in test engineering that have to take these patterns and apply them.”

There are a multitude of ML-based applications for improving test operations. Advantest’s Butler highlighted some of the apps customers are pursuing most often, including search time reduction, shift left testing, test time reduction, and chiplet pairing (see figure 3).

“For minimum voltage, maximum frequency, or trim tests, you tend to set a lower limit and an upper limit for your search, and then you’re going to search across there in order to be able to find your minimum voltage for this particular device,” he said. “Those limits are set based on process split, and they may be fairly wide. But if you have analytics that you can bring to bear, then the AI- or ML-type techniques can basically tell you where this die lies on the process spectrum. Perhaps it was fed forward from an earlier insertion, and perhaps you combine it with what you’re doing at the current insertion. That kind of inference can help you narrow the search limits and speed up that test. A lot of people are very interested in this application, and some folks are doing it in production to reduce search time for test time-intensive tests.”


Fig. 3: Opportunities for real-time and/or post-test improvements to pair or bin devices, improve yield, throughput, reliability or cost using the ACS platform. Source: Advantest
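The search-time reduction Butler describes can be illustrated with a hedged sketch. The voltages, step size, and pass/fail behavior below are synthetic stand-ins, not an ATE API; the point is that narrowed starting limits cut the number of test executions a binary search needs.

```python
# Illustrative Vmin search: compare test counts with wide, process-split-based
# limits versus limits narrowed by a feed-forward ML prediction.

def vmin_search(passes_at, lo, hi, step=0.005):
    """Binary-search the lowest passing voltage; return (vmin, tests_run)."""
    tests = 0
    while hi - lo > step:
        mid = (lo + hi) / 2
        tests += 1
        if passes_at(mid):
            hi = mid        # device passed: true Vmin is at or below mid
        else:
            lo = mid        # device failed: true Vmin is above mid
    return hi, tests

true_vmin = 0.714                       # hypothetical device
passes_at = lambda v: v >= true_vmin    # stand-in for running the real test

wide = vmin_search(passes_at, 0.60, 0.90)     # limits set by process split
narrow = vmin_search(passes_at, 0.70, 0.75)   # limits from a die-level prediction
print(f"wide limits:   Vmin={wide[0]:.3f} in {wide[1]} tests")
print(f"narrow limits: Vmin={narrow[0]:.3f} in {narrow[1]} tests")
```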

“The idea behind shift left is perhaps I have a very expensive test insertion downstream or a high package cost,” Butler said. “If my yield is not where I want it to be, then I can use analytics at earlier insertions to try to predict which devices are likely to fail at the later insertion, and then downgrade or scrap those die in order to optimize downstream test insertions, raising the yield and lowering overall cost. Test time reduction is very simply the addition or removal of test content, skipping tests to reduce cost. Or you might want to add test content for yield improvement.”

“If I have a multi-tiered device, and it’s not going to pass bin 1 criteria, but maybe it’s bin 2 if I add some additional content, then people may be looking at analytics to try to make those decisions. Finally, two things go together in my mind: this idea of chiplet designs and smart pairing. So the classic example is a processor die with a stack of high bandwidth memory on top of it. Perhaps I’m interested in high performance in some applications and low power in others. I want to be able to match the content and classify die as they’re coming through the test operation, and then downstream do pick-and-place and put them together in such a way that I maximize the yield for multiple streams of data. Similar kinds of things apply for achieving a low power footprint and carbon footprint.”
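A toy version of the smart-pairing idea follows; all numbers are invented. Sorting both populations by measured performance and pairing by rank is the simplest greedy form of the matching Butler describes, since an assembled part is limited by its weaker member.

```python
# Toy "smart pairing": match processor die to HBM stacks by performance rank
# so a fast die is not capped by a slow memory stack (and vice versa).

import random

random.seed(0)
dies = [random.gauss(3.0, 0.2) for _ in range(8)]   # measured Fmax, GHz
hbms = [random.gauss(6.4, 0.4) for _ in range(8)]   # measured pin speed, Gbps

pairs = zip(sorted(dies, reverse=True), sorted(hbms, reverse=True))
for die_f, hbm_f in pairs:
    print(f"die {die_f:.2f} GHz  <->  HBM {hbm_f:.2f} Gbps")
```

A production flow would add bin criteria, power classes, and multi-objective constraints, but the rank-matching intuition is the same.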

Generative AI
The question that inevitably comes up when discussing the role of AI in semiconductors is whether or not large language models like ChatGPT can prove useful to engineers working in fabs. Early work shows some promise.

“For example, you can ask the system to build an outlier detection model for you that looks for parts that are five sigma away from the center line, saying ‘Please create the script for me,’ and the system will create the script. These are the kinds of automated, generative AI-based solutions that we’re already playing with,” said Schuldenfrei. “But from everything I’ve seen so far, there is still quite a lot of work to be done to get these systems to provide outputs with high enough quality. At the moment, the amount of human interaction that is needed afterward to fix problems with the algorithms or models that generative AI is producing is still quite significant.”
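The kind of script such a request might yield could look like the following. This is a hypothetical illustration of generative-AI output, not output from any vendor’s tool, and the data is synthetic.

```python
# Hypothetical generated script for: "flag parts more than five sigma from
# the center line."

import numpy as np

def five_sigma_outliers(measurements, k=5.0):
    """Return indices of parts whose value is more than k sigma from the mean."""
    values = np.asarray(measurements, dtype=float)
    center, sigma = values.mean(), values.std()
    return np.flatnonzero(np.abs(values - center) > k * sigma)

rng = np.random.default_rng(42)
data = rng.normal(1.20, 0.01, 5_000)     # synthetic parametric test results
data[[100, 2048]] = [1.35, 0.99]         # inject two gross outliers
print(five_sigma_outliers(data))         # -> [ 100 2048]
```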

A lingering question is how to access the test programs needed to train models that will generate new test programs, when everyone is protecting important test IP. “Most people value their test IP and don’t necessarily want to set up guardrails around the training and utilization processes,” Butler said. “So finding a way to accelerate the overall process of developing test programs while protecting IP is the challenge. It’s clear this kind of technology is going to be brought to bear, just like we already see in the software development process.”

Failure analysis
Failure analysis is typically a costly and time-consuming endeavor for fabs because it requires a trip back in time to gather the wafer processing, assembly, and packaging data specific to a particular failed device, which is returned under a return material authorization (RMA). Physical failure analysis is performed in an FA lab, using a variety of tools to trace the root cause of the failure.

While scan diagnostic data has been used for decades, a newer approach involves pairing a digital twin with scan diagnostics data to find the root cause of failures.

“Within test, we have a digital twin that does root cause deconvolution based on scan failure diagnosis. So instead of having to look at the physical device and spend time trying to figure out the root cause, since we have scan, we have millions and millions of virtual sample points,” said Siemens’ Press. “We can reverse-engineer what we did to create the patterns and figure out where the mis-compare happened within the scan cells deep within the design. Using YieldInsight and unsupervised machine learning with training on a bunch of data, we can very quickly pinpoint the fail locations. This allows us to run thousands, or tens of thousands, of fail diagnoses in a short period of time, giving us the opportunity to identify the systematic yield limiters.”
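A minimal sketch of the unsupervised step is shown below. The feature encoding, data, and clustering choice are invented for illustration; the YieldInsight flow Press describes is far richer. The idea is that die failing repeatedly in the same design region separate out from randomly failing die as a systematic cluster.

```python
# Cluster per-die scan-diagnosis fail signatures to surface systematic yield
# limiters. Rows are failing die; columns are suspect-location counts per
# design region (a made-up encoding for illustration).

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
random_fails = rng.poisson(1.0, size=(40, 6))   # scattered random defects
systematic = rng.poisson(1.0, size=(15, 6))
systematic[:, 2] += 12                          # one region fails repeatedly
signatures = np.vstack([random_fails, systematic])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(signatures)
for cluster in range(2):
    members = signatures[labels == cluster]
    print(f"cluster {cluster}: {len(members)} die, "
          f"mean fails per region {members.mean(axis=0).round(1)}")
```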

Yet another approach that is gaining steam is using on-die monitors to access specific performance information in lieu of physical FA. “What is needed is deep data from inside the package to monitor performance and reliability continuously, which is what we provide,” said Alex Burlak, vice president of test and analytics at proteanTecs. “For example, if the suspected failure is from the chiplet interconnect, we can help the analysis using deep data coming from on-chip agents instead of taking the device out of context and into the lab (where you may or may not be able to reproduce the problem). Even more, the ability to send back data and not the device can in many cases pinpoint the problem, saving the expensive RMA and failure analysis procedure.”

Conclusion
The enthusiasm around AI and machine learning is being met by robust infrastructure changes in the ATE community to accommodate the need for real-time inferencing on test data and test optimization for higher yield, higher throughput, and chiplet classification for multi-chiplet packages. For multi-core designs, packetized test, commercialized as the SSN methodology, provides a more flexible approach, allowing the number of scan chains, pattern set, and bus width to be optimized for each core in a device.

The number of testing applications that can benefit from AI continues to rise, including test time reduction, Vmin/Fmax search reduction, shift left, smart pairing of chiplets, and overall power reduction. New developments like identical source files for all setups across design, characterization, and test help speed the critical debug and development stage for new products.

Reference

  1. D. Sculley, et al., “Hidden Technical Debt in Machine Learning Systems,” NeurIPS 2015. https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf

The post AI/ML’s Role In Design And Test Expands appeared first on Semiconductor Engineering.

Ensure Reliability In Automotive ICs By Reducing Thermal Effects

By Lee Wang

In the relentless pursuit of performance and miniaturization, the semiconductor industry has increasingly turned to 3D integrated circuits (3D-ICs) as a cutting-edge solution. Stacking dies in a 3D assembly offers numerous benefits, including enhanced performance, reduced power consumption, and more efficient use of space. However, this advanced technology also introduces significant thermal dissipation challenges that can impact the electrical behavior, reliability, performance, and lifespan of the chips (figure 1). For automotive applications, where safety and reliability are paramount, managing these thermal effects is of utmost importance.

Fig. 1: Illustration of a 3D-IC with heat dissipation.

3D-ICs have become particularly attractive for safety-critical devices like automotive sensors. Advanced driver-assistance systems (ADAS) and autonomous vehicles (AVs) rely on these compact, high-performance chips to process vast amounts of sensor data in real time. Effective thermal management in these devices is a top priority to ensure that they function reliably under various operating conditions.

The thermal challenges of 3D-ICs in automotive applications

The stacked configuration of 3D-ICs inherently leads to complex thermal dynamics. In traditional 2D designs, heat dissipation occurs across a single plane, making it relatively straightforward to manage. However, in 3D-ICs, multiple active layers generate heat, creating significant thermal gradients and hotspots. These thermal issues can adversely affect device performance and reliability, which is particularly critical in automotive applications where components must operate reliably under extreme temperatures and harsh conditions.

These thermal effects in automotive 3D-ICs can impact the electrical behavior of the circuits, causing timing errors, increased leakage currents, and potential device failure. Therefore, accurate and comprehensive thermal analysis throughout the design flow is essential to ensure the reliability and performance of automotive ICs.

The importance of early and continuous thermal analysis

Traditionally, thermal analysis has been performed at the package and system levels, often as a separate process from IC design. However, with the advent of 3D-ICs, this approach is no longer sufficient.

To address the thermal challenges of 3D-ICs for automotive applications, it is crucial to incorporate die-level thermal analysis early in the design process and continue it throughout the design flow (figure 2). Early-stage thermal analysis can help identify potential hotspots and thermal bottlenecks before they become critical issues, enabling designers to make informed decisions about chiplet placement, power distribution, and cooling strategies. These early decisions reduce the risks of thermal-induced failures, improving the reliability of 3D automotive ICs.

Fig. 2: Die-level detailed thermal analysis using accurate package and boundary conditions should be fully integrated into the ASIC design flow to allow for fast thermal exploration.

Early package design, floorplanning and thermal feasibility analysis

During the initial package design and floorplanning stage, designers can use high-level power estimates and simplified models to perform thermal feasibility studies. These early analyses help identify configurations that are likely to cause thermal problems, allowing designers to rule out problematic designs before investing significant time and resources in detailed implementation.

Fig. 3: Thermal analysis as part of the package design, floorplanning and implementation flows.

For example, thermal analysis can reveal issues such as overlapping heat sources in stacked dies or insufficient cooling paths. By identifying these problems early, designers can explore alternative floorplans and adjust power distribution to mitigate thermal risks. This proactive approach reduces the likelihood of encountering critical thermal issues late in the design process, thereby shortening the overall design cycle.
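As a hedged illustration of that kind of feasibility check (grid size, power values, and budget are all invented), a coarse screen can simply superpose per-die power maps and flag floorplan tiles whose combined power exceeds a per-tile budget, then compare candidate floorplans:

```python
# Coarse early-feasibility screen for stacked-die hotspots: overlay per-die
# power maps on a shared grid of floorplan tiles. All values are invented.

import numpy as np

GRID = (4, 4)                             # coarse floorplan tiles
die0 = np.zeros(GRID); die0[1, 1] = 1.8   # watts: compute cluster on die 0
die1 = np.zeros(GRID); die1[1, 1] = 1.5   # watts: hot block directly above it
die1_alt = np.zeros(GRID); die1_alt[2, 3] = 1.5   # alternative floorplan

BUDGET_W = 2.0                            # assumed max combined power per tile

for name, top in [("stacked hotspot", die1), ("offset floorplan", die1_alt)]:
    combined = die0 + top
    hot = np.argwhere(combined > BUDGET_W)
    print(f"{name}: tiles over budget -> {hot.tolist()}")
# stacked hotspot: tiles over budget -> [[1, 1]]
# offset floorplan: tiles over budget -> []
```

A real tool solves the heat-conduction problem rather than just summing power, but a screen like this captures the overlapping-heat-source risk early, which is the point of a feasibility pass.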

Iterative thermal analysis throughout design refinement

As the design progresses and more detailed information becomes available, thermal analysis should be performed iteratively to refine the thermal model and verify that the design remains within acceptable thermal limits. At each stage of design refinement, additional details such as power maps, layout geometries and their material properties can be incorporated into the thermal model to improve accuracy.

This iterative approach lets designers continuously monitor and address thermal issues, ensuring that the design evolves in a thermally aware manner. By integrating thermal analysis with other design verification tasks, such as timing and power analysis, designers can achieve a holistic view of the design’s performance and reliability.

A robust thermal analysis tool should support various stages of the design process, providing value from initial concept to final signoff:

  1. Early design planning: At the conceptual stage, designers can apply high-level power estimates to explore the thermal impact of different design options. This includes decisions related to 3D partitioning, die assembly, block and TSV floorplan, interface layer design, and package selection. By identifying potential thermal issues early, designers can make informed decisions that avoid costly redesigns later.
  2. Detailed design and implementation: As designs become more detailed, thermal analysis should be used to verify that the design stays within its thermal budget. This involves analyzing the maturing package and die layout representations to account for their impact on thermally sensitive electrical circuits. Fine-grained power maps are crucial at this stage to capture hotspot effects accurately.
  3. Design signoff: Before finalizing the design, it is essential to perform comprehensive thermal verification. This ensures that the design meets all thermal constraints and reliability requirements. Automated constraints checking and detailed reporting can expedite this process, providing designers with clear insights into any remaining thermal issues.
  4. Connection to package-system analysis: Models from IC-level thermal analysis can be used in thermal analysis of the package and system. The integration lets designers build a streamlined flow through the entire development process of a 3D electronic product.

Tools and techniques for accurate thermal analysis

To effectively manage thermal challenges in automotive ICs, designers need advanced tools and techniques that can provide accurate and fast thermal analysis throughout the design flow. Modern thermal analysis tools are equipped with capabilities to handle the complexity of 3D-IC designs, from early feasibility studies to final signoff.

High-fidelity thermal models

Accurate thermal analysis requires high-fidelity thermal models that capture the intricate details of the 3D-IC assembly. These models should account for non-uniform material properties, fine-grained power distributions, and the thermal impact of through-silicon vias (TSVs) and other 3D features. Advanced tools can generate detailed thermal models based on the actual design geometries, providing a realistic representation of heat flow and temperature distribution.

For instance, a tool like Calibre 3DThermal embeds an optimized custom 3D solver from Simcenter Flotherm to perform precise thermal analysis down to the nanometer scale. By leveraging detailed layer information and accurate boundary conditions, such tools can produce reliable thermal models that reflect the true thermal behavior of the design.

Automation and results viewing

Automation is a key feature of modern thermal analysis tools, enabling designers, including those who are not experts in thermal engineering, to perform complex analyses. Key automation features include:

  1. Optimized gridding: Automatically applying finer grids in critical areas of the model to ensure high resolution where needed, while using coarser grids elsewhere for efficiency.
  2. Time step automation: In transient analysis, smaller time steps can be automatically generated during power transitions to capture key impacts accurately.
  3. Equivalent thermal properties: Automatically reducing model complexity while maintaining accuracy by applying different bin sizes for critical (hotspot) vs non-critical regions when generating equivalent thermal properties.
  4. Power map compression: Using adaptive bin sizes to compress very large power maps to improve tool performance (a minimal sketch of this idea follows figure 4).
  5. Automated reporting: Generating summary reports that highlight key results for easy review and decision-making (figure 4).

Fig. 4: Ways to view thermal analysis results.
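The power-map compression in item 4 above can be sketched as adaptive binning. The block size and variance threshold here are invented, and production tools are far more sophisticated; the principle is to keep fine bins only where local variation suggests a hotspot.

```python
# Sketch of adaptive power-map compression: coarsen blocks whose power is
# nearly uniform into one bin; keep full resolution where variation is high.

import numpy as np

def compress(power_map, block=4, rel_var=0.05):
    """Yield (row, col, size, value) bins covering a square power map."""
    n = power_map.shape[0]
    for r in range(0, n, block):
        for c in range(0, n, block):
            tile = power_map[r:r+block, c:c+block]
            if tile.std() <= rel_var * (tile.mean() + 1e-12):
                yield (r, c, block, float(tile.mean()))        # one coarse bin
            else:
                for dr in range(block):                        # keep fine bins
                    for dc in range(block):
                        yield (r + dr, c + dc, 1, float(tile[dr, dc]))

power = np.full((8, 8), 0.01)     # quiet background, W per tile
power[5, 6] = 0.40                # a single hotspot
bins = list(compress(power))
print(f"{power.size} tiles -> {len(bins)} bins")   # 64 tiles -> 19 bins
```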

Automated thermal analysis tools can also integrate seamlessly with other design verification and implementation tools, providing a unified environment for managing thermal, electrical, and mechanical constraints. This integration ensures that thermal considerations are consistently addressed throughout the design flow, from initial feasibility analysis to final tape-out and even connecting with package-level analysis tools.

Real-world application

The practical benefits of integrated thermal analysis solutions are evident in real-world applications. For instance, a leading research organization, CEA, utilized an advanced thermal analysis tool from Siemens EDA to study the thermal performance of their 3DNoC demonstrator. The high-fidelity thermal model they developed showed a worst-case difference of just 3.75% and an average difference within 2% between simulation and measured data, demonstrating the accuracy and reliability of the tool (figure 5).

Fig. 5: Correlation of simulation versus measured results.

The path forward for automotive 3D-IC thermal management

As the automotive industry continues to embrace advanced technologies, the importance of accurate thermal analysis throughout the design flow of 3D-ICs cannot be overstated. By incorporating thermal analysis early in the design process and iteratively refining thermal models, designers can mitigate thermal risks, reduce design time, and enhance chip reliability.

Advanced thermal analysis tools that integrate seamlessly with the broader design environment are essential for achieving these goals. These tools enable designers to perform high-fidelity thermal analysis, automate complex tasks, and ensure that thermal considerations are addressed consistently from package design through implementation to signoff.

By embracing these practices, designers can unlock the full potential of 3D-IC technology, delivering innovative, high-performance devices that meet the demands of today’s increasingly complex automotive applications.

For more information about die-level 3D-IC thermal analysis, read Conquer 3DIC thermal impacts with Calibre 3DThermal.

The post Ensure Reliability In Automotive ICs By Reducing Thermal Effects appeared first on Semiconductor Engineering.

Chip Industry Week in Review

Okinawa Institute of Science and Technology proposed a new EUV litho technology using only four reflective mirrors and a new method of illumination optics that it claims will use 1/10 the power and cost half as much as existing EUV technology from ASML.

Applied Materials may not receive expected U.S. funding to build a $4 billion research facility in Sunnyvale, CA, due to internal government disagreements over how to fund chip R&D, according to Bloomberg.

SEMI published a position paper this week cautioning the European Union against imposing additional export controls, encouraging regulators to allow companies to be “as free as possible in their investment decisions to avoid losing their agility and relevance across global markets.” SEMI’s recommendations on outbound investments are in response to the European Economic Security Strategy and emphasize the need for a transparent and predictable regulatory framework.

The U.S. may restrict China’s access to HBM chips and the equipment needed to make them, reports Bloomberg. Today those chips are manufactured by two Korean-based companies, Samsung and SK hynix, but U.S.-based Micron expects to begin shipping 12-high stacks of HBM3E in 2025, and is currently working on HBM4.

Synopsys executive chair and founder Dr. Aart de Geus was named the winner of the Semiconductor Industry Association’s Robert N. Noyce Award. De Geus was selected due to his contributions to EDA technology over a career spanning more than four decades.

The top three foundries plan to implement high-NA EUV lithography as early as 2025 for the 18 angstrom generation, but whether single-exposure high-NA (0.55) EUV replaces double patterning with standard EUV (NA = 0.33) depends on whether it provides better results at a reasonable cost per wafer.

Quick links to more news:

Global
In-Depth
Market Reports and Earnings
Education and Training
Security
Product News
Research
Events and Further Reading


Global

Belgium-based Imec released part 2 of its chiplets series, addressing testing strategies and standardization efforts, as well as guidelines and research “towards efficient ESD protection strategies for advanced 3D systems-on-chip.”

Also in Belgium, BelGan, maker of GaN chips, filed for bankruptcy according to the Brussels Times.

TSMC‘s Dresden, Germany, plant will break ground this month.

The UK will dole out more than £100 million (~US $128 million) in funding to develop five new quantum research hubs in Glasgow, Edinburgh, Birmingham, Oxford, and London.

MassPhoton is opening Hong Kong‘s first ultra-high vacuum GaN epitaxial wafer pilot line and will establish a GaN research center.

Infineon completed the sale of its manufacturing sites in the Philippines and South Korea to ASE.

Israel-based RAAAM Memory Technologies received a €5.25 million grant from the European Innovation Council (EIC) to support the development and commercialization of its innovative memory solutions. This funding will enable RAAAM to advance its research in high-performance and energy-efficient memory technologies, accelerating their integration into various applications and markets.


In-Depth

Semiconductor Engineering published its Automotive, Security and Pervasive Computing newsletter this week.


Market Reports and Earnings

The semiconductor equipment industry is on a positive trajectory in 2024, with moderate revenue growth observed in Q2 after a subdued Q1, according to a new report from Yole Group. Wafer Fab Equipment revenue is projected to grow by 1.3% year-on-year, despite a 12% drop in Q1. Test equipment lead times are normalizing, improving order conditions. Key areas driving growth include memory and logic capital expenditures and high-bandwidth memory demand.

Worldwide silicon wafer shipments increased by 7% in Q2 2024, according to SEMI‘s latest report. This growth is attributed to robust demand from multiple semiconductor sectors, driven by advancements in AI, 5G, and automotive technologies.

The RF GaN market is projected to grow to US $2 billion by 2029, a 10% CAGR, according to Yole Group.

Counterpoint released their Q2 smartphone top 10 report.

Renesas completed their acquisition of EDA firm Altium, best known for its EDA platform and freeware CircuitMaker package.

It’s earnings season and here are recently released financials in the chip industry:

AMD, Advantest, Amkor, Ansys, Arteris, Arm, ASE, ASM, ASML, Cadence, IBM, Intel, Lam Research, Lattice, Nordson, NXP, Onsemi, Qualcomm, Rambus, Samsung, SK hynix, STMicro, Teradyne, TI, Tower, TSMC, UMC, and Western Digital.

Industry stock price impacts are here.


Education and Training

Rochester Institute of Technology is leading a new pilot program to prepare community college students in areas such as cleanroom operations, new materials, simulation, and testing processes, with the intent of eventual transfer into RIT’s microelectronic engineering program.

Purdue University inked a deal with three research institutions (University of Piraeus, Technical University of Crete, and King’s College London) to develop joint research programs for semiconductors, AI, and other critical technology fields.

The European Chips Skills Academy formed the Educational Leaders Board to help bridge the talent gap in Europe’s microelectronics sector.  The Board includes representatives from universities, vocational training providers, educators and research institutions who collaborate on strategic initiatives to strengthen university networks and build academic expertise through ECSA training programs.


Security

The Cybersecurity and Infrastructure Security Agency (CISA) is encouraging Apple users to review and apply this week’s recent security updates.

Microsoft Azure experienced a nearly 10 hour DDoS attack this week, leading to global service disruption for many customers.  “While the initial trigger event was a Distributed Denial-of-Service (DDoS) attack, which activated our DDoS protection mechanisms, initial investigations suggest that an error in the implementation of our defenses amplified the impact of the attack rather than mitigating it,” stated Microsoft in a release.

NIST published:

  • “Recommendations For Increasing U.S. Participation and Leadership in Standards Development,” a report outlining cybersecurity recommendations and mitigation strategies.
  • Final guidance documents and software to help improve the “safety, security and trustworthiness of AI systems.”
  • Cloud Computing Forensic Reference Architecture guide.

Delta Air Lines plans to seek damages after incurring $500 million in lost revenue due to security company CrowdStrike‘s software update debacle. Shareholders are also angry.

Recent security research:

  • Physically Secure Logic Locking With Nanomagnet Logic (UT Dallas)
  • WBP: Training-time Backdoor Attacks through HW-based Weight Bit Poisoning (UCF)
  • S-Tune: SOT-MTJ Manufacturing Parameters Tuning for Secure Next Generation of Computing (U. of Arizona, UCF)
  • Diffie Hellman Picture Show: Key Exchange Stories from Commercial VoWiFi Deployments (CISPA, SBA Research, U. of Vienna)

Product News

Lam Research introduced a new version of its cryogenic etch technology designed to enhance the manufacturing of 3D NAND for AI applications. This technology allows for the precise etching of high aspect ratio features, crucial for creating 1,000-layer 3D NAND.


Fig.1: 3D NAND etch. Source: Lam Research

Alphawave Semi launched its Universal Chiplet Interconnect Express Die-to-Die IP. The subsystem offers 8 Tbps/mm bandwidth density and supports operation at 24 Gbps for D2D connectivity.

Infineon introduced a new MCU series for industrial and consumer motor controls, as well as power conversion system applications. The company also unveiled its new CoolGaN Drive product family of integrated single switches and half-bridges with integrated drivers.

Rambus released its DDR5 Client Clock Driver for next-gen, high-performance desktops and notebooks. The company’s memory interface chip portfolio also includes Gen1 to Gen4 RCDs, power management ICs, Serial Presence Detect hubs, and temperature sensors for leading-edge servers.

SK hynix introduced its new GDDR7 graphics DRAM. The product has an operating speed of 32Gbps, can process 1.5TB of data per second and has a 50% power efficiency improvement compared to the previous generation.

Intel launched its new Lunar Lake Ultra processors. The long-awaited chips will be included in more than 80 laptop designs and deliver more than 40 NPU TOPS (tera operations per second) and over 60 GPU TOPS, for more than 100 platform TOPS.

Brewer Science achieved recertification as a Certified B Corporation, reaffirming its commitment to sustainable and ethical business practices.

Panasonic adopted Siemens’ Teamcenter X cloud product lifecycle management solution, citing Teamcenter X’s Mendix low-code platform, improved operational efficiency, and flexibility as reasons for its choice.

Keysight validated its 5G NR FR1 1024-QAM demodulation test cases for the first time. The 5G NR radio access technology supports eMBB and was validated on the 3GPP TS 38.521-4 test specification.


Research

In a 47-page deep-dive report, the Center for Security and Emerging Technology delved into all of the scientific breakthroughs from 1980 to present that brought EUV lithography to commercialization, including lessons learned for the next emerging technologies.

Researchers at the Paul Scherrer Institute developed a high-performance X-ray tomography technique using burst ptychography, achieving a resolution of 4nm. This method allows for non-destructive imaging of integrated circuits, providing detailed views of nanostructures in materials like silicon and metals.

MIT signed a four-year agreement with the Novo Nordisk Foundation Quantum Computing Programme at University of Copenhagen, focused on accelerating quantum computing hardware research.

MIT’s Research Laboratory of Electronics (RLE) developed a mechanically flexible wafer-scale integrated photonics fabrication platform. This enables the creation of flexible photonic circuits that maintain high performance while being bendable and stretchable. It offers significant potential for integrating photonic circuits into various flexible substrate applications in wearable technology, medical devices, and flexible electronics.

The Naval Research Lab identified a new class of semiconductor nanocrystals with bright ground-state excitons, emphasizing an important advancement in optoelectronics.

Researchers from National University of Singapore developed a novel method, known as tension-driven CHARM3D, to fabricate 3D self-healing circuits, enabling the 3D printing of free-standing metallic structures without the need for support materials or external pressure.

Find more research in our Technical Papers library.


Events and Further Reading

Find upcoming chip industry events here, including:

Event Date Location
Atomic Layer Deposition (ALD 2024) Aug 4 – 7 Helsinki
Flash Memory Summit Aug 6 – 8 Santa Clara, CA
USENIX Security Symposium Aug 14 – 16 Philadelphia, PA
SPIE Optics + Photonics 2024 Aug 18 – 22 San Diego, CA
Cadence Cloud Tech Day Aug 20 San Jose, CA
Hot Chips 2024 Aug 25 – 27 Stanford University/Hybrid
Optica Online Industry Meeting: PIC Manufacturing, Packaging and Testing (imec) Aug 27 Online
SEMICON Taiwan Sep 4 – 6 Taipei
DVCON Taiwan Sep 10 – 11 Hsinchu
AI HW and Edge AI Summit Sep 9 – 12 San Jose, CA
GSA Executive Forum Sep 26 Menlo Park, CA
SPIE Photomask Technology + EUVL Sep 29 – Oct 3 Monterey, CA
Strategic Materials Conference: SMC 2024 Sep 30 – Oct 2 San Jose, CA
Find All Upcoming Events Here

Upcoming webinars are here, including topics such as quantum safe cryptography, analytics for high-volume manufacturing, and mastering EMC simulations for electronic design.

Find Semiconductor Engineering’s latest newsletters here:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials

 

The post Chip Industry Week in Review appeared first on Semiconductor Engineering.

RISC-V Heralds New Era Of Cooperation

RISC-V is paving the way for open source to become accepted within the hardware community, creating a level of industry collaboration never seen in the past, while revitalizing the connection between academia and industry.

The big question is whether this arrangement is just a placeholder while the industry re-learns how to develop processors, or whether this processor architecture is something very different. In either case, there is a clear and pressing need for more flexible processor architectures, and at least for now, RISC-V has filled a void.

“RISC-V was born out of academia and has had strong collaboration within universities from day one,” says Loren Hobbs, vice president of product and business development at Bluespec. “This collaboration continues today, with many of the most popular open-source RISC-V processors having come from universities. Organizations such as OpenHW Group and CHIPS Alliance serve a central and critical role in driving the collaboration, which is bi-directional between the academic community and industry.”

Collaboration of this type has not existed with the industrial community in the past. “We are learning from each other,” says Florian Wohlrab, CEO at OpenHW. “We are learning best practices for verification. At the same time, we are learning what things to avoid. It is growing where people say, ‘Yes, I really get benefit from sharing ideas.'”

The need for processor flexibility exists within industry as well as academia. “There is a need within the industry for diversification on the processor front,” says Neil Hand, director of marketing at Siemens EDA. “In the past, this led to a fragmented set of companies that couldn’t work together. They didn’t see the value of working together. But RISC-V has a cohesive central organization where anyone who wants to get into the processor space can collaborate. They don’t have to expose their secret sauce, but they get to benefit from each other. A rising tide lifts all boats, and that’s really the situation we’re at with RISC-V.”

Longevity
Whether the industry can build upon this success, or whether it fizzles out over time, remains to be seen. But at least for now, RISC-V’s momentum is growing. “We are at the beginning of a revolution in hardware design,” says OpenHW’s Wohlrab. “We saw the same thing for software when Linux came out 20 or so years ago. No one was really thinking about sharing software or collaboratively developing software. There were some small open-source ventures, but working together on a big project took a long time to develop. Now we are all sharing software, all co-working. But for hardware, we’re just at the beginning of this new concept, and a lot of people need to understand that we can do the same for hardware as we did for software.”

Underlying RISC-V’s success is widespread collaboration. “One of the pillars sustaining the success of RISC-V is customization that works with the ecosystem and leverages a well-defined process,” says Sergio Marchese, vice president of application engineering at SmartDV. “RISC-V vendors face the challenge of showing how their processor customization capabilities serve an application and demonstrating the complete process on real hardware. Without strategic partnerships, RISC-V vendors must walk a much more challenging, time-consuming, and resource-intensive road.”

That framework is what makes it unique. “RISC-V has formed this framework for collaboration, and it fixes everything,” says Siemens’ Hand. “Now, when a university has a really cool idea for memory tagging in a processor design, they don’t have to build the compilers, they don’t have to build the reference platform. They already exist. Maybe a compiler optimization startup has this great idea for handling code optimization. They don’t have to build the rest of the ecosystem. When a processor IP company has this great idea, they can become focused within this bigger picture. That’s the unique nature of it. It’s not just a processor specification.”

Historically, one of the problems associated with open-source hardware was quality, because finding bugs in silicon is expensive. OpenHW is an important piece of the puzzle. “Why should everyone reinvent the wheel by themselves?” asks Wohlrab. “Why can’t we get the basic building blocks, some basic chips, take some design from academia, which has reasonably good quality, and build on them, verify them together. We are verifying with different tools, making sure we get a high coverage, and then everyone can go off and use them in their own chips for mass production, for volume shipment.”

This benefits companies both large and small. “There are several processor vendors that have switched to RISC-V,” says Hand. “Synopsys has moved to RISC-V. Andes has moved to RISC-V. MIPS has moved to RISC-V. Why? Because they can leverage the whole ecosystem. The downside of it is commoditization, which as a customer is really beneficial because you can delay choosing a processor till later in the design flow. Your early decision is to use the Arm ecosystem or RISC-V, and then you can work through it. That creates an interesting set of dynamics. You can start to create new opportunities for companies that develop and deliver IP, because you can benchmark them, swap them in and out, and see which one works. On the flip side, it makes it awful from a lock-in perspective once you’re in that socket.”

Fragmentation
Of course, there will be some friction in the system. “In the early days of RISC-V there was nearly a 1:1 balance between contributors and consumers of the technology,” says Geir Eide, director, product management for Siemens EDA. “Today there are thousands of RISC-V consumers, but only a small percentage of those will be contributors. There is a risk that there will be a disconnect between them. If, for instance, a particular market or regional segment is growing at a higher pace than others, or other market segments and regions are more conservative, they tend to stick to established solutions longer. That increases the risk that it could lead to fragmentation.”

Is that likely to impact development long term? “We do not believe that RISC-V will become regionally concentrated, although there may be regional concentrations of focus within the broad set of implementation choices provided by RISC-V,” says Bluespec’s Hobbs. “A prime example of this is the Barcelona Supercomputing Center, creating a regional focus area for high-performance computing using RISC-V. However, while there may be regional focus areas, this does not mean that the RISC-V standard is, or will become, fragmented. In fact, one of the key tenets of the creation and foundation of RISC-V was preventing fragmentation of the ISA, and it continues to be a key function of RISC-V International.”

China may be a different story. “A lot of companies in China are creating RISC-V cores for internal consumption — for political reasons mostly,” says John Min, vice president of customer service at Arteris. “I think China will go 100% RISC-V for embedded, but it’s a one-way street. They will keep leveraging what the Western companies do and enhance it. China will continue sucking all advancements, such as vectorization, or the special domain-specific acceleration enhancements. They will create their own and make it their own internally, but they will give nothing back.”

Such splits have occurred in the past. “Design languages are the most recent example of that,” says Hand. “There was a regional split, and you had Europe focus on VHDL while America went with Verilog. With RISC-V, there will be that regional split where people will go off and do their things regionally. Europe has focused projects, India has theirs, but they’re still doing it within this framework. It’s this realization that everyone benefits. They’re not doing it to benefit the other people. They’re doing it ultimately to save themselves effort, to save themselves cost, but they realize that by doing it in that framework it is a net benefit to everyone.”

Bi-directionality
An important element is that everyone benefits, and that has to stretch across the academic/commercial boundary. “RISC-V has propelled a new degree of collaboration between academia and commercial organizations,” says Dave Kelf, CEO at Breker. “It’s noticeable that institutions such as Harvey Mudd College in Claremont, California, and ETH in Zurich, Switzerland, to name two, have produced advanced processor designs as a teaching aid, and have collaborated with multiple companies on their verification and design. This has been further advanced by OpenHW Group, which has made these designs accessible to the industry. This bi-directional collaboration benefits the tool providers to further enhance their offerings working on advanced, open devices, while also enabling academia to improve their designs to a commercial quality level. The virtuous circle created is essential if we are to see RISC-V established as a mainstream, industry-wide capability.”

Academia has a lot to offer in hardware advancement. “Researchers in universities are developing innovative new software and hardware to push the limits of RISC-V innovation,” says Dave Miller, head of corporate communications at SiFive. “Many of the RISC-V projects in academia are focused on optimizing performance and energy efficiency for AI workloads, and are open source so the entire ecosystem can benefit. Researchers are also actively contributing to RISC-V working groups to share their knowledge and collaborate with industry participants. These working groups are split evenly between representatives from APAC, Europe, and North America, all working together towards common goals.”

In many cases, industry is willing to fund such projects. “It makes it easy to have research topics that don’t need to boil the ocean,” says Hand. “If you’re a PhD student and you have a great idea, you can go do it. It’s easy for an industry partner to say, ‘I’ll sponsor that. That’s an interesting thing, and I am not required to allocate ridiculous amounts of money into an open-ended project. It’s like I can see the connection of how that research will go into a commercial product later.'”

This feeds back into academia. “The academics have been jumping on board with OpenHW,” says Wohlrab. “By taking their cores and productizing them, they get a chip back that could be shipped in high volume. Then they can do their research on a real commercial product and can see if their idea would fly in real life. They get real numbers and can see real figures for the benefits of a new branch predictor.”

It can also have a long-term benefit for tools. “There are areas where they want to collaborate with us, especially around security,” says Kiran Vittal, executive director for alliances marketing management at Synopsys. “They are building RISC-V based sub-systems using open-source RISC-V processors, and then academia wants to look at not only the AI part, but the security part. There are post-doc students or PhD students looking into using our tools to verify or to implement whatever they’re doing on security.”

That provides an incentive for EDA to offer better tools for use in universities. “Although there has always been collaboration between universities and the industry, where industry provides the universities with access to EDA tools, IP cores, etc., there’s often a bit of a lag,” says Siemens’ Eide. “In many situations (especially outside of the core area of a particular project), universities have access to older versions of the commercial solutions. If you for instance look at a new grad’s resume, where you in the past would see references to old tech, now you see a lot of references to relatively sophisticated use of RISC-V.”

Moving Forward
This collaboration needs to keep pushing forward. “We had an initiative to create a standardized interface for accelerators,” says Wohlrab. “RISC-V International standardized how to add custom instructions in the ISA, but there was no standard for the hardware interface. So we built this. It was a cool discussion. There were people from Silicon Labs, people from NXP, people from Thales, plus several startups. They all came together and asked, ‘How can we make it future proof and put the accelerators inside?'”

The application space for RISC-V is changing. “The big inflection point is Linux and Android,” says Arteris’ Min. “Android already has some support, but when both Android and Linux are really supported, it will change the mobile apps processor game. The number of designs will proliferate. The number of high-end designs will explode. It will take the whole industry to enable that because RISC-V companies are not big enough to create this by themselves. All the RISC-V companies are partners, because we enable this high-end design at the processor levels.”

That would deepen the software community’s engagement. “An embedded software developer needs to understand the underlying hardware if they want to run Linux on a RISC-V processor that uses custom instructions/accelerators,” says Bluespec’s Hobbs. “To develop complex embedded hardware/software systems, both embedded software developers and embedded hardware developers must possess a contextual understanding of the interoperability of hardware and software. The developer must understand how the customized processor is leveraging the custom instructions in hardware for Linux to efficiently manage and execute the accelerated workloads.”

This collaboration could reinvigorate research into EDA, as well. “With AI you can build predictive models,” says Hand. “Could that be used to identify the change effects from making an extension? What does that mean? There’s a cloud of influence — not directly gate-wise, because that immediately explodes — but perhaps based on test suites. ‘I know that something that touches that logic touches this downstream, which touches the rest of the design.’ That’s where AI plays a big role, and it is one of the interesting areas because in verification there are so many unknowns. When AI comes along, any guidance or any visibility that you can give is incredibly powerful. Even if it is not right 100% of the time, that’s okay, as long as it generates false negatives and not false positives.”
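Hand’s “cloud of influence” can be made concrete with a toy coverage-based test-selection sketch. The Python below is purely illustrative, not a description of any Siemens tool; the block names, coverage map, and dataflow edges are invented, and a production flow would mine them from coverage databases and netlists.

```python
# Toy sketch of coverage-driven change-impact analysis: given which design
# blocks each test exercises and which blocks feed which others, estimate the
# test subset a change might invalidate. All names are hypothetical.
from collections import deque

# Hypothetical coverage map: test -> design blocks it exercises.
COVERAGE = {
    "t_alu_basic": {"decoder", "alu"},
    "t_mem_stride": {"decoder", "lsu", "dcache"},
    "t_custom_ext": {"decoder", "custom_ext", "alu"},
}

# Hypothetical dataflow edges: block -> blocks it drives downstream.
DOWNSTREAM = {"decoder": {"alu", "lsu"}, "alu": {"custom_ext"}, "lsu": {"dcache"}}

def influence(changed):
    """Propagate a change through the dataflow graph (the 'cloud of influence')."""
    seen, todo = {changed}, deque([changed])
    while todo:
        for nxt in DOWNSTREAM.get(todo.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def impacted_tests(changed):
    """Select every test whose coverage intersects the influence cloud."""
    blocks = influence(changed)
    return sorted(t for t, cov in COVERAGE.items() if cov & blocks)

print(impacted_tests("decoder"))  # ['t_alu_basic', 't_custom_ext', 't_mem_stride']
```

Running it with "decoder" as the changed block selects all three tests, reflecting how a change to a widely shared block fans out across the whole suite.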

There is a great opportunity for EDA companies. “We collaborate with many of the open-source providers, with OpenHW group, with ETH in Zurich,” says Synopsys’ Vittal. “We want to promote our solutions when it comes to any processor design and you need standard tools like synthesis, place and route, simulation. But then there are also other kinds of unique solutions because RISC-V is so customizable, you can build your own custom instructions. You need something specific to verify these custom instructions and that’s why the Imperas golden models are important. We also collaborated with Bluespec to develop a verification methodology to take you through functional verification and debug.”

There are still some wrinkles to be worked out for customizations. “RISC-V gives us predictability,” says Hand. “We can create a compliance test suite, give you a processor optimization package if you’re on the implementation side. We can create analytics and testing solutions because we know what it’s going to look like. But for non-standard processors, it is effectively as a service, because everyone’s processor is a little bit different. The reason you see a lot of focus on the verification, from platform architecture exploration all the way through, is because if you change one little thing, such as an addressing mode, it impacts pretty much 100% of your processor verification. You’ve got to retest the whole processor. Most people aren’t set up like an Arm or an Intel with huge processor verification teams and the infrastructure, and so they need automation to do it for them.”

Conclusion
RISC-V has enabled the industry to create a framework for collaboration, which enables everyone to work together for selfish reasons. It is a symbiotic relationship that continues to build, and it is creating a wider sphere of influence over time.

“It’s unique in the modern era of semiconductor,” says Hand. “You have such a wide degree of collaboration, where you have processor manufacturers, the software industry leaders, EDA companies, all working on a common infrastructure.”

Related Reading
RISC-V Micro-Architectural Verification
Verifying a processor is much more than making sure the instructions work, but the industry is building from a limited knowledge base and few dedicated tools.
RISC-V Wants All Your Cores
It is not enough to want to dominate the world of CPUs. RISC-V has every core in its sights, and it’s starting to take steps to get there.

The post RISC-V Heralds New Era Of Cooperation appeared first on Semiconductor Engineering.

Chip Aging Becoming Key Factor In Data Center Economics

Chip aging is becoming a much bigger concern inside of data centers, where it can impact server uptime, utilization rates, and the amount of energy needed to drive signals and cool entire server racks.

Aging in chips is the result of both higher logic utilization and increasing transistor density. This is problematic for data centers, in general, but especially for AI chips where digital logic is expected to run at maximum speed. That generates more heat, which becomes harder to dissipate as the number of specialized and general-purpose processing elements per square millimeter of silicon continues to rise. Heat typically gets trapped between the fins of finFETs and gate-all-around FETs, accelerating electromigration and reducing the time it takes for dielectrics to break down. It also can cause warpage, which can rupture the bonds and contacts between different components in an advanced package or on a PCB.

For data centers, that creates a number of challenges:

  • Thermal management: This requires a deep understanding of workloads and the resulting transient thermal gradients as processing is load-balanced on-chip, between chips or chiplets, and between servers;
  • More data: Data from sensors everywhere, along with larger training sets, all need to be processed faster than in the past to keep up with the flood of data, but all of that needs to happen in the same or smaller footprint without overheating any part of a device, and
  • In-circuit monitoring: Sensors can be added into chips to detect variations in heat and data speeds in different paths, but it’s much more difficult to keep track of tens of thousands of these monitors as they collect data from heterogeneous processing elements, each of which can age at different rates depending on process variation, defectivity, varying workloads, and ambient thermal conditions (a minimal screening sketch follows this list).
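To illustrate the screening problem in that last item, the Python sketch below flags a hypothetical ring-oscillator monitor whose frequency drift stands out from its peers. The sensor names, readings, and the two-point threshold are assumptions for illustration, not any vendor’s methodology.

```python
# Minimal sketch of screening in-circuit aging monitors: flag sensors whose
# frequency degradation is an outlier versus their peers. All values are
# hypothetical.
import statistics

# Hypothetical readings: sensor id -> % frequency drop since time-zero calibration.
drift_pct = {"core0_ro": 1.1, "core1_ro": 1.2, "l2_ro": 1.0, "noc_ro": 4.8}

baseline = statistics.median(drift_pct.values())  # robust against the outliers themselves
THRESHOLD = 2.0  # percentage points above the median; an illustrative policy choice

flagged = {s: d for s, d in drift_pct.items() if d - baseline > THRESHOLD}
print(flagged)  # {'noc_ro': 4.8} -> this region is aging faster than its peers
```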

“Servers are much more capable today than they were 10 years ago, and the issue is that power hasn’t scaled like it used to,” said Steven Woo, Rambus fellow and distinguished inventor. “Now, if you want to do lots more work in your server, you have to burn more power to do it. Twenty years ago, a server might dissipate a couple hundred watts. But with the latest servers that NVIDIA just announced around Grace Blackwell, the whole rack is 120 kilowatts, and the individual servers are many kilowatts. Just delivering power into those racks is causing changes in the infrastructure in the industry. Now that you have to bring in and dissipate more power in a small space, you get all kinds of interesting things that could happen over time. The heat that’s being dissipated can have effects on the chip, and you have to worry sometimes about thermal cycling where, as the chip is doing a lot of work, maybe part of the chip stops and then it does more work. You get these rapid cycles of dissipating a lot of power, then not, then dissipating a lot of power, then not. That cycling causes local heating and cooling, leading to thermal stresses, and this impacts all chips, including memory.”

As a result, everyone from the data center manager to the chip architect now has to understand how a chip behaves in the field, and how increasingly customized chip and system architectures will function over time. Downtime is costly for a data center, but under-utilization and reduced performance also carry a high price tag. That, in turn, affects how much margin is considered essential, such as extra data paths if some of them are fully or partially closed off by electromigration, and how that margin will impact performance, power, and area/cost over a chip’s projected lifetime — especially in a heterogeneous design with specialized compute elements.

“When it comes to the hyper-scalers and high powered, highly customized, heterogeneous chips for various different workloads, these chips are on 24/7, so consistent uptime is critical,” said Dan Lee, product management director at Cadence. “Since all of these chips are done at the really advanced nodes, with the smaller device sizes, more developers are looking to do aging analysis, and derive the wear and tear so they can see if the chip is going to last a year or five years. At the same time, an important consideration is also thermal — especially when we’re talking about these heterogeneous integrations, and you don’t really get the thermal conductivity that you would in a straightforward, monolithic design. There’s a bit more thought or planning that needs to be a part of this because aging and heating are related. All things being equal, if you’re operating in a very hot environment, you’re going to expect a lower lifespan.”

Still, determining how much shorter that lifespan will be isn’t always a precise calculation. “Data center SoCs that execute mission-critical workloads need to provide scalable visibility, predict problems before they occur, provide deep-dive analyses into problems, and be optimized to increase longevity of investment,” said Padmakumar Karthik, senior technology manager at Arm. “Data center diagnostic patterns are often deployed to measure the health of an SoC post-manufacturing to prevent silent data corruption (SDC) issues. But on-chip sensors provide an additional layer of insights, detecting droops or aging or thermal events on-chip, all of which can cause SDC incidents. For this reason, scalable, customizable sensor frameworks that can monitor and adapt throughout the useful life of the device, enabling continuous design optimization and preventive maintenance, will be increasingly important.”

There are multiple ways to achieve this, but each data center can be very different. In some cases, chips are designed by systems companies for internal use. And in most cases, there is a mix of different hardware and software, not all of which is state-of-the-art. “Many data centers have legacy infrastructure that may not be inherently designed for optimal power efficiency,” noted Noam Brousard, vice president of systems at proteanTecs, in a recent blog. “Upgrading or retrofitting such infrastructure poses challenges in achieving comprehensive power optimization.”

Even within a single rack, stresses can vary greatly from one server to the next, and from one chip to the next even in the same server. “You can imagine when you have a very big chip, toward the edges of the chip it will expand more than in a small chip, and that can add stress,” said Rambus’ Woo. “You have to really be careful about how you cool things, and memory is no different. You have very specific things you worry about with memory, like the ability to retain data, depending on how hot the chip is.”

In addition, as chips age, parameters drift. Marc Swinnen, director of product marketing in Ansys’ semiconductor division, said the traditional approach has been to use a library that’s characterized as a brand new chip. “The library is characterized at 1 year, 5 years, 10 years, 15 years, and you can run all your analysis multiple times with these different aged libraries. That sounds good on paper, and that’s what a lot of people do, but the problem is that not all parts of the chip age at the same rate. This is why aging is often associated with activity and temperature. Some parts of the chip are more active and hotter than other parts of the chip, so the aging time runs differently for different parts. This means you want to apply some of the old library to some parts of the chip, and the younger library to other parts of the chip, because if signals run between them you have setup and hold issues. If everything slows down at the same time — or one slows down and the other one doesn’t — you’re going to get mismatches, and that’s the difficulty. At the bottom level, it’s easy. Every gate is assigned its right age. That’s simple. You do an analysis with every gate. But how do you assign the age to every gate? Where do you get that information from? You need a lot of realistic activity, and then predict that over the lifespan and with temperature. That’s the problem. How do you actually construct this aging map? Once you have it, the analysis is not that hard.”

Aging maps are application- and workload-specific. Every chip will age differently depending on the functions it performs.
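A minimal sketch of the idea, assuming a library set characterized at a handful of aging corners, is shown below in Python. The effective-age model, the 1.5x-per-15°C scaling, the activity factors, and the library names are all invented for illustration; real flows would use foundry aging models and measured activity.

```python
# Toy illustration of an aging map: pick a timing library per chip region from
# its estimated effective age, instead of one library for the whole die.
# Corners keyed by characterized age in years; names are hypothetical.
AGED_LIBS = {0: "lib_fresh", 1: "lib_1yr", 5: "lib_5yr", 10: "lib_10yr", 15: "lib_15yr"}

def effective_age(years, activity, temp_c):
    """Crude effective-age model: hotter, busier logic 'ages' faster.
    The 1.5x-per-15C scaling is a stand-in, not a foundry model."""
    return years * activity * 1.5 ** ((temp_c - 25) / 15)

def pick_lib(years, activity, temp_c):
    age = effective_age(years, activity, temp_c)
    # Snap to the closest characterized corner at or above the estimate.
    corner = min((c for c in AGED_LIBS if c >= age), default=15)
    return AGED_LIBS[corner]

# Busy, hot datapath vs. a mostly idle, cool peripheral block after 5 years:
print(pick_lib(5, activity=0.9, temp_c=50))  # lib_10yr
print(pick_lib(5, activity=0.1, temp_c=30))  # lib_1yr
```

Signals crossing between the two regions would then be timed with mismatched libraries, which is exactly where the setup and hold issues Swinnen describes show up.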

But aging is just one of many factors that affect data center uptime. “When we look at data center, we look at the whole application first, then whittle it down to what that means for chips and packages,” said Kelly Morgan, senior principal application engineer at Ansys. “From the mechanical reliability lens of the data center operation, we go through thermal cycling, obviously. We’re in a controlled environment. But what does that influence? How does that influence the integrity of the chips as you go through thermal cycles? Typically, we’ll look at things like solder fatigue and other effects.”

Another factor to consider is shipping and handling, which can affect the aging of a chip, package, and board.

“Even before the device is put in place, there are opportunities for vibration,” Morgan said. “You might hit something, which is a bit of a shock. We have customers who are looking at things like drop, shock, and vibration, and they have goals they need to test to. Typically, the standard process is to do a lot of physical testing. Now as you can imagine, that can be pretty challenging. You have to be pretty far along in the design process before you really start to go and test, and if there’s an issue, then you’ve got to go back and retest. Early simulation helps here, especially for those larger-scale events, and that comes down to the chassis, the board, to all the components, including the ICs.”


Fig. 1: Components of complete electronic system analysis. Source: Ansys

Quality control remains a big challenge when it comes to mechanical stresses that can affect aging. Adam Cron, distinguished architect at Synopsys, pointed to a recent Intel white paper, which noted that at the current acceptable defectivity rates, one core fails every two days. To account for this, Cron noted that certain commercial tools support in-system delay testing in a BiST mode. By adding specific IP, any ATPG patterns could be added to that. (Intel’s paper said its solution only applies to stuck-at testing.)

“In very large, millions-of-cores data center-type environments, the implication is that you’d better be ready,” Cron said. “One of the things they were talking about in this paper was in-system scan. Intel was bringing a database of test patterns in, and then applying it in-system after isolating a core. And then, upon a failure, they’d quarantine and move on. But the data centers are apparently running out of that opportunistic time slot to do any of this. We’ve heard some interesting conversations about the fact that people do run a lot of things during certain times. However, other times are cheaper, so all the holes are just getting filled in terms of runtime. Monitors are certainly something to look at, but monitors are looking at systemic degradation. That’s known, if you will. And so as things degrade, Vmin will change, maybe frequency will change. And they’ll be on a pace. They can figure out when to do that. That’s easy enough to figure out. However, if there’s a marginality or some broken component in there, it is not up to the tool to find that. And frankly, the in-system scan wasn’t addressing all components on the die. It was only up to like 80% of stuck-at coverage, which isn’t that much, especially when you’re not looking at all of the pieces inside the die. The point is, there are still opportunities to do better.”

Cron noted that one big systems company suggested a dual-core lockstep mechanism, starting out the data center in dual-core lock-step mode for X number of months. “When it looks like you’ve squeezed the major part of the curve out, in terms of finding these defective components, then unlock them, double your capacity, run like that for a while, and periodically hook some back up again. That means everything is utilized, at least. Of course, some are working at half capacity here and there, but it’s not the whole die. And there are some implications there from a design standpoint, at least for the hardware, but also possibly the operating system, depending on who decides what physical core is used versus what virtual core is used.”

Approaches to measuring aging
Any discussion around aging circuits really boils down to extending the life of the machines in the data center, and not getting caught by surprise when failures occur.

“How do you do that? You have to measure the aging of those machines,” said Neil Hand, director of marketing, IC segment at Siemens EDA. “Right now, if you speak to the CIOs of these big companies with big data centers, they say, ‘We’ve got to get rid of the machines after three years because we can’t risk it going down.’ If you look at embedded analytics capabilities, you can start to embed aging monitors in those devices, you can start to monitor those in real time. It doesn’t look that different than what it does from an automotive perspective. It’s all the same technologies, effectively, but you’re monitoring them. And then you can say, ‘We’re now at 90% of our life for this server.’ We can then just replace that server.”

This feeds into corporate goals around sustainability, as well. “It comes down to building the best thing to begin with, then building it with design for manufacturing in mind so that you don’t get waste during manufacturing, achieve better yields, and finally extend the life of products and build them in environmentally-sustainable ways,” Hand said. “If you can extend the data center lifecycle from three years to five years, that’s big. And especially if you start going to these high-performance, application-specific type of clusters, you may not need to change them as often, because if the underlying capabilities aren’t changing, that might drive the cycling of it. In the case of a biological computer, if there’s no new change to the underlying protein folding mechanisms, you might say, ‘We don’t need a new compute platform. This is really good.’”

The longer the product life can be extended, the better. Design for aging is a matter of, first, performing the aging analysis with the foundry models. “Run the simulations and observe the effects,” said Cadence’s Lee. “When you’re doing the simulation, you want to have the right mission profiles, so you come up with an accurate prediction of how your device is going to behave after a certain number of years in deployment. You may want to combine that with thermal analysis, for example, because how that aging is going to behave will depend on what temperature this design is going to be working at. You may think it’s 22 degrees Celsius, but maybe through some thermal analysis you realize it’s actually going to be operating at 35 or 40 degrees most of the time. That may change the outcome of your aging analysis.”

In terms of the associated thermal analysis, this can extend beyond a single device. “It’s also how that heat is moving,” Lee said. “Let’s say you have this integrated design, where you have some power devices alongside some logic, or some other functionality that is lower power. What you may want to understand is, if those bandgaps or power circuits are generating a lot of heat, that may be shifting over into other parts of your design. So when you run your aging analysis, you may assume that you’re running at 25 degrees, whereas the power devices are at 40 or 45 degrees. They’re on the same chip, they’re very close to each other, and you have to understand how much of that heat is moving over to your logic and what that’s going to bring the temperature up to. You want to know that so you can perform the aging analysis based on that higher temperature.”
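The temperature sensitivity Lee describes is often approximated to first order with an Arrhenius acceleration factor. The sketch below is a back-of-the-envelope illustration, assuming an activation energy of 0.7 eV, a commonly cited value for silicon wearout mechanisms rather than a number from Cadence or any foundry.

```python
# Back-of-the-envelope Arrhenius acceleration factor: how much faster
# temperature-driven wearout proceeds at the actual operating temperature
# than at the assumed one. Ea = 0.7 eV is an illustrative assumption.
import math

K_EV = 8.617e-5  # Boltzmann constant in eV/K

def acceleration_factor(t_assumed_c, t_actual_c, ea_ev=0.7):
    t1 = t_assumed_c + 273.15  # assumed junction temperature, in kelvin
    t2 = t_actual_c + 273.15   # actual junction temperature, in kelvin
    return math.exp((ea_ev / K_EV) * (1.0 / t1 - 1.0 / t2))

print(f"{acceleration_factor(22, 40):.1f}x faster wearout at 40C than at 22C")  # ~4.9x
```

With those assumptions, logic that actually runs at 40°C wears out roughly five times faster than an analysis anchored at 22°C would predict, which is exactly why the mission profile matters.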

Another consideration is combining aging analysis and interconnect parasitics, which is especially relevant for advanced nodes due to the parasitics in the interconnect. “They’re dominant when it comes to performance and functionality,” Lee added. “So when thinking about aging, you also have to think about it being an aged device that has to push the electrons through this interconnect. That’s a pretty heavy load. When you’re doing the aging analysis, you probably will have to be doing it with extracted parasitics. You just can’t do it on a pure schematic design. It doesn’t give you enough detail about what’s really happening physically. This may be included in the aging analysis tool. When most people talk about aging, they may not think about the parasitic aspect to it.”

Combating aging, thermal in memory
While standards don’t work in custom silicon, they do work for some standard components in those devices, such as memory. Over the past 10 to 15 years, memory standards have started to address the impact of heat.

“If you start to exceed certain temperature limits, you’ve got to refresh the device more frequently because the charge can leak off the cells more quickly,” said Rambus’ Woo. “So there are temperature-dependent refresh rates. There are other things that can be exacerbated, like the capacitors are getting smaller, they’re holding fewer electrons because there are so many more of them on a chip now, so we’ve seen memories adopt on-die error correction. This on-die error correction is something that is hidden from the outside world. In many cases, you don’t even know an error has occurred and been corrected on the chip. Those kinds of technologies become even more important now because the temperatures can be higher.”
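The refresh behavior Woo describes can be sketched with DDR4-style numbers, where the average periodic refresh interval (tREFI) is halved in the extended temperature range. The constants below follow the common DDR4 convention of 7.8µs up to 85°C and 3.9µs from 85°C to 95°C; the actual device datasheet governs.

```python
# Sketch of temperature-dependent DRAM refresh, DDR4-style: above 85C the
# average refresh interval is halved so leakier cells are refreshed twice
# as often. Check the actual device spec for real values.
TREFI_NORMAL_US = 7.8    # average periodic refresh interval up to 85C
TREFI_EXTENDED_US = 3.9  # halved in the extended temperature range (85-95C)

def refresh_interval_us(case_temp_c):
    if case_temp_c > 95:
        raise ValueError("beyond rated operating range")
    return TREFI_EXTENDED_US if case_temp_c > 85 else TREFI_NORMAL_US

for t in (45, 85, 92):
    print(f"{t}C -> refresh every {refresh_interval_us(t)} us")
```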

There also is growing demand for more telemetry to provide monitoring information. “You just want to know if anything is overheating,” said Woo. “Does something seem like it’s malfunctioning? The data center manager will get regular updates about the status of the major components of the system. A lot of boards now in servers have baseboard management controllers (BMCs), which are little chips that sit on each board and are responsible for, among other things, reporting back the health of that board when a server might have five or six boards. We’re frequently seeing more of these BMC chips.”

Design for aging
While the goal is to be able to guarantee a certain lifetime for the chips in a data center, the challenges for achieving that are expanding. “There’s a growing list of things that can be harmful to devices over their lifetime,” Woo said. “It’s a balance between not adding too much cost, even though you have to increase the reliability and maybe add new features, and all of these things are in play with each other.”

Whether it is liquid cooling or higher levels of RAS ECC in the system, there is no single best answer for every application. In general, the industry is moving toward higher reliability and increasing resilience, but there are many ways to get there and challenges with each of them.

“Just as 15 years ago we didn’t necessarily always think we had to talk about power, now we have to talk about it all the time,” Woo said. “The same thing is going to be true for resilience and reliability. It’s going to be required to become part of the way people think about architectures, and part of that is how the memory system improves its reliability. You can’t really do anything unless you can compute on some data, and you have to make sure that data is reliable. It will touch how memory is stored in a DRAM. It will touch how memory is communicated across links. And it even will touch how processors manipulate data once they get a hold of it in their caches, and in the compute pipelines. Also, one of the key things people will worry about is how much of that susceptibility is brought about by age-related issues, like heating cycles, etc.”

Finally, there are even issues around the quality of the power that comes into a system. “The servers get noise on the power rails, and it’s a balance between how much money you’re willing to pay for the power delivery versus the quality of power,” said Woo. “You have to be tolerant of those kinds of things, too. Power management becomes more challenging, as well as the amount of power that these systems are using today. NVIDIA systems bring 48-volt power into the racks, and there is talk about even higher voltage levels. Those changes in infrastructure can all impact heat, and can age components differently.”

The post Chip Aging Becoming Key Factor In Data Center Economics appeared first on Semiconductor Engineering.

Chip Industry Week In Review

President Biden will raise the tariff rate on Chinese semiconductors from 25% to 50% by 2025, among other measures to protect U.S. businesses from China’s trade practices. Also, as part of President Biden’s AI Executive Order, the Administration released steps to protect workers from AI risks, including human oversight of systems and transparency about what systems are being used.

Intel is in advanced talks with Apollo Global Management for the equity firm to provide more than $11 billion to build a fab in Ireland, reported the Wall Street Journal. Also, Intel’s Foundry Services appointed Kevin O’Buckley as the senior vice president and general manager.

Polar is slated to receive up to $120 million in CHIPS Act funding to establish an independent American foundry in Minnesota. The company expects to invest about $525 million in the expansion of the facility over the next two years, with a $75 million investment from the State of Minnesota.

Arm plans to develop AI chips for launch next year, reports Nikkei Asia.

South Korea is planning a support package worth more than 10 trillion won ($7.3 billion) aimed at chip materials, equipment makers, and fabless companies throughout the semiconductor supply chain, according to Reuters.

Quick links to more news:

Global
In-Depth
Markets and Money
Security
Supercomputing
Education and Training
Product News
Research
Events and Further Reading


Global

Edwards opened a new facility in Asan City, South Korea. The 15,000m² factory provides a key production site for abatement systems, and integrated vacuum and abatement systems for semiconductor manufacturing.

France’s courtship with mega-tech is paying off. Microsoft is investing more than $4 billion to expand its cloud computing and AI infrastructure, including bringing up to 25,000 advanced GPUs to the country by the end of 2025. The “Choose France” campaign also snagged $1.3 billion from Amazon for cloud infrastructure expansion, genAI, and more.

Toyota, Nissan, and Honda are teaming up on AI and chips for next-gen cars with support from Japan’s Ministry of Economy, Trade and Industry (METI), reports Nikkei Asia.

Meanwhile, IBM and Honda are collaborating on long-term R&D of next-gen technologies for software-defined vehicles (SDV), including chiplets, brain-inspired computing, and hardware-software co-optimization.

Siemens and Foxconn plan to collaborate on global manufacturing processes in electronics, information and communications technology, and electric vehicles (EV).

TSMC confirmed a Q424 construction start date for its first European plant in Dresden, Germany.

Amazon Web Services (AWS) plans to invest €7.8 billion (~$8.4B) in the AWS European Sovereign Cloud in Germany through 2040. The system is designed to serve public sector organizations and customers in highly regulated industries.


In-Depth

Semiconductor Engineering published its Low Power-High Performance newsletter this week, along with the latest Test, Measurement & Analytics newsletter.


Markets and Money

The U.S. National Institute of Standards and Technology (NIST) awarded more than $1.2 million to 12 businesses in 8 states under the Small Business Innovation Research (SBIR) Program to fund R&D of products relating to cybersecurity, quantum computing, health care, semiconductor manufacturing, and other critical areas.

Engineering services and consulting company Infosys completed the acquisition of InSemi Technology, a provider of semiconductor design and embedded software development services.

The quantum market, which includes quantum networking and sensors alongside computing, is predicted to grow from $838 million in 2024 to $1.8 billion in 2029, reports Yole.

Shipments of OLED monitors reached about 200,000 units in Q1 2024, up 121% year-over-year, reports TrendForce.

Global EV sales grew 18% in Q1 2024 with plug-in hybrid electric vehicles (PHEV) sales seeing 46% YoY growth and battery electric vehicle (BEV) sales growing just 7%, according to Counterpoint. China leads global EV sales with 28% YoY growth, while the US grew just 2%. Tesla saw a 9% YoY drop, but topped BEV sales with a 19% market share. BYD grew 13% YoY and exported about 100,000 EVs with 152% YoY growth, mainly in Southeast Asia.

DeepX raised $80.5 million in Series C funding for its on-device NPU IP and AI SoCs tailored for applications including physical security, robotics, and mobility.

MetisX raised $44 million in Series A funding for its memory solutions built on Compute Express Link (CXL) for accelerating large-scale data processing applications.


Security

While security experts have been warning of a growing threat in electronics for decades, there have been several recent fundamental changes that elevate the risk.

Synopsys and the Ponemon Institute released a report showing 54% of surveyed organizations suffered a software supply chain attack in the past year and 20% were not effective in their response. And 52% said their development teams use AI tools to generate code, but only 32% have processes to evaluate it for license, security, and quality risks.

Researchers at Ruhr University Bochum and TU Darmstadt presented a solution for the automated generation of fault-resistant circuits (AGEFA) and assessed the security of examples generated by AGEFA against side-channel analysis and fault injection.

TXOne reported on operational technology security and the most effective method for preventing production interruptions caused by cyber-attacks.

CrowdStrike and NVIDIA are collaborating to accelerate the use of analytics and AI in cybersecurity to help security teams combat modern cyberattacks, including AI-powered threats.

The National Institute of Standards and Technology (NIST) finalized its guidelines for protecting sensitive data, known as controlled unclassified information, aimed at organizations that do business with the federal government.

The Defense Advanced Research Projects Agency (DARPA) awarded BAE Systems a $12 million contract to solve thermal challenges limiting electronic warfare systems, particularly in GaN transistors.

Sigma Defense won a $4.7 million contract from the U.S. Army for an AI-powered virtual training environment, partnering with Brightline Interactive on a system that uses spatial computing and augmented intelligence workflows.

SkyWater’s advanced packaging operation in Florida has been accredited as a Category 1A Trusted Supplier by the Defense Microelectronics Activity (DMEA) of the U.S. Department of Defense (DoD).

Videos of two CWE-focused sessions from CVE/FIRST VulnCon 2024 were made available on the CWE YouTube Channel.

The Cybersecurity and Infrastructure Security Agency (CISA) issued a number of alerts/advisories.


Supercomputing

Supercomputers are battling to be top dog.

The Frontier supercomputer at Oak Ridge National Laboratory (ORNL) retained the top spot on the Top500 list of the world’s fastest systems with an HPL score of 1.206 EFlop/s. The as-yet incomplete Aurora system at Argonne took second place, becoming the world’s second exascale system at 1.012 EFlop/s. The Green500 list, which tracks energy efficiency of compute, saw three new entrants take the top places.

Cerebras Systems, Sandia National Laboratory, Lawrence Livermore National Laboratory, and Los Alamos National Laboratory used Cerebras’ second generation Wafer Scale Engine to perform atomic scale molecular dynamics simulations at the millisecond scale, which they claim is 179X faster than the Frontier supercomputer.

UT Austin‘s Stampede3 Supercomputer is now in full production, serving the open science community through 2029.


Education and Training

SEMI announced the SEMI University Semiconductor Certification Programs to help alleviate the workforce skills gap. Its first two online courses are designed for new talent seeking careers in the industry, and experienced workers looking to keep their skills current. Also, SEMI and other partners launched a European Chip Skills Academy Summer School in Italy.

Siemens created an industry credential program for engineering students that supplements a formal degree by validating industry knowledge and skills. Nonprofit agency ABET will provide accreditation. The first two courses are live at the University of Colorado Boulder (CU Boulder) and a series is planned with Pennsylvania State University (Penn State).

Syracuse University launched a $20 million Center for Advanced Semiconductor Manufacturing, with co-funding from Onondaga County.

Starting young is a good thing. An Arizona school district, along with the University of Arizona, is creating a semiconductor program for high schoolers.


Product News

Siemens and Sony partnered to enable immersive engineering via a spatial content creation system, NX Immersive Designer, which includes Sony’s XR head-mounted display. The integration of hardware and software gives designers and engineers natural ways to interact with a digital twin. Siemens also extended its Xcelerator as a Service portfolio with solutions for product engineering and lifecycle management, cloud-based high-performance simulation, and manufacturing operations management. It will be available on Microsoft Azure, as well.

Advantest announced the newest addition to its portfolio of power supplies for the V93000 EXA Scale SoC test platform. The DC Scale XHC32 power supply offers 32 channels with single-instrument total current of up to 640A.

Fig. 1: Advantest’s DC Scale XHC32. Source: Advantest

Infineon released its XENSIV TLE49SR angle sensors, which can withstand stray magnetic fields of up to 8 mT, ideal for safety-critical automotive chassis applications.

Google debuted its sixth generation Cloud TPU, 4.7X faster and 67% more energy-efficient than the previous generation, with double the high-bandwidth memory.

X-Silicon uncorked a RISC-V vector CPU, coupled with a Vulkan-enabled GPU ISA and AI/ML acceleration in a single processor core, aimed at embedded and IoT applications.

IBM expanded its Qiskit quantum software stack, including the stable release of its SDK for building, optimizing, and visualizing quantum circuits.

Northeastern University announced the general availability of testing and integration solutions for Open RAN through the Open6G Open Testing and Integration Center (Open 6G OTIC).


Research

The University of Glasgow received £3 million (~$3.8M) from the Engineering and Physical Sciences Research Council (EPSRC)’s Strategic Equipment Grant scheme to help establish “Analogue,” an Automated Nano Analysing, Characterisation and Additive Packaging Suite to research silicon chip integration and packaging.

EPFL researchers developed scalable photonic ICs, based on lithium tantalate.

DISCO developed a way to increase the diameter of diamond wafers using its KABRA process, a laser ingot-slicing method.

CEA-Leti developed two complementary approaches for high performance photon detectors — a mercury cadmium telluride-based avalanche photodetector and a superconducting single photon detector.

Toshiba demonstrated storage capacities of over 30TB with two next-gen large capacity recording technologies for hard disk drives (HDDs): Heat Assisted Magnetic Recording (HAMR) and Microwave Assisted Magnetic Recording (MAMR).

Caltech neuroscientists reported that their brain-machine interface (BMI) worked successfully in a second human patient, following 2022’s first instance, proving the device is not dependent on one particular brain or one location in a brain.

Linköping University researchers developed a cheap, sustainable battery made from zinc and lignin, while ORNL researchers developed carbon-capture batteries.


Events and Further Reading

Find upcoming chip industry events here, including:

Event Date Location
European Test Symposium May 20 – 24 The Hague, Netherlands
NI Connect Austin 2024 May 20 – 22 Austin, Texas
ITF World 2024 (imec) May 21 – 22 Antwerp, Belgium
Embedded Vision Summit May 21 – 23 Santa Clara, CA
ASIP Virtual Seminar 2024 May 22 Online
Electronic Components and Technology Conference (ECTC) 2024 May 28 – 31 Denver, Colorado
Hardwear.io Security Trainings and Conference USA 2024 May 28 – Jun 1 Santa Clara, CA
SW Test Jun 3 – 5 Carlsbad, CA
IITC2024: Interconnect Technology Conference Jun 3 – 6 San Jose, CA
VOICE Developer Conference Jun 3 – 5 La Jolla, CA
CHIPS R&D Standardization Readiness Level Workshop Jun 4 – 5 Online and Boulder, CO
Find All Upcoming Events Here

Upcoming webinars are here.


Semiconductor Engineering’s latest newsletters:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials


The post Chip Industry Week In Review appeared first on Semiconductor Engineering.

Will Domain-Specific ICs Become Ubiquitous?

Questions are surfacing for all types of design, ranging from small microcontrollers to leading-edge chips, over whether domain-specific design will become ubiquitous, or whether it will fall into the historic pattern of customization first, followed by lower-cost, general-purpose components.

Custom hardware always has been a double-edged sword. It can provide a competitive edge for chipmakers, but often requires more time to design, verify, and manufacture a chip, which can sometimes cost a market window. In addition, it’s often too expensive for all but the most price-insensitive applications. This is a well-understood equation at the leading edge of design, particularly where new technologies such as generative AI are involved.

But with planar scaling coming to an end, and with more features tailored to specific domains, the chip industry is struggling to figure out whether the business/technical equation is undergoing a fundamental and more permanent change. This is muddied further by the fact that some 30% to 35% of all design tools today are being sold to large systems companies for chips that will never be sold commercially. In those applications, the collective savings from improved performance per watt may dwarf the cost of designing, verifying, and manufacturing a highly optimized multi-chip/multi-chiplet package across a large data center, leaving the debate about custom vs. general-purpose more uncertain than ever.

“If you go high enough in the engineering organization, you’re going to find that what people really want to do is a software-defined whatever it is,” says Russell Klein, program director for high-level synthesis at Siemens EDA. “What they really want to do is buy off-the-shelf hardware, put some software on it, make that their value-add, and ship that. That paradigm is breaking down in a number of domains. It is breaking down where we need either extremely high performance, or we need extreme efficiency. If we need higher performance than we can get from that off-the-shelf system, or we need greater efficiency, we need the battery to last longer, or we just can’t burn as much power, then we’ve got to start customizing the hardware.”

Even the selection of processing units can make a solution custom. “Domain-specific computing is already ubiquitous,” says Dave Fick, CEO and cofounder of Mythic. “Modern computers, whether in a laptop, phone, security camera, or in farm equipment, consist of a mix of hardware blocks co-optimized with software. For instance, it is common for a computer to have video encode or decode hardware units to allow a system to connect to a camera efficiently. It is common to have accelerators for encryption so that we can safely communicate. Each of these is co-optimized with software algorithms to make commonly used functions highly efficient and flexible.”

Steve Roddy, chief marketing officer at Quadric, agrees. “Heterogeneous processing in SoCs has been de rigueur in the vast majority of consumer applications for the past two decades or more. SoCs for mobile phones, tablets, televisions, and automotive applications have long been required to meet a grueling combination of high-performance plus low-cost requirements, which has led to the proliferation of function-specific processors found in those systems today. Even low-cost SoCs for mobile phones today have CPUs for running Android, complex GPUs to paint the display screen, audio DSPs for offloading audio playback in a low-power mode, video DSPs paired with NPUs in the camera subsystem to improve image capture (stabilization, filters, enhancement), baseband DSPs — often with attached NPUs — for high speed communications channel processing in the Wi-Fi and 5G subsystems, sensor hub fusion DSPs, and even power-management processors that maximize battery life.”

It helps to separate what you call general-purpose and what is application-specific. “There is so much benefit to be had from running your software on dedicated hardware, what we call bespoke silicon, because it gives you an advantage over your competitors,” says Marc Swinnen, director of product marketing in Ansys’ Semiconductor Division. “Your software runs faster, lower power, and is designed to run specifically what you want to run. It’s hard for a competitor with off-the-shelf hardware to compete with you. Silicon has become so central to the business value, the business model, of many companies that it has become important to have that optimized.”

There is a balance, however. “If there is any cost justification in terms of return on investment and deployment costs, power costs, thermal costs, cooling costs, then it always makes sense to build a custom ASIC,” says Sharad Chole, chief scientist and co-founder of Expedera. “We saw that for cryptocurrency, we see that right now for AI. We saw that for edge computing, which requires extremely ultra-low power sensors and ultra-low power processes. But there also has been a push for general-purpose computing hardware, because then you can easily make the applications more abstract and scalable.”

Part of the seeming conflict is due to the scope of specificity. “When you look at the architecture, it’s really the scope that determines the application specificity,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “Domain-specific computing is ubiquitous now. The important part is the constant moving up of the domain specificity to something more complex — from the original IP, to configurable IP, to subsystems that are configurable.”

In the past, it has been driven more by economics. “There’s an ebb and a flow to it,” says Paul Karazuba, vice president of marketing at Expedera. “There’s an ebb and a flow to putting everything into a processor. There’s an ebb and a flow to having co-processors, augmenting functions that are inside of that main processor. It’s a natural evolution of pretty much everything. It may not necessarily be cheaper to design your own silicon, but it may be more expensive in the long run to not design your own silicon.”

An attempt to formalize that ebb and flow was made in the 1990s by Tsugio Makimoto, then at Hitachi and later Sony’s CTO. He observed that electronics cycled between custom solutions and programmable ones approximately every 10 years. What’s changed is that most custom chips from the time of his observation contained highly programmable standard components.

Technology drivers
Today, it would appear that technical issues will decide this. “The industry has managed to work around power issues and push up the thermal envelope beyond points I personally thought were going to be reasonable, or feasible,” says Elad Alon, co-founder and CEO of Blue Cheetah. “We’re hitting that power limit, and when you hit the power limit it drives you toward customization wherever you can do it. But obviously, there is tension between flexibility, scalability, and applicability to the broadest market possible. This is seen in the fast pace of innovation in the AI software world, where tomorrow there could be an entirely different algorithm, and that throws out almost all the customizations one may have done.”

The slowing of Moore’s Law will have a fundamental influence on the balance point. “There have been a number of bespoke silicon companies in the past that were successful for a short period of time, but then failed,” says Ansys’ Swinnen. “They had made some kind of advance, be it architectural or addressing a new market need, but then the general-purpose chips caught up. That is because there’s so much investment in them, and there’s so many people using them, there’s an entire army of people advancing, versus your company, just your team, that’s advancing your bespoke solution. Inevitably, sooner or later, they bypass you and the general-purpose hardware just gets better than the specific one. Right now, the pendulum has swung toward custom solutions being the winner.”

However, general-purpose processors do not automatically advance if companies don’t keep up with adoption of the latest nodes, and that leads to even more opportunities. “When adding accelerators to a general-purpose processor starts to break down, because you want to go faster or become more efficient, you start to create truly customized implementations,” says Siemens’ Klein. “That’s where high-level synthesis starts to become really interesting, because you’ve got that software-defined implementation as your starting point. We can take it through high-level synthesis (HLS) and build an accelerator that’s going to do that one specific thing. We could leave a bunch of registers to define its behavior, or we can just hard code everything. The less general that system is, the more specific it is, usually the higher performance and the greater efficiency that we’re going to take away from it. And it almost always is going to be able to beat a general-purpose accelerator or certainly a general-purpose processor in terms of both performance and efficiency.”

At the same time, IP has become massively configurable. “There used to be IP as the building blocks,” says Arteris’ Schirrmeister. “Since then, the industry has produced much larger and more complex IP that takes on the role of sub-systems, and that’s where scope comes in. We have seen Arm with what they call the compute sub-systems (CSS), which are an integration and then hardened. People care about the chip as a whole, and then the chip and the system context with all that software. Application specificity has become ubiquitous in the IP space. You either build hard cores, you use a configurable core, or you use high-level synthesis. All of them are, by definition, application-specific, and the configurability plays in there.”

Put in perspective, there is more than one way to build a device, and an increasing number of options for getting it done. “There’s a really large market for specialized computing around some algorithm,” says Klein. “IP for that is going to be both in the form of discrete chips, as well as IP that could be built into something. Ultimately, that has to become silicon. It’s got to be hardened to some degree. They can set some parameters and bake it into somebody’s design. Consider an Arm processor. I can configure how many CPUs I want, I can configure how big I want the caches, and then I can go bake that into a specific implementation. That’s going to be the thing that I build, and it’s going to be more targeted. It will have better efficiency and a better cost profile and a better power profile for the thing that I’m doing. Somebody else can take it and configure it a little bit differently. And to the degree that the IP works, that’s a great solution. But there will always be algorithms that don’t have a big enough market for IP to address. And that’s where you go in and do the extreme customization.”

Chiplets
Some have questioned if the emerging chiplet industry will reverse this trend. “We will continue to see systems composed of many hardware accelerator blocks, and advanced silicon integration technologies (i.e., 3D stacking and chiplets) will make that even easier,” says Mythic’s Fick. “There are many companies working on open standards for chiplets, enabling communication bandwidth and energy efficiency that is an order of magnitude greater than what can be built on a PCB. Perhaps soon, the advanced system-in-package will overtake the PCB as the way systems are designed.”

Chiplets are not likely to be highly configurable. “Configuration in the chiplet world might become just a function of switching off things you don’t need,” says Schirrmeister. “Configuration really means that you do not use certain things. You don’t get your money back for those items. It’s all basically applying math and predicting what your volumes are going to be. If it’s an incremental cost that has one more block on it to support another interface, or making the block the Ethernet block with time triggered stuff in it for automotive, that gives you an incremental effort of X. Now, you have to basically estimate whether it also gives you a multiple of that incremental effort as incremental profit. It works out this way because chips just become very configurable. Chiplets are just going in the direction of finding the balance of more generic usage so that you can apply them in more chiplet designs.”

The chiplet market is far from certain today. “The promise of chiplets is that you use only the function that you want from the supplier that you want, in the right node, at the right location,” says Expedera’s Karazuba. “The idea of specialization and chiplets are at arm’s length. They’re actually together, but chiplets have a long way to go. There’s still not universal agreement on the different things around a chiplet that have to be in place to make the product truly mass market.”

While chiplets have been proven to work, nearly all of the chiplets in use today are proprietary. “To build a viable [commercial] chiplet company, you have to be going after a broad enough market, large enough from a dollar perspective, then you can make all the investment, have success and get everything back accordingly,” says Blue Cheetah’s Alon. “There’s a similar tension where people would like to build a general-purpose chiplet that can be used anywhere, by anyone. That is the plug-and-play discussion, but you could finish up with something that becomes so general-purpose, with so much overhead, that it’s just not attractive in any particular market. In the chiplet case, for technical reasons, it might not actually really work that way at all. You might try to build it for general purpose, and it turns out later that it doesn’t plug into particular sockets that are of interest.”

The economics of chiplet viability have not yet been defined. “The thing about chiplets is they can be small,” says Klein. “Being small means that we don’t need as big a market for them as we would for a very large chip. We can also build them on different technologies. We can have some that are on older technologies, where transistors are cheaper, and we can combine those with other chiplets that might be leading-edge nodes where we could have general-purpose CPUs or NPU accelerators. There’s a mix-and-match, and we can do chiplets smaller than we can general-purpose chips. We can do smaller runs of them. We can take that IP and customize it for a particular market vertical and create some chiplets for that, change the configuration a bit, and do another run for something else. There’s a level of customization that can be deployed and supported by the market that’s a little bit more than we’ve seen in full-size chips, where the entire thing has to be built into one package.”

Conclusion
What it means for a design to be general-purpose or custom is changing. All designs will contain some of each. Some companies will develop novel architectures using general-purpose processors, and these will be better than a fully general-purpose solution. Others will create highly customized hardware for some functions that are known to be stable, and general-purpose hardware for things that are likely to change. One thing has never changed, however. A company is not likely to add more customization than necessary to satisfy the needs of the market it is targeting.

Further Reading
Challenges With Chiplets And Power Delivery
Benefits and challenges in heterogeneous integration.
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

The post Will Domain-Specific ICs Become Ubiquitous? appeared first on Semiconductor Engineering.

Reset Domain Crossing Verification

By Reetika and Sulabh Kumar Khare, Siemens EDA DI SW

To meet low-power and high-performance requirements, system on chip (SoC) designs are equipped with several asynchronous and soft reset signals. These reset signals help safeguard software and hardware functional safety, as they can be asserted to quickly return the system to a known initial state and clear any pending errors or events.

By definition, a reset domain crossing (RDC) occurs when a path’s transmitting flop has an asynchronous reset, and the receiving flop either has a different asynchronous reset than the transmitting flop or has no reset. The multitude of asynchronous reset sources found in today’s complex automotive designs means there are a large number of RDC paths, which can lead to systematic faults and hence cause data corruption, glitches, metastability, or functional failures — along with other issues.
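To make the definition concrete, here is a minimal SystemVerilog sketch of such a path (module and signal names are invented for illustration). The transmitting flop is cleared by rst_a_n and the receiving flop by rst_b_n; asserting rst_a_n alone changes tx_q asynchronously to clk, so the receiver can sample it mid-transition.

```systemverilog
// Hypothetical example of an unsafe RDC path; all names are invented.
module rdc_path_example (
  input  logic clk,
  input  logic rst_a_n,  // asynchronous reset, domain A (transmitter)
  input  logic rst_b_n,  // asynchronous reset, domain B (receiver)
  input  logic d,
  output logic rx_q
);
  logic tx_q;

  // Transmitting flop, reset domain A
  always_ff @(posedge clk or negedge rst_a_n)
    if (!rst_a_n) tx_q <= 1'b0;
    else          tx_q <= d;

  // Receiving flop, reset domain B: the tx_q -> rx_q path is an RDC.
  // When rst_a_n asserts asynchronously to clk, rx_q can capture the
  // change inside its setup/hold window and go metastable.
  always_ff @(posedge clk or negedge rst_b_n)
    if (!rst_b_n) rx_q <= 1'b0;
    else          rx_q <= tx_q;
endmodule
```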

This issue is not covered by standard static verification methods, such as clock domain crossing (CDC) analysis. Therefore, a proper reset domain crossing verification methodology is required to prevent errors in the reset design during the RTL verification stage.

A soft reset is an internally generated reset (register/latch/black-box output is used as a reset) that allows the design engineer to reset a specific portion of the design (specific module/subsystem) without affecting the entire system. Design engineers frequently use a soft reset mechanism to reset/restart the device without fully powering it off, as this helps to conserve power by selectively resetting specific electronic components while keeping others in an operational state. A soft reset typically involves manipulating specific registers or signals to trigger the reset process. Applying soft resets is a common technique used to quickly recover from a problem or test a specific area of the design. This can save time during simulation and verification by allowing the designer to isolate and debug specific issues without having to restart the entire simulation. Figure 1 shows a simple soft reset and its RTL to demonstrate that SoftReg is a soft reset for flop Reg.

Fig. 1: SoftReg is a soft reset for register Reg.
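Below is a minimal RTL sketch of the structure the figure describes, with the surrounding control logic assumed (the soft_clr input and the reset polarities are assumptions; the SoftReg and Reg names follow the figure). SoftReg is itself a flop, and its output is used as an asynchronous clear for Reg, which makes it a soft reset by the definition above.

```systemverilog
// A sketch of the Fig. 1 structure under assumed names and polarities.
module soft_reset_example (
  input  logic clk,
  input  logic rst_n,     // primary asynchronous reset
  input  logic soft_clr,  // software-controlled clear request (assumed)
  input  logic d,
  output logic Reg_q
);
  logic SoftReg;

  // SoftReg is a register output used as a reset, i.e., a soft reset
  always_ff @(posedge clk or negedge rst_n)
    if (!rst_n) SoftReg <= 1'b0;
    else        SoftReg <= soft_clr;

  // Reg is cleared asynchronously by either the primary reset or SoftReg
  always_ff @(posedge clk or negedge rst_n or posedge SoftReg)
    if (!rst_n)       Reg_q <= 1'b0;
    else if (SoftReg) Reg_q <= 1'b0;
    else              Reg_q <= d;
endmodule
```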

This article presents a systematic methodology to identify unsafe RDCs caused by different soft resets, even when the asynchronous reset domain is the same at the transmitter and receiver ends. With adequate debug aids, it also identifies the RDCs across different asynchronous reset domains that are safe (from metastability, provided they meet static timing analysis), which helps avoid silicon failures and minimizes false crossing results. As part of static analysis, this systematic methodology enables designers to intelligently identify critical reset domain bugs associated with soft resets.

A methodology to identify critical reset domain bugs

With highly complex reset architectures in automotive designs, there arises the need for a proper verification method to detect RDC issues. It is essential to detect unsafe RDCs systematically and apply appropriate synchronization techniques to tackle the issues that may arise due to delays in reset paths caused by soft resets. Thus designers can ensure proper operation of their designs and avoid the associated risks. By handling RDCs effectively, designers can mitigate potential issues and enhance the overall robustness and performance of a design. This systematic flow involves several steps to assist in RDC verification closure using standard RDC verification tools (see figure 2).

Fig. 2: Flowchart of methodology for RDC verification.

Specification of clock and reset signals

Signals that are intended to generate a clock and reset pulse should be specified by the user as clock or reset signals, respectively, during the set-up step in RDC verification. By specifying signals as clocks or resets (according to their expected behavior), designers can perform design rule checking and other verification checks to ensure compliance with clock and reset related guidelines and standards as well as best practices. This helps identify potential design issues and improve the overall quality of the design by reducing noise in the results.

Clock detection

Ideally, design engineers should define the clock signals, and the verification tool should then trace these clocks down to the leaf clocks. Unfortunately, with complex designs this is not always possible, as the design might have black boxes that originate clocks, or combinational logic in the clock paths, so the user-specified clocks do not cover all the clocks in the design. All unspecified clocks need to be identified and mapped to the user-specified primary clocks. Exhaustive clock detection is required in RDC verification because potential metastability may occur if a reset is used in a different clock domain than the sequential element itself, leading to critical bugs.

Reset detection

Ideally, design engineers should define the reset signals, but again, due to the complexity of automotive and other modern designs, it is not possible to specify all the reset signals. Therefore, a specialized verification tool is required for detection of resets. All the localized, black-box, gated, and primary resets need to be identified and, based on their usage in the RTL, classified as synchronous, asynchronous, or dual type, then mapped to the user-specified primary resets.

Soft reset detection

The soft resets — i.e., the internally generated resets by flops and latches — need to be systematically detected as they can cause critical metastability issues when used in different clock domains, and they require static timing analysis when used in the same clock domain. Detecting soft resets helps identify potential metastability problems and allows designers to apply proper techniques for resolving these issues.

Reset tree analysis

Analysis of reset trees helps designers identify issues early in the design process, before RDC analysis. It helps to highlight some important errors in the reset design that are not commonly caught by lint tools. These include:

  • Dual-synchronicity reset signals, i.e., a reset signal that is sampled as a synchronous reset by one flop and as an asynchronous reset by another
  • An asynchronous set/reset signal used as a data signal, which can result in incorrect data sampling because the reset state cannot be controlled (see the sketch below)
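As a hypothetical illustration of the second issue (names invented), assume rst_n acts as an asynchronous reset elsewhere in the design but is also sampled here as ordinary data:

```systemverilog
// rst_n is an asynchronous reset elsewhere in the design, but here it
// feeds a D input. Because its assertion is asynchronous to clk, the
// flop may sample it while it is changing, so the captured value is
// unpredictable and the reset state of the path cannot be controlled.
module reset_used_as_data (
  input  logic clk,
  input  logic rst_n,
  output logic sampled
);
  always_ff @(posedge clk)
    sampled <= rst_n;  // asynchronous reset signal used as a data signal
endmodule
```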

Reset domain crossing analysis

This step involves analyzing a design to determine the logic across various reset domains and identify potential RDCs. The analysis should also identify common reset sequences of asynchronous and soft reset sources at the transmitter and receiver registers of the crossings, to avoid reporting false crossings that might appear as potential issues due to complex combinations of reset sources. False crossings are those where the resets of the transmitter and receiver registers are asserted simultaneously, due to dependencies among the reset assertion sequences, so any metastability that might occur at the receiver end is mitigated.

Analyze and fix RDC issues

The concluding step is to analyze the results of the verification steps and determine whether data paths crossing reset domains are safe from metastability. For the RDCs identified as unsafe — which may occur either due to different asynchronous reset domains at the transmitter and receiver ends, or due to a soft reset being used in a different clock domain than the sequential element itself — design engineers can eliminate or mitigate metastability by restructuring the design, modifying the reset synchronization logic, or adjusting the reset ordering. RDCs that are traditionally considered safe — i.e., crossings where a soft reset is used in the same clock domain as the sequential element itself — still need to be verified using static timing analysis.
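As one sketch of such a fix, reusing the assumed names from the earlier example plus a hypothetical iso_en control: the crossing can be isolated while the transmitter’s reset may be active, so the receiver never samples the asynchronous reset edge. This is one possible mitigation, not the only technique.

```systemverilog
// Gate the RDC path so the receiver holds its value while the
// transmitter's reset may be active. The reset controller is assumed
// to drop iso_en before asserting rst_a_n and raise it again only
// after rst_a_n deasserts, i.e., reset ordering is enforced around
// the crossing.
always_ff @(posedge clk or negedge rst_b_n)
  if (!rst_b_n)     rx_q <= 1'b0;
  else if (!iso_en) rx_q <= rx_q;  // hold last safe value during tx reset
  else              rx_q <= tx_q;
```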

Figure 3 presents our proposed flow for identifying and eliminating metastability issues due to soft resets. After implementing the RDC solutions, re-verify the design to ensure that the reset domain crossing issues have been effectively addressed.

Fig. 3: Flowchart for proposed methodology to tackle metastability issues due to soft resets.

This methodology was used on a design with 374,546 register bits, 8 latch bits, and 45 RAMs. The Questa RDC verification tool using this new methodology identified around 131 reset domains, including 19 asynchronous domains defined by the user as well as 81 asynchronous reset domains inferred by the tool.

The first run analyzed data paths crossing asynchronous reset domains without any soft reset analysis. It reported nearly 40,000 RDC crossings (as shown in table 1).

Reset domain crossings without soft reset analysis | Severity | Number of crossings
Reset domain crossing from a reset to a reset | Violation | 28408
Reset domain crossing from a reset to non-reset | Violation | 11235

Table 1: RDC analysis without soft resets.

In the second run, we did soft reset analysis and detected 34 soft resets, which resulted in additional violations for RDC paths with transmitter soft reset sources in different clock domains. These were critical violations that were missed in the initial run. Also, some RDC violations were converted to cautions (RDC paths with a transmitter soft reset in the same clock domain) as these paths would be safe from metastability as long as they meet the setup time window (as shown in table 2).

Reset domain crossings with soft reset analysis | Severity | Number of crossings
Reset domain crossing from a reset to a reset | Violation | 26957
Reset domain crossing from a reset to non-reset | Violation | 10523
Reset domain crossing with tx reset source in different clock | Violation | 880
Reset domain crossing from a reset to Rx with same clock | Caution | 2412

Table 2: RDC analysis with soft resets.

To gain a deeper understanding of RDC, metastability, and soft reset analysis in the context of this new methodology, please download the full paper, “Techniques to identify reset metastability issues due to soft resets.”

The post Reset Domain Crossing Verification appeared first on Semiconductor Engineering.

SRAM Security Concerns Grow

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with network-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felten, director of Princeton University’s Center for Information Technology Policy, J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan, and colleagues. The breakthrough in their research was based on the growing realization in the engineering research community that data does not vanish from memory the moment a device is turned off, which until then was a common assumption. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”

Unfortunately, despite nearly 20 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. And because cold boot takes advantage of slowed memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST) described a countermeasure: an ultra-fast data sanitization method that works within 5 ns, using back bias to control the device parameters of CMOS.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data were reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, has inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.’”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”

These precautions also apply when working with DRAM. In all situations, Seshadri said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as a second layer of protection against known and novel attacks, so an organization’s assets will still be protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the system’s case or hardware, preventing an attacker’s access,” said Jim Montgomery, market development director, semiconductor at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that regardless of trying to dump the memory, or physically removing the memory, the encryption keys will remain secure.”

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria, based upon SEMI E187 and E188 standards, to assist DMs and OEMs in implementing secure procedures for systems security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem, in the post-quantum era, is that encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption (HE), which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM.  However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, so HE solutions designed for cryptographers can be impractical. Solutions that require algorithmic and cryptographic expertise inhibit adoption by those who lack those skills.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted widespread use as soon as the next few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM. The reason is that it’s far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech, and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, decisions about how much security to include in a design are economic ones. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM”, Technical report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. Han, SJ., Han, JK., Yun, GJ. et al. Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack. Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

Software-Defined Vehicle Momentum Grows

Experts at the Table: The automotive ecosystem is undergoing a transformation toward software-defined vehicles, spurring new architectures with more software. Semiconductor Engineering sat down to discuss the impact of these changes with Suraj Gajendra, vice president of products and solutions in Arm‘s automotive line of business; Chuck Alpert, R&D automotive fellow at Cadence; Steve Spadoni, zone controller and power distribution application manager at Infineon; Rebeca Delgado, chief technology officer and principal AI engineer at Intel Automotive; Cyril Clocher, senior director in the automotive product line for high-performance computing at Renesas; David Fritz, vice president, hybrid and virtual systems at Siemens EDA; and Marc Serughetti, senior director, systems design group at Synopsys. What follows are excerpts of that discussion.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

SE: The automotive ecosystem is undergoing a technology evolution the likes of which has not been seen, including the move to software-defined vehicles. To set a baseline for this discussion, what is your definition of an SDV?

Gajendra: A software-defined vehicle is a concept, a trend, an idea, where the whole ecosystem can drive new capabilities and new user experiences into the car, even after it rolls out of the showroom or dealership. It’s a pretty loaded concept. There’s a lot of infrastructure that needs to come together, such as software development in the cloud, seamless deployment of that software development onto the car, the whole deployment of over-the-air updates, and the connectivity. In short, the concept of a software-defined vehicle is expecting a world where we can drive new experiences, new capabilities, and new features into the car throughout its lifetime.

Alpert: In thinking about what SDV means, one example is the battery — especially in an EV. I’m not talking about the technology of the battery that’s evolved, but rather the idea that in the past when you wanted to charge your car in your garage and you were worried about starting a fire, you’d think, ‘No, don’t do that because your whole house could burn down.’ The idea is that in the past, maybe we might put a temperature sensor on the battery, but now we actually have software that can monitor it. It might even have AI to predict if the battery is reaching some state that might cause a fire in the future. You also might have something that connects to the power grid and learns when is a good time to charge, because it’s a low-usage period so it’s cheaper. This is just one part of the car, but you can imagine a whole bunch of software that you want to put on top of it in order to connect to the universe. You need a software-defined vehicle platform in order for this, or in all the other parts of your car, to communicate with the world and provide the best user experience.

Spadoni: Infineon’s definition of a software-defined vehicle is a redefining of architecture — specifically, electrical and electronic architecture, feature allocation, and the entire topology of the vehicle, from power generation and storage to power distribution and high compute. It really means new electrical architectures, and it has consequences for the business model of every OEM and Tier 1 involved. It’s a major change from the methodologies of the last 30 years.

Delgado: A software-defined vehicle is not just over-the-air updates. It’s truly a new methodology and a new philosophy for how to architect every ingredient of the vehicle to continue to deliver value over time, in which the value is very tightly attached to the software that delivers the user experience. Ultimately, this architecture must enable the different practices on how to deliver this new value over time. What’s very interesting is that these practices of moving to software-defined architecture have been done by many other industries already. Intel has a ton of heritage, and actually helped those industries transform. That transformation is truly what we’re observing here. It’s an incredible opportunity, and possibly a crisis if not done right.

Clocher: To apply an analogy here, the car is the new smartphone. But for us, it’s more than that. I’ve heard about the platform, yes, and it’s the major architecture evolution that we’ll see in the next decade. For us at Renesas, it will be a journey that will take time to enhance the user experience, to generate new revenue streams for the industry as it moves from decentralized to centralized classic compute with zonal architecture. We can apply all those buzzwords to a software-defined vehicle. Those platforms will need big computers and heavy, complex hardware solutions, and this will generate evolutions, upgrades to the car during its entire lifetime, but underneath we know — at least at Renesas, and certainly at some other players and silicon vendors — that this will need a huge amount of hardware resources to manage what we have in mind to deploy this platform.

Fritz: I see software-defined vehicles a bit differently than what’s been mentioned so far. For many years, you’d have the hardware team doing their design, and the software team doing their design, and it all needs to come together. There’s an English natural language discussion about what needs to happen, and as we all know, that never really goes terribly well. In automotive that becomes an integration storm, and it is a nightmare. With the new compute requirements that have been mentioned already, that just compounds the issue. So the way I see this is that we tend, as people who have an engineering background, to dive into how we’re going to do things. We hear ‘software-defined vehicle,’ we immediately think about how to do that. There’s not a lot of thought about why it needs to be done, and what needs to happen. We jump into the ‘how’ too early, and a lot of the discussion here is exemplary of that kind of approach. When I’m looking at software-defined vehicles, I’m looking at why it’s important that the software needs to run effectively on a piece of hardware. And for that hardware, why is it important for it to actually operate properly on the software? Then you can decide how to put together a new methodology that’s going to bring those things together. In the past, it’s been called hardware/software co-design. There have been attempts many times, and as has been mentioned, other industries have made this transition. What’s unique about automotive is that it’s not just one transition that needs to happen. It’s hundreds or thousands of transitions. The ecosystem needs to be turned upside down, which we’re seeing happen right now, and you need to bring all that together. It really is a methodology where you need the tooling, you need the processes, you need the thinking, you need the organizations to change so that they can make this transition in a realistic way. SDV is a huge transition. It is a way for the automotive industry to morph into something that has longevity and can meet customer expectations, which it really hasn’t met for some time now.

Serughetti: At the end of the day, if we look starting at the top from our perspective, SDV is a means to bring and enhance the car experience for the customer. That’s the end result that the OEMs look at, but they look at it from the perspective of how that improves the OEM efficiencies, and how that creates new business opportunities. The way we look at it, and what’s important, is the impact it has on the industry, the impact on the processes, on the methodologies, on the people, on the ecosystem, on the technology. It’s really a transformation of the automotive market that is going to fundamentally change how the industry moves forward and bring the OEM into a world in which they are really looking at how they become efficient in delivering cars, how they bring new features, but at the same time, how they evolve their business as well.

SE: As you’ve all described, SDV requires many inter-dependencies, and the entire ecosystem has to have an understanding of the ‘why,’ which should then lead back to laying out the plan for how to get there. Where does the ecosystem stand today in terms of realizing SDV?

Fritz: OEMs have decided in the last few years that they’ve got to take control of their own destiny. They cannot simply take what the suppliers provide. They need a methodology — like this whole SDV concept, and any tooling necessary to provide that — to push down into their suppliers, such that, ‘Here’s what I need. If you can’t do this for me, I will go find someone that will.’ This is not the old ecosystem that bubbled up from the IP to the Tier 2s, to the Tier 1s, and then to the OEMs, which gave them limited choices to choose from. So when I say, “Turn the ecosystem upside down,” that’s what is happening. But every OEM has their own ecosystem, and they’re not all in the same place. Even region-to-region, they can be very different.

Delgado: This is a critical discussion, and effectively where the industry has to eventually settle. The magnitude of the transformation of the ecosystem includes roles in the technology evolution. The silicon content in the vehicle that defines the end user’s in-cabin experience is expected to quadruple over the next few years. At the end of the day, the complexity of the transition of roles is of such magnitude that the proprietary, fragmented, and broken approaches that David articulated are really not going to enable the industry to transform at the speed it requires to deliver and meet the experiences. But more than anything, they are not going to address the actual technology changes necessary to implement and allow for this value delivery mechanism. At the end of the day, this is where Intel really believes collaboration is key, and anybody who wants to participate in this ecosystem must provide scalability — also known as top-to-bottom support of the different product lines that our OEMs and Tier 1s are having to support, versus a broken-up approach on these ever-evolving higher-performance compute needs. It has to be future-proof, because you’re going to launch the vehicle eventually. So certain hardware has to be future-proofed to a certain affordability envelope, and there has to be a strategy around that. And then the ecosystem and that collaboration must be able to deliver that aggregation. It has to be done with certain anchoring technology that will allow us to deliver that performance. Collaboration is key in the sense that these technologies cannot be single-handedly defined, developed, and integrated by OEMs in silos with a proprietary end-to-end architecture definition. There obviously will be differentiations on the actual implementation, but the technologies at large have to have a sense of reuse, particularly from other verticals that have already done software-defined transformations and then tuned in the right ways toward the automotive requirements.

Spadoni: There are probably a wide variety of implementations. At Infineon, we partner with OEMs and Tier 1s and we see different approaches. For example, General Motors has more of a modular approach that emulates what happened in the mobile phone space. It seems that Ford has a more pragmatic approach, along with Stellantis, but all of them are facing very similar challenges in that affordability has become a big problem. There are multiple generations of implementations that are going to occur, and you’ll see a striving toward how to pay for this extra hardware. It leads to tradeoffs in implementations of other systems that have to have savings in order for them to afford these vehicles. No one ever goes into a dealership and says, ‘Give me a software-defined vehicle.’ Everyone’s looking for value, and you can see it now with volumes going down. There’s a saturation of people buying at the high level. The OEMs want to get more sales, which means they’ll have to go to the lower-cost-value vehicles, and that’s going to affect the electrical and electronic architectures and the software-defined vehicle.

Clocher: What we’re seeing I would summarize as the impact on the ecosystem. We’re moving to an OEM-centric ecosystem. One size does not fit all, meaning OEMs will have their different tastes, their different definitions of levels of integration they want to have in their software-defined vehicle — especially given the more complex tasks that we all have to do, and the challenges we have to solve, because we’re not talking about a common umbrella of software-defined vehicle. It really does mean different implementations and different meanings for OEM A versus OEM B. I would fully agree with David and Steve that we are far from having a common understanding of, at least, the market itself. And that’s fine, because this will bring differentiation, and ultimately that’s why a customer will go to Dealership A versus Dealership B. This is what the industry wants to see — continue to differentiate, continue to add value to the ultimate product, which is the car.

Serughetti: The important point in all this is, of course, you’re breaking the model that exists today. That’s one of the big challenges. We used to have Tier 1s that were building boxes, and delivering software. This was a complete black box. When it would go to integration, there were all sorts of problems. And now you’re going to break this? The challenge for the OEM is how they do this. They want to control software, but are they equipped to do this today? We see the problems today that some of the legacy OEMs have in setting up their software organizations, the challenges of CARIAD and all such organizations that are trying to do this. It’s not easy to change those companies. Of course, the new entrants don’t have this problem because they are coming from a brand new design versus the ones that deal with legacy. So for the OEM, it’s about how to take control of the software. What does that mean in terms of the processes, in terms of agile development, digital twins, and all of these technologies everybody’s talking about? The other side is, ‘It’s all nice, this software,’ but this software runs on all the companies that are delivering hardware, and that becomes essential to it. You can have the best software, but if your hardware is not there to support performance, power, and all of those aspects, you’re not going to be successful. So the ecosystem is evolving how hardware, software, and all of this comes together. The OEM wants to be the central point. That’s what we’re talking about in terms of the process methodology aspects that are making this transition evolve.

Gajendra: Where are we in this journey? How far have we come? And where are we going? Going back to the point that David mentioned earlier about the supply chain evolving and being turned upside down, five years ago, if we sat here in this sort of a panel and discussed software-defined vehicles, the conversation would have been entirely different. It would have been stuck with the traditional supply chain that we’ve seen for the last 35 or 40 years in the automotive industry. There are fundamentally two aspects here. The supply chain is evolving, and the infrastructure that we, as a community — this team, for example, and many others in the community — are trying to enable is going to be key to making our EDA partners happy. The use of virtual platforms today in the cloud to try and shift left and develop and validate some of these technologies and software wasn’t even there five years ago, so we’ve come a long way. We’ve made a lot of progress together as an industry. Yes, we have a long way to go until we actually have a truly software-defined vehicle, until we can go into a dealership and ask for one. But we are seeing all sorts of technology providers trying to make sure that the technology we will eventually have in the hardware is provided in some sort of virtual form, be it fast models or whatever it is in the cloud. For the vast majority of the software ecosystem in automotive, this is a big change. I was at Embedded World, and the number of virtual platforms and demos that people were actually showing — silicon partners like we have here, Intel, Renesas, Infineon, EDA companies — pointed to a strong movement of, ‘Let’s build the infrastructure that we can build, and then provide that infrastructure to the OEMs to take it from there.’ There is a lot of work going on. Together we will make the infrastructure across the board, be it virtual platforms or others, richer and more capable.

Alpert: For sure, OEMs have to control their own destiny. In the past, they would do it by differentiating maybe because they had better engine performance, or some other feature. But going forward, the differentiation is going to be their software. Whoever can make software that will provide additional value, and brand it, that’s going to be the differentiator and that’s the trend. In terms of how you get there, a shared ecosystem is important. SOAFEE is a potential way that, together with virtual platforms, you can provide a shared ecosystem for development, but still allow everyone to differentiate and plug-and-play. That’s one reason we’re working closely with Arm on trying to have a reference design specifically for this purpose. But again, we’re not saying, ‘This is the design you use. This is how you do it.’ That’s not it. The point is, let’s start somewhere, and then people can start swapping out pieces and doing different things. As long as OEMs can plug-and-play, then they can still differentiate. But they don’t have to invent everything themselves, which would be too costly.

Related Reading
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post Software-Defined Vehicle Momentum Grows appeared first on Semiconductor Engineering.

SRAM Security Concerns Grow

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with network-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felton, director of Princeton University’s Center for Information Technology Policy, J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan, and colleagues. The breakthrough in their research was based on the growing realization in the engineering research community that data does not vanish from memory the moment a device is turned off, which until then was a common assumption. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”

Unfortunately, despite nearly 20 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. And because cold boot takes advantage of slowing down memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST), described a way to undo the slowdown with an ultra-fast data sanitization method that worked within 5 ns, using back bias to control the device parameters of CMOS.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, have inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”
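The anti-rollback mechanism Yun describes can be sketched in a few lines. The fuse-access callables below are hypothetical placeholders for vendor-specific OTP primitives; the key point is that a fuse write is one-way, so the recorded minimum version can only increase.

```python
def verify_image(image_version: int, read_min_version, burn_min_version) -> bool:
    """Refuse any firmware image older than the minimum version recorded
    in OTP fuses, blocking reinstallation of old, vulnerable firmware."""
    min_allowed = read_min_version()          # monotonic floor stored in OTP
    if image_version < min_allowed:
        return False                          # rollback attempt, refuse to boot
    if image_version > min_allowed:
        burn_min_version(image_version)       # one-way fuse update
    return True
```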

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.’”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”
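One simple form such an authenticity check can take is a challenge-response exchange. The sketch below assumes a pre-provisioned shared secret and a respond() callable standing in for the chiplet’s on-die logic; production schemes would more likely use per-die certificates and PUF-derived keys.

```python
import hashlib, hmac, os

def authenticate_chiplet(respond, shared_secret: bytes) -> bool:
    """Issue a fresh random challenge and verify the chiplet's keyed
    response; a counterfeit part without the secret cannot answer."""
    challenge = os.urandom(16)
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(respond(challenge), expected)
```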

These precautions also apply when working with DRAM. In all situations, Seshadri said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as a second layer of protection against known and novel attacks, so an organization’s assets will still be protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the systems case or hardware preventing an attacker’s access,” said Jim Montgomery, market development director, semiconductor at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that regardless of trying to dump the memory, or physically removing the memory, the encryption keys will remain secure.”
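A minimal software illustration of that principle, using AES-GCM from Python’s cryptography package. In a real system the key would live in a hardware keystore rather than in the memory being protected, which is what makes a dumped image useless to the attacker.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # held in hardware in practice
nonce = os.urandom(12)
page = b"secret contents of one memory page"
ciphertext = AESGCM(key).encrypt(nonce, page, None)
# An attacker who cold-boots or physically removes the memory sees only ciphertext.
assert AESGCM(key).decrypt(nonce, ciphertext, None) == page
```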

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria based upon SEMI E187 and E188 standards to assist DMs and OEMs in implementing secure procedures for systems security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem, in the post-quantum era, is that encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption [HE], which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.
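The core property is easy to demonstrate in toy form. Textbook RSA happens to be multiplicatively homomorphic, so multiplying two ciphertexts yields the ciphertext of the product with no decryption step. This is purely illustrative; textbook RSA is insecure, and practical FHE schemes such as BGV or CKKS are far more involved.

```python
# Tiny textbook RSA public key: p = 61, q = 53, n = 3233, e = 17.
n, e = 3233, 17

def enc(m: int) -> int:
    return pow(m, e, n)

a, b = 12, 5
# Multiplying ciphertexts equals encrypting the product of the plaintexts.
assert (enc(a) * enc(b)) % n == enc(a * b)
```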

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM. However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, so HE solutions designed for cryptographers can be impractical. Solutions that demand algorithmic and cryptographic expertise inhibit adoption by those who lack it.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted, widespread use could follow within a few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM. The reason is that it’s far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech, and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, the decision about how much security to include in a design is an economic one. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM”, Technical report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. Han, SJ., Han, JK., Yun, GJ. et al. Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack. Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

Software-Defined Vehicle Momentum Grows

Experts at the Table: The automotive ecosystem is undergoing a transformation toward software-defined vehicles, spurring new architectures with more software. Semiconductor Engineering sat down to discuss the impact of these changes with Suraj Gajendra, vice president of products and solutions in Arm’s automotive line of business; Chuck Alpert, R&D automotive fellow at Cadence; Steve Spadoni, zone controller and power distribution application manager at Infineon; Rebeca Delgado, chief technology officer and principal AI engineer at Intel Automotive; Cyril Clocher, senior director in the automotive product line for high-performance computing at Renesas; David Fritz, vice president, hybrid and virtual systems at Siemens EDA; and Marc Serughetti, senior director, systems design group at Synopsys. What follows are excerpts of that discussion.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

SE: The automotive ecosystem is undergoing a technology evolution the likes of which has not been seen, including the move to software-defined vehicles. To set a baseline for this discussion, what is your definition of an SDV?

Gajendra: A software-defined vehicle is a concept, a trend, an idea, where the whole ecosystem can drive new capabilities and new user experiences into the car, even after it rolls out of the showroom or dealership. It’s a pretty loaded concept. There’s a lot of infrastructure that needs to come together, such as software development in the cloud, seamless deployment of that software development onto the car, the whole deployment of over-the-air updates, and the connectivity. In short, the concept of a software-defined vehicle is expecting a world where we can drive new experiences, new capabilities, and new features into the car throughout its lifetime.

Alpert: In thinking about what SDV means, one example is the battery — especially in an EV. I’m not talking about the technology of the battery that’s evolved, but rather the idea that in the past when you wanted to charge your car in your garage and you were worried about starting a fire, you’d think, ‘No, don’t do that because your whole house could burn down.’ The idea is that in the past, maybe we might put a temperature sensor on the battery, but now we actually have software that can monitor it. It might even have AI to predict if the battery is reaching some state that might cause a fire in the future. You also might have something that connects to the power grid and learns when is a good time to charge, because it’s a low-usage period so it’s cheaper. This is just one part of the car, but you can imagine a whole bunch of software that you want to put on top of it in order to connect to the universe. You need a software-defined vehicle platform in order for this, or in all the other parts of your car, to communicate with the world and provide the best user experience.

Spadoni: Infineon’s definition of a software-defined vehicle is a redefining of architecture — specifically, electrical and electronic architecture, feature allocation, and the entire topology of the vehicle, from power generation and storage to power distribution and high compute. It really means new electrical architectures, and it has consequences for the business model of every OEM and Tier 1 involved. It’s a major change to previous methodologies in the last 30 years.

Delgado: Software-defined vehicle is not just over-the-air updates. It’s truly a new methodology and a new philosophy for how to architect every ingredient of the vehicle to continue to deliver value over time, in which the value is very tightly attached to the software that delivers the user experience. Ultimately, this architecture must enable the different practices on how to deliver this new value over time. What’s very interesting is that these practices of moving to software-defined architectures have been adopted by many other industries already. Intel has a ton of heritage, and actually helped those industries transform. That transformation is truly what we’re observing here. It’s an incredible opportunity, and possibly a crisis if not done right.

Clocher: To apply an analogy here, the car is the new smartphone. But for us, it’s more than that. I’ve heard about the platform, yes, and it’s the major architecture evolution that we’ll see in the next decade. For us at Renesas, it will be a journey that will take time to enhance the user experience, to generate new revenue streams for the industry as it moves from decentralized to centralized classic compute with zonal architecture. We can apply all those buzzwords to a software-defined vehicle. Those platforms will need big computers and heavy, complex hardware solutions, and this will generate evolutions and upgrades to the car during its entire lifetime. But underneath we know — at least at Renesas, and certainly at some other players and silicon vendors — that this will need a huge amount of hardware resources to manage what we have in mind to deploy this platform.

Fritz: I see software-defined vehicles a bit differently than what’s been mentioned so far. For many years, you’d have the hardware team doing their design, and the software team doing their design, and it all needs to come together. There’s an English natural language discussion about what needs to happen, and as we all know, that never really goes terribly well. In automotive that becomes an integration storm, and it is a nightmare. With the new compute requirements that have been mentioned already, that just compounds the issue. So the way I see this is that we tend, as people who have an engineering background, to dive into how we’re going to do things. We hear ‘software-defined vehicle,’ we immediately think about how to do that. There’s not a lot of thought about why it needs to be done, and what needs to happen. We jump into the ‘how’ too early, and a lot of the discussion here is exemplary of that kind of approach. When I’m looking at software-defined vehicles, I’m looking at why it’s important that the software needs to run effectively on a piece of hardware. And for that hardware, why is it important for it to actually operate properly on the software? Then you can decide how to put together a new methodology that’s going to bring those things together. In the past, it’s been called hardware/software co-design. There have been attempts many times, and as has been mentioned, other industries have made this transition. What’s unique about automotive is that it’s not just one transition that needs to happen. It’s hundreds or thousands of transitions. The ecosystem needs to be turned upside down, which we’re seeing happen right now, and you need to bring all that together. It really is a methodology where you need the tooling, you need the processes, you need the thinking, you need the organizations to change so that they can make this transition in a realistic way. SDV is a huge transition. It is a way for the automotive industry to morph into something that has longevity and can meet customer expectations, which it really hasn’t met for some time now.

Serughetti: At the end of the day, if we look starting at the top from our perspective, SDV is a means to bring and enhance the car experience for the customer. That’s the end result that the OEMs look at, but they look at it from the perspective of how that improves the OEM efficiencies, and how that creates new business opportunities. The way we look at it, and what’s important, is the impact it has on the industry, the impact on the processes, on the methodologies, on the people, on the ecosystem, on the technology. It’s really a transformation of the automotive market that is going to fundamentally change how the industry moves forward and bring the OEM into a world in which they are really looking at how they become efficient in delivering cars, how they bring new features, but at the same time, how they evolve their business as well.

SE: As you’ve all described, SDV requires many inter-dependencies, and the entire ecosystem has to have an understanding of the ‘why,’ which should then lead back to laying out the plan for how to get there. Where does the ecosystem stand today in terms of realizing SDV?

Fritz: OEMs have decided in the last few years that they’ve got to take control of their own destiny. They cannot simply take what the suppliers provide. They need a methodology — like this whole SDV concept, and any tooling necessary to provide that — to push down into their suppliers, such that, ‘Here’s what I need. If you can’t do this for me, I will go find someone that will.’ This is not the old ecosystem that bubbled up from the IP to the Tier 2s, to the Tier 1s, and then to the OEMs, which gave them limited choices to go from. So when I say, “Turn the ecosystem upside down,” that’s what is happening. But every OEM has their own ecosystem, and they’re not all in the same place. Even region-to-region, they can be very different.

Delgado: This is a critical discussion, and effectively where the industry has to eventually settle. The magnitude of the transformation of the ecosystem includes roles in the technology evolution. The silicon content is expected to quadruple over the next few years in the vehicle for defining the in-cabin experience of the end user. At the end of the day, the complexity of the transition of roles is of such magnitude that the proprietary, fragmented, and broken approaches that David articulated are really not going to enable the industry to transform at the speed it requires to deliver and meet the experiences. But more than anything, they are not going to address the actual technology changes necessary to implement and allow for this value delivery mechanism. At the end of the day, this is where Intel really believes collaboration is key, and anybody who wants to participate in this ecosystem must provide scalability — also known as top-to-bottom support of the different product lines that our OEMs and Tier 1s are having to support, versus a broken-up approach on these ever-evolving higher-performance compute needs. It has to be future-proof, because you’re going to launch the vehicle eventually. So certain hardware has to be future-proofed to a certain affordability envelope, and there has to be a strategy around that. And then the ecosystem and that collaboration must be able to deliver that aggregation. It has to be done with certain anchoring technology that will allow us to deliver that performance. Collaboration is key in the sense that these technologies cannot be single-handedly owned, defined, developed, and integrated by OEMs in silos with a proprietary end-to-end architecture definition. There obviously will be differentiations on the actual implementation, but the technologies at large have to have a sense of reuse, particularly from other verticals that have already done software-defined transformations and then tuned in the right ways toward the automotive requirements.

Spadoni: There are probably a wide variety of implementations. At Infineon, we partner with OEMs and Tier 1s and we see different approaches. For example, General Motors has more of a modular approach that emulates what happened in the mobile phone space. It seems that Ford has a more pragmatic approach, along with Stellantis, but all of them are facing very similar challenges in that affordability has become a big problem. There are multiple generations of implementations that are going to occur, and you’ll see a striving toward how to pay for this extra hardware. It leads to tradeoffs in implementations of other systems that have to yield savings in order to make these vehicles affordable. No one ever goes into a dealership and says, ‘Give me a software-defined vehicle.’ Everyone’s looking for value, and you can see it now with volumes going down. There’s a saturation of people buying at the high level. The OEMs want to get more sales, which means they’ll have to go to the lower-cost-value vehicles, and that’s going to affect the electrical and electronic architectures and the software-defined vehicle.

Clocher: What we’re seeing I would summarize as the impact on the ecosystem. We’re moving to an OEM-centric ecosystem. One size does not fit all, meaning OEMs will have their different tastes, their different definitions of levels of integration they want to have in their software-defined vehicle — especially given more complex tasks that we all have to do, rather than the challenge we have to solve, because we’re not talking about a common umbrella of software-defined vehicle. But it really does mean different implementations and different meanings for OEM A from OEM B. I would fully agree with David and Steve that we are far from having a common understanding of, at least, the market itself. And that’s fine, because this will bring differentiation, and ultimately that’s why a customer will go to Dealership A versus Dealership B. This is what the industry wants to see — continue to differentiate, continue to add value to the ultimate product, which is the car.

Serughetti: The important point in all this is, of course, you’re breaking the model that exists today. That’s one of the big challenges. We used to have Tier 1s that were building boxes, and delivering software. This was a complete black box. When it would go to integration, there were all sorts of problems. And now you’re going to break this? The challenge for the OEM is how they do this. They want to control software, but are they equipped to do this today? We see the problems today that some of the legacy OEMs have in setting up their software organizations, the challenges of CARIAD and all such organizations that are trying to do this. It’s not easy to change those companies. Of course, the new entrants don’t have this problem because they are coming from a brand new design versus the ones that deal with legacy. So for the OEM, it’s about how to take control of the software. What does that mean in terms of the processes, in terms of agile development, digital twins, and all of these technologies everybody’s talking about? The other side is, ‘It’s all nice, this software,’ but this software runs on all the companies that are delivering hardware, and that becomes essential to it. You can have the best software, but if your hardware is not there to support performance, power, and all of those aspects, you’re not going to be successful. So the ecosystem is evolving how hardware, software, and all of this comes together. The OEM wants to be the central point. That’s what we’re talking about in terms of the process methodology aspects that are making this transition evolve.

Gajendra: Where are we in this journey? How far have we come? And where are we going? Going back to the point that David mentioned earlier about supply chain evolving and the supply chain turned upside down, five years ago, if we sat here in this sort of a panel and discussed software-defined vehicles, the conversation would have been entirely different. It would have been stuck with the traditional supply chain that we’ve seen for the last 35 or 40 years in the automotive industry. There are fundamentally two aspects here. The supply chain is evolving, and the infrastructure that we, as a community — this team, for example, and many others in the community — are trying to enable is going to be key to making our EDA partners happy. The use of virtual platforms today in the cloud to try and shift left and develop and validate some of these technologies and software wasn’t even there five years ago, so we’ve come a long way. We’ve made a lot of progress together as an industry. Yes, we have a long way to go until we actually have a truly software-defined vehicle that we can go and ask for in the dealership. But the changes we are seeing in terms of all sorts of technology providers trying to make sure that the technology that we eventually will have in the hardware is provided in some sort of virtual form, be it fast models or whatever it is in the cloud, for the vast majority of the software ecosystem in automotive, this is a big change. I was at Embedded World, and the number of virtual platforms and the demos that people were actually showing — silicon partners like we have here, Intel, Renesas, Infineon, EDA companies — pointed to a strong movement of, ‘Let’s build the infrastructure that we can build, and then provide that infrastructure to the OEMs to take it from there.’ There is a lot of work going on. Together we will make the infrastructure across the board, be it virtual platform or others, richer and more capable.

Alpert: For sure, OEMs have to control their own destiny. In the past, they would do it by differentiating maybe because they had better engine performance, or some other feature. But going forward, the differentiation is going to be their software. Whoever can make software that will provide additional value, and brand it, that’s going to be the differentiator and that’s the trend. In terms of how you get there, a shared ecosystem is important. SOAFEE is a potential way that, together with virtual platforms, you can provide a shared ecosystem for development, but still allow everyone to differentiate and plug-and-play. That’s one reason we’re working closely with Arm on trying to have a reference design specifically for this purpose. But again, we’re not saying, ‘This is the design you use. This is how you do it.’ That’s not it. The point is, let’s start somewhere, and then people can start swapping out pieces and doing different things. As long as OEMs can plug-and-play, then they can still differentiate. But they don’t have to invent everything themselves, which would be too costly.

Related Reading
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post Software-Defined Vehicle Momentum Grows appeared first on Semiconductor Engineering.

Blog Review: May 1

Cadence’s Vatsal Patel stresses the importance of having testing and training capabilities for high-bandwidth memory to prevent the entire SoC from becoming useless and points to key HBM DRAM test instructions through IEEE 1500.

In a podcast, Siemens’ Stephen V. Chavez chats with Anaya Vardya of American Standard Circuits about the growing significance of high density interconnect and Ultra HDI technologies, which enable denser component placement and increased signal integrity compared to traditional PCB designs.

Synopsys’ Ian Land and Randy Fish find that silicon lifecycle management is increasingly being used on chips that target the aerospace and government market to ensure system health and longevity.

Arm’s Hristo Belchev looks at how to enable testing of system designs using the Memory Partitioning and Monitoring (MPAM) Arm architecture supplement, which allows privileged software to partition caches, memory controllers and interconnects on the hardware level.

Keysight’s Jonathon Wright considers where generative AI can add value in software testing by proposing a wide range of scenarios and improving communication between different stakeholders.

Ansys’ Laura Carter checks out how simulation is used to reduce the risks to drivers during a crash in stock car racing.

SEMI’s Maria Daniela Perez chats with Owen J. Guy of Swansea University about the challenge of onboarding talent within the microelectronics industry and the importance of ensuring students receive hands-on experience and exposure to real-world applications.

And don’t miss the blogs featured in the latest Systems & Design newsletter:

Technology Editor Brian Bailey suggests that although it is great to see the DAC conference come back to life, EDA companies need to do something about the show floor.

Siemens’ John Ferguson shows how to glean useful information well before all the details of an assembly are known.

Axiomise’s Ashish Darbari explains how formal verification can help improve chips.

Arteris’ Frank Schirrmeister tracks the race to centralized computing in automotive.

Synopsys’ Andrew Appleby explores the co-optimization of foundation IP and design flows for new transistors.

Cadence’s Anika Sunda looks at controlling the access to physical memory addresses.

Keysight’s Ben Coffin digs into how AI will be used in just about every subsystem of 6G networks.

The post Blog Review: May 1 appeared first on Semiconductor Engineering.

Increased Automotive Data Use Raises Privacy, Security Concerns

By John Koon

The amount of data being collected, processed, and stored in vehicles is exploding, and so is the value of that data. That raises questions that are still not fully answered about how that data will be used, by whom, and how it will be secured.

Automakers are competing based on the latest versions of advanced technologies such as ADAS, 5G, and V2X, but the ECUs, software-defined vehicles, and in-cabin monitoring also demand more and more data, and they are using that data for purposes that extend beyond just getting the vehicle from point A to point B safely. They now are vying to offer additional subscription-based services according to customers’ interests, as various entities, including insurance companies, indicate a willingness to pay for information on drivers’ habits.

Collecting this data can help OEMs gain insights and potentially generate additional revenue. However, gathering it raises privacy and security concerns about who will own this massive amount of data and how it should be managed and used. And as automotive data use increases, how will it impact future automotive design?

Fig. 1: Connected vehicles rely on software to communicate between vehicles and the cloud. Source: McKinsey & Co.

“Much of the data generated in the vehicle will have immense value to OEMs and their partners for analyzing driver behavior and vehicle performance and for developing new or enhanced features,” said Sven Kopacz, autonomous vehicle section manager at Keysight Technologies. “On the other hand, the privacy of data use can be viewed as a risk to some. But the real value – as already implemented and used by Tesla and others – is the constant feedback to improve those ADAS algorithms, enable a CI/CD DevOps software development model, and allow the rapid download of updates. Only time will tell if law enforcement and the courts will demand this data and how lawmakers will respond.”

Types of data generated
According to Precedence Research, the global automotive data market size will grow from $2.19 billion in 2022 to $14.29 billion by 2032, with many types of data collected, including:

  • Autonomous driving: Data on all levels, from L1 to L5, including that collected from the multiple sensors installed on vehicles.
  • Infrastructure: Remote monitoring, OTA updates, and data used for remote control by control centers, V2X, and traffic patterns.
  • Infotainment: Information on how customers are using applications, such as voice control, gesture, maps, and parking.
  • Connected information: Information on payment to third-party parking apps, accident information, data from dashboard cameras, handheld devices, mobile applications, and driver behavior monitoring.
  • Vehicle health: Repair and maintenance records, insurance underwriting, fuel consumption, telematics.

This information may be useful for future automotive design, predictive maintenance, and safety improvements, and insurance companies are expected to be able to reduce underwriting costs with more comprehensive information on accidents. Based on the information collected, OEMs should be able to design more reliable and safer cars, and to stay in close touch with customer wants. For example, experiments can be conducted to gauge customer demand for subscription-based services such as automatic parking and more sophisticated voice input and commands.

“Diagnostic data for service and repair has been a core of automotive data analytics for decades,” noted Lorin Kennedy, senior staff product management manager for SLM in-field analytics at Synopsys. “With the advent of connected vehicles and advanced machine learning (ML) analytics, which enable a greater quantity of data to be routinely processed, this data has gained exponentially in value. As data drives feature enhancements such as mobile-like experiences and advanced driver assist capabilities, OEMs increasingly need to better understand the dependability and reliability of the semiconductor systems powering these new features. The collection of monitoring and sensor data from electronic components and the semiconductors themselves will be a growing diagnostic data requirement across all types of automotive technologies like ADAS, IVI, ECUs, etc. to ensure quality and reliability on these more advanced nodes.”

Anticipated updates to ISO 26262 also are expected to address the application of predictive maintenance to hardware, such as identifying degrading intermittent faults caused by silicon aging and over-stress conditions in the field. Remedies can include silicon lifecycle management (SLM) technologies, which can deliver more comprehensive knowledge about the health and remaining useful life of silicon as it ages.

“That knowledge, in turn, will enable service updates and future OTA releases that leverage additional semiconductor compute power,” Kennedy said. “Overall fleet performance will benefit, and the semiconductor and system design process will, too, as new insights help achieve greater efficiencies. OEM, Tier One, and semiconductor supplier collaboration on what the data brings to light – from silicon to software system performance – will enable vehicles to meet the functional safety design parameters that are becoming increasingly crucial in advanced electronics.”

Still, for data generated in vehicles, OEMs will need to prioritize which data can provide value for drivers immediately, and which data should be sent to the cloud via 5G connections.

“Tradeoffs between on-board processing to reduce data volume and data transmission network costs will likely dictate prioritization,” Keysight’s Kopacz said. “For example, camera, lidar, and radar sensor data for ADAS applications may have value for training ADAS algorithms, but the volume of raw data will be very costly to transmit and store. Likewise, driver attention data can have high value in UI design, and would be best gathered in a meta-data form. V2X data has a relatively lower data volume and should ultimately be a key data source for ADAS, providing in-car non-line-of-sight visibility of other vehicles, road infrastructure, and road conditions. Sharing this over V2N links can enable effective safety applications, but angle random walk (ARW) sensor data needs to be considered more carefully due to its complex nature. Infotainment streaming content into the vehicle also can be a valuable revenue stream for OEMs, and the content providers as well, as network operators working together.”

Impacts on automotive cybersecurity
As vehicles become more autonomous and connected, data use will increase, and so will the value of that data. This raises cybersecurity and data privacy concerns. Hackers want to steal personal data collected by the vehicles, and can use ransomware and other attacks to do so. The idea of taking control of vehicles — or worse, stealing them — also attracts hackers. Techniques used include hacking vehicle apps and wireless connections on the vehicles (diagnostics, key fob attacks and keyless jamming). Protecting data access, vehicles, and infrastructure from attacks is increasingly important and challenging.

Cybersecurity risks increase with software-defined vehicles. Memory especially will need to be safeguarded.

“The integration of advanced technology into EVs poses significant cybersecurity challenges that demand immediate attention and sophisticated solutions,” said Ilia Stolov, center head of secure memory solution at Winbond. “Central to the digital fortresses within modern electronic platforms are flash non-volatile memories, housing invaluable assets like code, private data, and company credentials. Unfortunately, their ubiquity has rendered them attractive targets for hackers seeking unauthorized access to sensitive information.”

Stolov noted that Winbond has been actively working to secure flash memory from hacks.

Additionally, there are important considerations in securing memory designs, such as:

  • DICE root of trust: The Device Identifier Composition Engine (DICE) should be used to create the secure flash root of trust for hardware security. This secure identity forms the basis for building trust in the hardware. Other security measures can therefore rely on the authenticity and integrity of the boot code, protecting against firmware and software attacks. The initial boot process and subsequent software execution are based on trusted and verified measurements, helping prevent the injection of malicious code into the system. (A minimal sketch of this derivation appears after this list.)
  • Code and data protection: Protecting code and data is crucial for maintaining system-wide integrity. Unauthorized modifications to code or data can lead to malfunctions, system instability, or the introduction of malicious code, compromising the hardware’s intended functionality or exploiting system vulnerabilities.
  • Authentication protocols: Authentication is a fundamental and crucial component of cybersecurity, serving as the frontline defense against unauthorized access and potential security breaches. Employing authentication protocols to restrict access to authorized actors and approved software layers only using cryptography credentials is important.
  • Secure software updates with rollback protection: Regular updates extend beyond bug fixes to include remote firmware over-the-air (OTA) updates; rollback protection guards against rollback attacks and ensures that only legitimate updates are executed.
  • Post-quantum cryptography: Anticipating the post-quantum computing era by including NIST 800-208 Leighton-Micali Signature (LMS) cryptography safeguards EVs against the potential threats posed by future quantum computers.
  • Platform resiliency: Automatic detection of unauthorized code changes enables swift recovery to a secure state, effectively thwarting potential cyber threats. Adhering to NIST 800-193 recommendations for platform resiliency ensures a robust defense mechanism.
  • Secure supply chain: Guaranteeing the origin and integrity of flash content throughout the supply chain, these secure flash devices prevent content tampering and misconfiguration during platform assembly, transportation, and configuration. This, in turn, safeguards against cyber adversaries.
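To make the DICE item above concrete, here is a minimal, illustrative sketch of a DICE-style identity derivation. The UDS provisioning and function names are assumptions; real implementations follow the TCG DICE specifications.

```python
import hashlib, hmac

def derive_cdi(uds: bytes, first_mutable_code: bytes) -> bytes:
    """Compound Device Identifier (CDI): a keyed digest of the measured
    boot code. If the first mutable code is tampered with, the measurement
    changes, the CDI changes, and keys derived from it fail attestation."""
    measurement = hashlib.sha256(first_mutable_code).digest()
    return hmac.new(uds, measurement, hashlib.sha256).digest()
```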

Considering the transition to SDVs and connected cars, data vulnerability becomes even more significant.

“Depending on where data resides, different protection measures are in place,” said Keysight’s Kopacz. “Intrusion detection systems (IDS), crypto services, and key management are becoming standard solutions in vehicles. Especially sensitive data for safety features needs to be protected and verified. Thus, redundancy becomes more relevant. With SDVs, the vehicle software is constantly updated or changed throughout the entire vehicle life cycle. Ever-evolving cyber threats are particularly challenging. Accordingly, the entire vehicle software must be continuously checked for new security gaps. OEMs are going to need comprehensive testing solutions to minimize security threats. This will need to include the cybersecurity testing of the entire attack surface, covering all vehicle interfaces – wired vehicle communication networks such as CAN or automotive Ethernet or wireless connections via Wi-Fi, Bluetooth, or cellular communications. OEMs will also need to test the backend that provides over-the-air (OTA) software updates. Such solutions can reduce the risk of damage or data theft by cybercriminals.”

Data management and privacy concerns
Another issue to be resolved is how the massive amount of data collected will be managed and used. Ideally, data will be analyzed to yield commercial value without causing privacy concerns. For example, infotainment platform data might reveal what types of music are most popular, helping the music industry to improve marketing strategies. Who will monitor the transfer of such data, though? How will customers be made aware of the data collection? And will they have an opportunity to opt out of having their data sold?

As with airplanes, vehicle black boxes are installed to record information for analysis of the data after an accident occurs. The information recorded includes vehicle speed, the braking situation, and the activation of air bags, among other things. If an accident occurs resulting in a fatality, and the data from the ADAS and ECUs uncover vulnerabilities in the designs, could that data be used as evidence in court against manufacturers or their supply chains? Armed with this information, the insurance industry may decline claims. Would one or more manufacturers of the ADAS/ECU be required to hand over the data when ordered by the authorities?

“Quality requirements for sophisticated electronic parts will continue to become more rigid and strict, allowing only a few defective parts per billion (DPPB) due to the impact failed components can have on the safety and well-being of human life,” noted Guy Cortez, senior staff product management manager for SLM analytics at Synopsys. “SLM data analytics will continue to play a substantial role in the health, maintainability, and sustainability of these devices throughout their life within the vehicle. Through the power of analytics, you can do proper root cause analysis of any failed device (e.g., return merchandise authorization, or RMA). What’s more, you will also be able to find ‘like’ devices that ultimately may exhibit similar failed behavior over time. Thus empowered, you can proactively recall these like devices before they fail during operation in the field. Upon further analysis, the device(s) in question may require a design re-spin by the device developer in order to correct any identified issue. With a proper SLM solution deployed throughout the automotive ecosystem, you can achieve a higher level of predictability, and thus higher quality and safety for the automotive manufacturer and consumer.”

OEM impact
While modern cars have been described as computers on wheels, they are now more like mobile phones on wheels. OEMs are designing cars that do not skimp on features. Semi-autonomous driving, voice-controlled infotainment systems, and the monitoring of many functions—including driver behavior—are yielding a large amount of data that can be used to improve future designs. OEMs’ approaches to security and privacy vary, with some offering stronger security and privacy protection than others.

Mercedes-Benz is paying attention to data security and privacy, and is compliant with UN ECE R155/R156, a European norm for cybersecurity and software update management systems, according to the company. Which data is processed in connection with digital vehicle services depends on which services the customer selects. Only the data required for the respective service will be processed. Additionally, the “Mercedes me connect” app’s terms of use and privacy information make it transparent to customers what data each service needs and how it is processed. Customers can determine which services they want to use.

Hyundai indicated it would follow a user-centric focus, prioritizing safety, information security, and data privacy with fault-tolerant software architectures to enhance cybersecurity. Hyundai Motor Group’s global software center, 42dot, is currently developing integrated hardware/software security solutions that detect and block data tampering, hacking, and external cyber threats, as well as abnormal communication using big data and AI algorithms.

And according to the BMW Group, the company manages a connected fleet of more than 20 million vehicles globally. More than 6 million vehicles are updated over-the-air on a regular basis. Together with other services, more than 110 terabytes of data traffic per day are processed between the connected vehicles and cloud-backend. All BMW vehicle interfaces permit consumers to opt in or out of various types of data collection and processing that may happen on their vehicles. If preferred, BMW customers may opt out of all optional data collection relating to their vehicles at any time by visiting the BMW iDrive screen in their vehicle. Additionally, to completely stop the transfer of any data from BMW vehicles to BMW services, customers can contact the company to request that the embedded SIM on their vehicles be disabled.

Not all OEMs hold the same philosophy on privacy. According to a study on 25 brands conducted by the Mozilla Foundation, a nonprofit organization, 56% will share data with law enforcement in response to an informal request, 84% share or sell personal data, and 100% earned the foundation’s “privacy not included” warning label.

More importantly, are customers educated or informed on the privacy issue?

Fig. 2: Once data is collected from a vehicle, it can go to multiple destinations without the knowledge of customers. Source: Mozilla, *Privacy Not Included.

Applying data to automotive design in the future
OEMs collect many different types of automotive data in relation to autonomous driving, infrastructure, infotainment, connected vehicles, and vehicle health and maintenance. The ultimate goal, however, is not just to compile massive raw data; rather, it is to extract value from it. One of the questions OEMs need to ask is how to apply technology to extract information that is really useful in future automotive design.

“OEMs are trying to test and validate the various functions of their vehicles,” said David Fritz, vice president of virtual and hybrid systems at Siemens EDA. “This can involve millions of terabytes of data. Sometimes, a huge portion of the data is redundant and useless. The real value in the data is, once it gets distilled, that it’s in a form where humans can relate to the meaning of the data, and it also can be pushed into the systems while they’re being developed and tested and before the vehicles are even on the ground. We’ve known for quite some time that many countries and regulatory bodies around the world have been collecting what they call an accident database. When an accident occurs, the police show up on the scene collecting relevant data. ‘There was an intersection here, a stop sign there. And this car was traveling in this direction roughly this many miles an hour. The weather condition is this. The car entered the intersection in the yellow light and caused an accident, etc.’ This is an accident scenario. Technologies are available to take those scenarios and put them in a standard form called Open Scenario. Based on the information, a new set of data can be generated to determine what the sensors would be seeing in those accident situations, and then pushed through a virtual version of the vehicle and environment and, in the future, through the sensors in the physical vehicle itself. This is really the distillation of that data into a form that a human can wrap their mind around. Otherwise, you could collect billions of terabytes of raw data and try to push that into these systems, and it wouldn’t actually help you any more than if someone were sitting in a car and dragging those for billions of miles.”
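As a schematic illustration of that distillation step (this is a simplified stand-in, not the actual ASAM OpenSCENARIO schema), one accident-database record might be normalized into a replayable scenario like this:

```python
# A hypothetical police-report-style record, reduced to the fields a
# scenario player needs: initial conditions plus a triggering event.
accident_record = {
    "layout": {"intersection": True, "stop_sign": "north_approach"},
    "ego": {"heading_deg": 90, "speed_mph": 38},
    "environment": {"weather": "rain", "light": "dusk"},
    "trigger": {"phase": "yellow", "event": "ego_enters_intersection"},
}

def to_scenario(rec: dict) -> dict:
    """Split a record into init conditions and a story of trigger events."""
    return {
        "init": {**rec["layout"], **rec["environment"], "ego": rec["ego"]},
        "story": [rec["trigger"]],
    }
```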

But that data also can be very useful. “If an OEM wants to obtain safety certification, say in Germany, it can provide the German authority with a set of scenarios to prove the vehicle will navigate in a safe manner under various conditions,” Fritz said. “By comparing that with the data in the accident database, the German government can say that as long as you avoid 95% of the accidents in that database, you’re certified. That’s actionable from the perspectives of human drivers, insurance, engineering, and visual simulation. The data prove the vehicle is going to behave as expected. The alternative is to drive around, as in the case of autonomous vehicles, and try to justify that the accident was not caused by the vehicle, while facing the lawsuit. It does not seem to make sense, but that’s what’s happening today.”

Related Reading
Curbing Automotive Cybersecurity Attacks
A growing number of standards and regulations within the automotive ecosystem promises to save development costs by fending off cyberattacks.
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post Increased Automotive Data Use Raises Privacy, Security Concerns appeared first on Semiconductor Engineering.

Interoperability And Automation Yield A Scalable And Efficient Safety Workflow

By Ann Keffer, Arun Gogineni, and James Kim

Cars deploying ADAS and AV features rely on complex digital and analog systems to perform critical real-time applications. The large number of faults that need to be tested in these modern automotive designs makes performing safety verification using a single technology impractical.

Yet, developing an optimized safety methodology with specific fault lists automatically targeted for simulation, emulation and formal is challenging. Another challenge is consolidating fault resolution results from various fault injection runs for final metric computation.

The good news is that interoperability of fault injection engines, optimization techniques, and an automated flow can effectively reduce overall execution time to quickly close the loop from safety analysis to safety certification.

Figure 1 shows some of the optimization techniques in a safety flow. Advanced methodologies such as safety analysis for optimization and fault pruning, concurrent fault simulation, fault emulation, and formal-based analysis can be deployed to validate the safety requirements for an automotive SoC.

Fig. 1: Fault list optimization techniques.

Proof of concept: an automotive SoC

Using an SoC level test case, we will demonstrate how this automated, multi-engine flow handles the large number of faults that need to be tested in advanced automotive designs. The SoC design we used in this test case had approximately three million gates. First, we used both simulation and emulation fault injection engines to efficiently complete the fault campaigns for final metrics. Then we performed formal analysis as part of finishing the overall fault injection.

Fig. 2: Automotive SoC top-level block diagram.

Figure 3 is a representation of the safety island block from figure 2. The color-coded areas show where simulation, emulation, and formal engines were used for fault injection and fault classification.

Fig. 3: Detailed safety island block diagram.

Fault injection using simulation was too time- and resource-consuming for the CPU core and cache memory blocks. Those blocks were targeted for fault injection with an emulation engine for efficiency. The CPU core is protected by a software test library (STL), and the cache memory is protected by ECC. The bus interface requires end-to-end protection, where fault injection with simulation was determined to be efficient. The fault management unit was not part of this experiment. Fault injection for the fault management unit will be completed using formal technology as a next step.

Table 1 shows the register count for the blocks in the safety island.

Table 1: Block register count.

The fault lists generated for each of these blocks were optimized to focus on the safety critical nodes which have safety mechanisms/protection.

SafetyScope, a safety analysis tool, was run to create the fault lists for the failure modes (FMs) for both the Veloce Fault App (fault emulator) and the fault simulator, and it wrote the fault lists to the functional safety (FuSa) database.

For the CPU and cache memory blocks, the emulator inputs the synthesized blocks and fault injection/fault detection nets (FIN/FDN). Next, it executed the stimulus and captured the states of all the FDNs. The states were saved and used as a “gold” reference for comparison against fault inject runs. For each fault listed in the optimized fault list, the faulty behavior was emulated, and the FDNs were compared against the reference values generated during the golden run, and the results were classified and updated in the fault database with attributes.
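Conceptually, the classification step reduces to comparing each faulty run’s FDN states against the golden reference. The sketch below is schematic; the data structures are assumptions for illustration, not the Veloce Fault App’s actual interface.

```python
def classify_fault(golden: dict, faulty: dict, checker_nets: set) -> str:
    """Compare fault detection net (FDN) states from a faulty run against
    the golden reference captured during the fault-free run."""
    diffs = {net for net, state in faulty.items() if golden[net] != state}
    if not diffs:
        return "undetected, unobserved"   # fault never reached an FDN
    if diffs & checker_nets:
        return "detected"                 # a safety mechanism flagged it
    return "undetected, observed"         # propagated but uncaught: residual
```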

Fig. 4: CPU cluster. (Source: https://developer.arm.com/Processors/Cortex-R52)

For each of the sub-parts shown in the block diagram, we generated an optimized fault list using the analysis engine. The fault lists were saved into individual sessions in the FuSa database. We then used statistical random sampling over the overall fault population to generate a random sample from the FuSa database.

Now let’s look at what happens when we take one random sample all the way through fault injection using emulation. To fully close the fault injection campaign, however, we processed N samples.

Table 2: Detected faults by safety mechanisms.

Table 3 shows that the overall fault distribution for total faults is in line with the fault distribution of the randomly sampled faults. The table further captures 3125 detected faults out of 4782 total faults. We were also able to model the detected faults per sub-part, yielding an overall detected-fault ratio of 65.35%. Based on the faults in the random sample and our coverage goal of 90%, we calculated a margin of error (MOE) of ±1.19%.

Table 3: Results of fault injection in CPU and cache memory.
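For readers who want to sanity-check bounds like this, the margin of error for a sampled proportion follows the standard normal-approximation formula. The sketch below is illustrative; the article does not spell out the exact confidence level behind its ±1.19%, so the numbers here will not match it exactly.

```python
from math import sqrt

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the confidence interval for a proportion p estimated
    from n sampled faults (z = 1.96 for roughly 95% confidence)."""
    return z * sqrt(p * (1.0 - p) / n)

# Using the detected-fault ratio and fault count from Table 3:
print(f"{margin_of_error(0.6535, 4782):.3%}")  # about 1.349% at 95% confidence
```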

The 3125 total detected faults (observed + unobserved) have a clear classification. The undetected observed faults also classify clearly, as residual faults. We did further analysis of the undetected unobserved and not-injected faults.

Table 4: Fault classification after fault injection.

We used several debug techniques to analyze the 616 undetected unobserved (UU) faults. First, we used formal analysis to check the cone of influence (COI) of these UU faults. Faults outside the COI were deemed safe, and five faults were dropped from the analysis. For faults inside the COI, we applied engineering judgment with justification based on various configurations (ECC, timer, flash memory, and so on). Using formal analysis and engineering judgment, we classified most of the 616 UU faults as safe and conservatively classified the remaining UU faults as residual. We also reviewed the 79 residual faults and were able to reclassify 10 of them as safe. The not-injected faults were tested against the simulation model to check whether any further stimulus could inject them. Since none could, we dropped these faults from consideration and adjusted the margin of error accordingly. With this change, the new MOE is ±1.293%.

In parallel, the fault simulator pulled the optimized fault lists for the failure modes of the bus block and ran fault simulations using stimulus from functional verification. The initial set of stimuli didn’t provide enough coverage, so higher quality stimuli (test vectors) were prepared, and additional fault campaigns were run on the new stimuli. All the fault classifications were written into the FuSa database. All runs were parallel and concurrent for overall efficiency and high performance.

Safety analysis using SafetyScope helped improve accuracy and reduce the number of fault simulation iterations. After emulation on various tests, the CPU and cache memory achieved an overall SPFM of over 90%, as shown in Table 5.

Table 5: Overall results.

At this time, not all of the fault simulation tests for the BUS block (end-to-end protection) had been completed. Table 6 shows the first test was able to resolve 9.8% of the faults very quickly.

Table 6: Percentage of detected faults for BUS block by E2E SM.

We are integrating more tests which have high traffic on the BUS to mimic the runtime operation state of the SoC. The results of these independent fault injections (simulation and emulation) were combined for calculating the final metrics on the above blocks, with the results shown in Table 7.

Table 7: Final fault classification post analysis.

Conclusion

In this article we shared the details of a new functional safety methodology used in an SoC-level automotive test case, and we showed how our methodology produces a scalable, efficient safety workflow that applies optimization techniques for fault injection across formal, simulation, and emulation verification engines. Performing safety analysis prior to running fault injection proved critical and saved considerable time. Interoperability across multiple engines, with results read from a common FuSa database, is necessary for a project of this scale.

For more information on this highly effective functional safety flow for ADAS and AV automotive designs, please download the Siemens EDA whitepaper Complex safety mechanisms require interoperability and automation for validation and metric closure.

Arun Gogineni is an engineering manager and architect for IC functional safety at Siemens EDA.

James Kim is a technical leader at Siemens EDA.

The post Interoperability And Automation Yield A Scalable And Efficient Safety Workflow appeared first on Semiconductor Engineering.

V2X Path To Deployment Still Murky

Experts at the Table: Semiconductor Engineering sat down to discuss Vehicle-To-Everything (V2X) technology and the path to deployment, with Shawn Carpenter, program director for 5G and space at Ansys; Lang Lin, principal product manager at Ansys; Daniel Dalpiaz, senior manager product marketing, Americas, green industrial power division at Infineon; David Fritz, vice president of virtual and hybrid systems at Siemens EDA; and Ron DiGiuseppe, senior marketing manager, automotive IP segment at Synopsys. What follows are excerpts from that conversation.

L-R: Ansys’ Carpenter; Ansys’ Lin; Infineon’s Dalpiaz; Siemens EDA’s Fritz; Synopsys’ DiGiuseppe.

SE: What is the potential of vehicle-to-everything technology, and what role will the semiconductor ecosystem play in making this a reality?

DiGiuseppe: V2X is a technology that’s not just years, but decades, in the making. It initially started as a dedicated short-range communications (DSRC) type of technology, and has globally transitioned into a cellular technology, although many of those V2X applications are not just cellular. There are other spectrum allocations V2X can run on, including WiFi or other general-use technology. So it’s not limited to cellular. Also, it’s not just a technology. It’s an application, an outcome, and there are a lot of valuable uses, many of which are safety-related, but there are others, such as efficiency of traffic management notifications. V2X has a wide number of uses. The deployment will be done in stages, and there’s a lot of activity even though it’s taken a long time.

Lin: When I see the keyword V2X, it reminds me of everything about how the car can communicate with anything in the world. It’s a very exciting moment that we’re here today to be able to make some kind of technology to enable great communication between vehicles and people, in network infrastructures, and in car-to-car communications. Today, there is already something implemented. For instance, in car network systems we can connect our phone to the car already, but we’re still in the first mile. We’ve started on the journey, but we have a long way to go as far as how to connect car-to-car, how to connect the car to the entire infrastructure of networks, and to the internet. There are a lot of unknowns on the road while we start driving on this journey, and safety and security are definitely the biggest concerns. What if my network is being jeopardized?

Dalpiaz: V2X is part of a much bigger smart grid ecosystem. This will certainly play a very important role, especially as the grid becomes smart and decentralized. This is what will enable the future energy ecosystem, having renewable energies and energy storage systems all connected. And as we see more EVs being used as mobile battery storage, this is something that will certainly enable, and be part of, the smart grid ecosystem that everybody’s talking about.

Fritz: The days of independent semiconductor and software development are over. It is the need for OEMs to control their own destiny, driven by growing consumer and competitive demand, that has all but eliminated the ability to sell a one-size-fits-all product. We’ve known for a very long time that software needs to drive semiconductors, and semiconductors need to drive software. This symbiotic relationship, and the tools and methodologies needed to support this paradigm shift, are essential to producing a highly successful, complex, and competitive solution that meets consumer demands.

SE: What are the discrete pieces of V2X that need to be connected?

Dalpiaz: From the semiconductor point of view, especially with the usage of wide bandgap materials, a few companies are seeing that it’s possible to increase efficiency and power density. Being able to not only provide such solutions, but have everything connected in one box, is part of the smart ecosystem. Then, having the electric vehicles, energy storage, solar — everything combined into one box. Twenty years ago, before the iPhone, we used to have a fax machine, a camera for photographs, a computer. The future of this ecosystem is going to have one box sitting in your home, and have all this stuff connected together. So from the semiconductor point of view, especially with silicon carbide, it is something that is possible today, and it can achieve a very high level of efficiency — about 99%, very close to 100%. And of course, we need to make the system smaller to fit in a vehicle.

DiGiuseppe: One of the key stakeholders is the cellular companies. When we look at cellular V2X, one of the main challenges is interoperability. You have different devices in different model-year cars, so for the vehicle-to-vehicle communications, those different devices need to be interoperable. Then, the car will be talking to the infrastructure, so the roadside units need to be interoperable with the cars and devices in the cars. Then, of course, you have vehicle-to-pedestrians, vehicle-to-e-mobility like vehicle-to-bicycles, vehicle-to-motorcycles interoperability between all the devices over the medium. Whether it’s cellular or Wi-Fi or other technologies, it all needs to be interoperable. That will allow deployments in one locality to work in another locality, because even if they’re interoperable in one deployment in one region, we’ve got to make sure they’re also interoperable in other regions. So it’s a large scale interoperability goal.

Lin: Ron, you’re talking about interoperability, and Daniel talked about the ecosystem. From my side, I would also mention that some standards are necessary. For EDA, to help build such an ecosystem and chips, we need some rules to give to engineers as to what’s to be followed. There are two important standards in my mind. One is the vehicle safety standard, ISO 26262, which defines safety requirements for on-road vehicle chip design. Another is the cybersecurity standard, ISO 21434. If I make a tool, I probably will follow those standards, and then think about how the tool could help users decide pass/fail criteria for their design, making sure to meet the security and safety targets from the standards.

DiGiuseppe: In addition to standards, last October the U.S. Department of Transportation released its national V2X deployment plan. That plan, which is still in the draft feedback stage, lays out — at least in the U.S. — the whole timeline for deployments. That kind of oversight plan overlays onto the standards that Lang was just talking about. That deployment plan outlines the different contributions from all the different stakeholders, from the automakers/OEMs to the software developers for the applications. So overlaid on top of standards is a government deployment plan. Plus, there are a lot of government stakeholders, like the FCC allocating spectrum and the Department of Transportation coordinating all these deployments, in addition to the technology providers.

Fritz: It would take days to adequately answer those questions, but at the core, the root design components are connectivity, power, performance, and acceleration. Connectivity with the proper protocols allows computational tasks to be distributed. This is particularly important in automotive, where the physical distance between sensing, actuating, and computing nodes is critical for predictable performance. In the case of V2X, connectivity enables the normalization of external data, whether it involves smart city infrastructure or another vehicle. It’s important to note that the form of the shared data grows exponentially with the capacity to describe the environment, and therefore so do the compute requirements to process and understand it. For example, a data form that can describe signage in the U.S. is relatively small, but a universal one that can recognize variations is much larger and more ambiguous. This drives design parameters that directly impact manufacturing, development, and service cost functions. Further, the normalization of the data has an impact on the overall design and design component interactions. In the case of power, it goes without saying that high compute requirements, and the associated necessary cooling, can have a significant impact on EV range and manufacturing costs. Performance can take many forms, but as software loads increase with hypervisors, specialized operating systems, and protocol stacks, not to mention very complex application software, all must meet stringent mission-critical requirements. Finally, acceleration is of growing importance because it allows workloads to be handed off to specialized hardware that is better equipped to handle that load. For example, running AI inferencing on a CPU is typically far slower and more power-hungry than on an NPU, but a GPU could be idle and available to do the same task. On the other hand, a small CNN can be handled quite easily on a CPU with a few simple instructions. It is at the intersection of these major design components where an OEM will find its differentiation. Therefore, having a system capable of exploring this complex hardware and software space quickly, and with a small team, is critical for an OEM to demand of its suppliers what is required for the success of its platforms. Again, controlling your own destiny is essential to survival.

SE: With all of this interoperability, what happens when there are parts of the ‘everything’ — whether it’s the car or the infrastructure or pedestrians — that are not updated with the latest technologies or the different aspects of what needs to be there for connectivity?

DiGiuseppe: In addition to that challenge, this includes backward compatibility for automotive. For someone buying a car in 2025, you would expect any V2X technology to work in 2040. But in the meantime, all those standards that we’re talking about are continuing to evolve, so they need to be backward compatible.

Carpenter: This highlights the need for a digital twin capability for modeling this infrastructure to be able to understand that when we get two years down the road, some devices may not be reprogrammable. We may not be able to flash a particular device. We need to be able to look at that, and be able to simulate that in advance to understand what will happen. What will this do? We’re seeing this show a little bit, even giving a nod to what Ron was talking about earlier with interoperability. We have customers who want to be able to validate real hardware stuff that they’re developing on the lab bench, but they want to do it with the fidelity of a real system operating on a car, in a virtual city, with the live interaction of the channel with a gNodeB 5G base station mounted up on a building someplace, and they want to know how this will work in the context of the situation that it’s supposed to serve. And if something goes wrong in that scene, can we introduce something into this device and run our real silicon development platform against it to understand what happens here. If we go into a deep shadow, a deep fade area, and I’m not getting updates, yet I’m hurtling down the road at a certain speed, how long can I do this before I receive corrective information? What if someone’s software deck out there doesn’t get reprogrammed or doesn’t get the latest version of the standard safety protocols or something like that? We’re going to need this ability to carry models of stuff that was built two or three years ago in today’s infrastructure, model that, and understand in advance what’s going to happen with it so that we have an approach to do this. This is what the Department of Defense is doing today with their digital thread enablement, to have a way to capture that with legacy models of what they built years ago, but apply it in modern missions and understand, ‘Does it work? Does it fit? Does it not fit? What do we need to do to the existing system to make sure that we’re safe here?’ That is an approach we clearly see the automakers beginning to look at as a way to future-proof some of these systems and make sure that they’ve got a way to test them as they go forward.

Fritz: It’s become very clear from several popularized incidents that simply stopping and waiting for tech support to find you and get you going again is not going to be a successful strategy. In the end, the vehicle must make decisions at least as thoughtful as an average human would make. This is entirely possible, but not if too much emphasis is placed in the design phase on the dependencies between communicating (or non-communicating) actors. For this reason, we will always require sophisticated decision-making in-vehicle to be widely accepted.

SE: How does the design team stay up to date with everything?

DiGiuseppe: On the vehicle side, they’re going to be relying on over-the-air (OTA) software updates, which is relatively new in the automotive industry. But clearly, once we identify a software update, we’re going to need to roll out that software update, and OTA is obviously going to be used hand-in-hand with the updates to V2X as it moves forward.

SE: From a developer standpoint, they have to design to all these regulations. What are the issues here?

Lin: As a software developer, if you think about a vehicle 10 years ago, you mainly just replaced hardware. You replaced your brakes, you replaced your engine, you added some fluid. That was all the old style. Right now, if you have the V2X network, you’d probably expect daily updates, because software is evolving daily, and your whole communication system infrastructure is part of the larger internet evolution, so you’re going to have to keep pace with it. That’s a lot of work for developers.

Carpenter: There could be implications on edge processing. The telecommunications providers are going to need to put a lot more compute closer to the radio head, and clearly they’re already exploring the possibilities of getting not just central processing cores, CPU cores, but GPU cores and tensor processing units, and who knows what else for AI, that will be a part of this safety infrastructure and information/infotainment delivery. There’s a lot more compute that’s going to have to happen with a much shorter latency. Augmented reality with heads-up displays — imagine the possibilities coming in safety systems with heads-up displays in cars. Then imagine the amount of processing that it’s going to take. So the telecom providers will need to be a major part of that, together with most of the local government regulatory groups that are going to foster that safety system. Each municipality probably has to decide what to adopt, and what level of standard to use and deliver. Who invests in that? The future is really exciting, but there are a few things yet to be sorted out in terms of the investment needed to really deliver that promise.

Dalpiaz: I’m more in the infrastructure side, and one of the questions we always have is, ‘With all this focus on renewables and decentralization of the grid, can the grid handle such expectations or such projects?’ Having more people connecting and feeding energy back into the grid, and managing all of this, that’s always the question that you have to go through and consider.

Fritz: The fact is that keeping up to date is not practical. However, that doesn’t mean that a methodology cannot be employed to accept changes into the development system, and therefore be folded into the development process. CI/CD systems with digital twin golden models already are being developed, with nightly regressions run against complex (and possibly changing) requirements. In this way, requirement changes are automatically addressed as they occur, and solutions can be rolled into an Agile methodology through nightly regressions. This is an important benefit of a modern development methodology that has been used in other industries for years, but it’s just now finding purchase in progressive automotive companies.

Related Reading
Growing Challenges For Increasingly Connected Vehicles
OEMs have high expectations for connected vehicles and global growth opportunities, but it’s not that simple.
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post V2X Path To Deployment Still Murky appeared first on Semiconductor Engineering.

2.5D Integration: Big Chip Or Small PCB?

Defining whether a 2.5D device is a printed circuit board shrunk down to fit into a package, or is a chip that extends beyond the limits of a single die, may seem like hair-splitting semantics, but it can have significant consequences for the overall success of a design.

Planar chips always have been limited by the size of the reticle, which is about 858mm². Beyond that, yield issues make the silicon uneconomical. For years, that has limited the number of features that could be crammed onto a planar substrate. Any additional features would need to be designed into additional chips and connected with a printed circuit board (PCB).

The advent of 2.5D packaging technology has opened up a whole new axis for expansion, allowing multiple chiplets to be interconnected inside an advanced package. But the starting point for this packaged design can have a big impact on how the various components are assembled, who is involved, and which tools are deployed and when.

There are several reasons why 2.5D is gaining ground today. One is cost. “If you can build smaller chips, or chiplets, and those chiplets have been designed and optimized to be integrated into a package, it can make the whole thing smaller,” says Tony Mastroianni, advanced packaging solutions director at Siemens Digital Industries Software. “And because the yield is much higher, that has a dramatic impact on cost. Rather than having 50% or below yield for die-sized chips, you can get that up into the 90% range.”

Interconnecting chips using a PCB also limits performance. “Historically, we had chips packaged separately, put on the PCB, and connected with some routing,” says Ramin Farjadrad, CEO and co-founder of Eliyan. “The problems people started to face were twofold. One was that the bandwidth between these chips was limited by going through the PCB, and then a limited number of balls on the package limited the connectivity between these chips.”

The key difference with 2.5D compared to a PCB is that 2.5D uses chip dimensions. There are much finer-grain wires, and various components can be packed much closer together on an interposer or in a package than on a board. For those reasons, wires can be shorter, there can be more of them, and bandwidth is increased.

That impacts performance at multiple levels. “Since they are so close, you don’t have the long transport RC or LC delays, so it’s much faster,” says Siemens’ Mastroianni. “You don’t need big drivers on a chip to drive long traces over the board, so you have lower power. You get orders of magnitude better performance — and lower power. A common metric is picojoules per bit. The amount of energy it takes to move bits makes 2.5D compelling.”

Still, the mindset affects the initial design concept, and that has repercussions throughout the flow. “If you talk to a die designer, they’re probably going to say that it is just a big chip,” says John Park, product management group director in the Custom IC & PCB Group at Cadence. “But if you talk to a package designer, or a board designer, they’re going to say it’s basically a tiny PCB.”

Who is right? “The internal organizational structure within the company often decides how this is approached,” says Marc Swinnen, director of product marketing at Ansys. “Longer term, you want to make sure that your company is structured to match the physics and not try to match the physics to your company.”

What is clear is that nothing is certain. “The digital world was very regular in that every two years we got a new node that was half size,” says Cadence’s Park. “There would be some new requirements, but it was very evolutionary. Packaging is the Wild West. We might get 8 new packaging technologies this year, 3 next year, 12 the next year. Many of these are coming from the foundries, whereas it used to be just from the outsourced semiconductor assembly and test companies (OSATs) and the substrate providers. While the foundries are a new entrant, the OSATs are offering some really interesting packaging technologies at a lower cost.”

Part of the reason for this is that different groups of people have different requirement sets. “The government and the military see the primary benefits as heterogeneous integration capabilities,” says Ansys’ Swinnen. “They are not pushing the edge of processing technology. Instead, they are designing things like monolithic microwave integrated circuits (MMICs), where they need waveguides for very high-speed signals. They approach it from a packaging assembly point of view. Conversely, the high-performance compute (HPC) companies approach it from a pile of 5nm and 3nm chips with high performance high-bandwidth memory (HBM). They see it as a silicon assembly problem. The benefit they see is the flexibility of the architecture, where they can throw in cores and interfaces and create products for specific markets without having to redesign each chiplet. They see flexibility as the benefit. Military sees heterogeneous integration as the benefit.”

Materials
There are several materials used as the substrate in 2.5D packaging technology, each of which has different tradeoffs in terms of cost, density, and bandwidth, and each of which presents its own set of physical issues that must be overcome. One of the primary points of differentiation is the bump pitch, as shown in figure 1.

Fig 1. Chiplet interconnection for various substrate configurations. Source: Eliyan

When talking about an interposer, it generally is considered to be silicon. “The interposer could be a large piece of silicon (Fig 1 top), or just silicon bridges between the chips (Fig 1 middle) to provide the connectivity,” says Eliyan’s Farjadrad. “Both of these solutions use micro-bumps, which have high density. Interposers and bridges provide a lot of high-density bumps and traces, and that gives you bandwidth. If you utilize 1,000 wires, each running at 5Gbps, you get 5Tbps. If you have 10,000, you get 50Tbps. But those signals cannot go more than two or three millimeters. Alternatively, if you avoid the silicon interposer and you stay with an organic package (Fig 1 bottom), such as a flip-chip package, the density of the traces is 5X to 10X less. However, the thickness of the wires can be 5X to 10X more. That’s a significant advantage, because the cross section of a wire grows with the square of its thickness, so the resistance comes down significantly. If it’s 5X less density, that means you can run signals almost 25X further.”
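The scaling in that quote can be checked with simple arithmetic. The sketch below encodes only the first-order relationships stated above — aggregate bandwidth as wire count times per-wire rate, and resistance falling with the square of wire thickness — and is not a model of any particular package.

def aggregate_bw_tbps(num_wires: int, gbps_per_wire: float) -> float:
    # Aggregate link bandwidth in Tbps: wire count x per-wire rate.
    return num_wires * gbps_per_wire / 1000.0

print(aggregate_bw_tbps(1_000, 5.0))   # 5.0 Tbps (interposer or bridge)
print(aggregate_bw_tbps(10_000, 5.0))  # 50.0 Tbps

# Organic package: traces ~5x thicker (the quoted range is 5X to 10X), so the
# cross section grows ~25x, resistance per mm drops ~25x, and reach grows ~25x.
thickness_ratio = 5.0
print(f"~{thickness_ratio ** 2:.0f}x reach gain for {thickness_ratio:.0f}x thicker wires")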

For some people, it is all about bandwidth per millimeter. “If you have a parallel bus, or a parallel interface that is high speed, and you want bandwidth per millimeter, then you would probably pick a silicon interposer,” says Kent Stahn, senior manager of hardware engineering in Synopsys‘ Solutions Group. “An organic substrate is low-loss, low-cost, but it doesn’t have the density. In between, there are a bunch of solutions that deliver on some of that, but not for the same cost.”

There are other reasons to pick a substrate material, as well. “Silicon interposer comes from a foundry, so availability is a problem,” says Manuel Mota, senior staff product manager in Synopsys’ Solutions Group. “Some companies are facing challenges in sourcing advanced packages because capacity is taken. By going to other technologies that have a little less bandwidth density, but perhaps enough for your application, you can find them elsewhere. That’s becoming a critical aspect.”

All of these technologies are progressing rapidly, however. “The reticle limit is about 858mm²,” says Park. “People are talking about interposers that are perhaps four times that size, but we have laminates that go much bigger. Some of the laminate substrates coming from Japan are approaching the same level of interconnect density that we can get from silicon. I personally see more push toward organic substrates. Chip-on-Wafer-on-Substrate (CoWoS) from TSMC uses a silicon interposer and has been the technology of choice for about 12 years. More recently they introduced CoWoS-R, which uses polyimide film, closer to an organic type of substrate. Now we hear a lot about glass substrates.”

Over time, the total real estate inside the package may grow. “It doesn’t make sense for foundries to continue to build things the size of a 30-inch printed circuit board,” adds Park. “There are materials that are capable of addressing the bigger designs. Where we really need density is die-to-die. We want those chiplets right next to each other, a couple of millimeters of interconnect length. We want things very short. But the rest of it is just fanning out the I/O so that it connects to the PCB.”

This is why bridges are popular. “We do see a progression to bridges for the high-speed part of the interface,” says Synopsys’ Stahn. “The back side of it would be fanout, like RDL fanout. We see RDL packages that are going to be more like traditional packages going forward.”

Interposers offer additional capabilities. “Today, 99% of the interposers are passive,” says Park. “There’s no front end of line, there are no device layers. It’s purely back end of line processing. You are adding three, four, five metal layers to that silicon. That’s what we call a passive interposer. It’s just creating that die-to-die interconnect. But there are people taking that die and making it an active interposer, basically adding logic to that.”

That can happen for different purposes. “You already see some companies doing active interposers, where they add power management or some of the controls logic,” says Mota. “When you start putting active circuits on interposer, is it still a 2.5D integration, or does it become a 3D integration? We don’t see a big trend toward active interposers today.”

There are some new issues, though. “You have to consider coefficients of thermal expansion (CTE) mismatches,” says Stahn. “This happens whenever two materials with different CTEs are bonded together. Let’s start with the silicon interposer. You can get higher wattage systems, where the SoCs can be talking to their peers, and that can consume a lot of power. A silicon interposer still has to go in a package. The CTE mismatches are between the silicon to the package material. And with the bridge, you’re using it where you need it, but it’s still silicon die-to-die. You have to do the thermal mechanical analysis to make sure that the power that you’re delivering, and the CTE mismatches that you have, result in a viable system.”
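A rough feel for why that thermal-mechanical analysis matters comes from the first-order mismatch strain between two bonded materials, Δα·ΔT. The sketch below uses typical textbook CTE values and an assumed temperature swing; it illustrates the mechanism only, not any real package stack.

# First-order CTE mismatch strain between bonded materials: eps = d_alpha * dT.
# CTE values are typical textbook numbers, assumed here for illustration.
CTE_PPM_PER_C = {"silicon": 2.6, "organic_laminate": 17.0}

def mismatch_strain(cte_a_ppm: float, cte_b_ppm: float, delta_t_c: float) -> float:
    # Unconstrained mismatch strain (dimensionless) over a temperature swing.
    return abs(cte_a_ppm - cte_b_ppm) * 1e-6 * delta_t_c

# Assumed ~200 degC swing (on the order of solder reflow) between a silicon
# interposer and an organic package substrate:
eps = mismatch_strain(CTE_PPM_PER_C["silicon"], CTE_PPM_PER_C["organic_laminate"], 200.0)
print(f"mismatch strain ~ {eps:.3%}")  # ~0.288%, enough to warp parts or stress joints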

While signal lengths in theory can get longer, this poses some problems. “When you’re making those long connections inside a chip, you typically limit those routes to a couple of millimeters, and then you buffer it,” says Mastroianni. “The problem with a passive silicon interposer is there are no buffers. That can really become a serious issue. If you do need to make those connections, you need to plan those out very carefully. And you do need to make sure you’re running timing analysis. Typically, your package guys are not going to be doing that analysis. That’s more of a problem that’s been solved with static timing analysis by silicon engineers. We do need to introduce an STA flow and deal with all the extractions that include organic and silicon type traces, and it becomes a new problem. When you start getting into some of those very long traces, your simple RC timing delays, which are assumed in normal STA delay calculators, don’t account for some of the inductance and mutual inductance between those traces, so you can get serious accuracy issues for those long traces.”
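The couple-of-millimeters buffering rule follows from how a distributed RC wire behaves: its Elmore delay grows with the square of its length. The sketch below shows that quadratic growth using per-millimeter values that are purely illustrative assumptions, not characterized interposer numbers — and, as noted in the quote, a plain RC model like this one is itself optimistic once inductance starts to matter.

# Elmore delay of an unbuffered distributed-RC wire: t ~ 0.38 * R_total * C_total.
# Because both R_total and C_total scale with length, delay grows quadratically.
R_PER_MM = 200.0    # ohms per mm (assumed, for illustration)
C_PER_MM = 0.2e-12  # farads per mm (assumed, for illustration)

def elmore_delay_ns(length_mm: float) -> float:
    return 0.38 * (R_PER_MM * length_mm) * (C_PER_MM * length_mm) * 1e9

for length_mm in (2, 5, 10):
    print(f"{length_mm:>2} mm: ~{elmore_delay_ns(length_mm):.3f} ns")
# 2 mm -> ~0.061 ns, but 10 mm -> ~1.520 ns: 5x the length, 25x the delay.
# On a die this is tamed by inserting buffers every couple of millimeters,
# which a passive interposer cannot do.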

Active interposers help. “With active interposers, you can overcome some of the long-distance problems by putting in buffers or signal repeaters,” says Swinnen. “Then it starts looking more like a chip again, and you can only do it on silicon. You have the EMIB technology from Intel, where they embedded a chiplet into the interposer, and that’s an active bridge. The chip talks to the EMIB chip, and the two talk to each other through this little active bridge chip, which is not exactly an active interposer, but acts almost like one.”

But even passive components add value. “The first thing that’s being done is including trench capacitors in the interposer,” says Mastroianni. “That gives you the ability to do some good decoupling, where it counts, close to the die. If you put them out on the board, you lose a lot of the benefits for the high-speed interfaces. If you can get them in the interposer, sitting right under where you have the fast-switching speed signals, you can get some localized decoupling.”

In addition to different materials, there is the question of who designs the interposer. “The industry seems to think of it as a little PCB in the context of who’s doing the design,” says Matt Commens, senior manager for product management at Ansys. “The interposers are typically being designed by packaging engineers, even though they are silicon processes. This is especially true for the high-performance ones. It seems counterintuitive, but they have that signal integrity background, they’ve been designing transmission lines and minimizing mismatch at interconnects. A traditional IC designer works from a component point of view. So certainly, the industry is telling us that the people they’re assigning to do that design work are packaging type of personas.”

Power
There are some considerable differences in routing between PCBs and interposers. “Interposer routing is much easier, as the number of components is drastically reduced compared to the PCB,” says Andy Heinig, head of department for efficient electronics at Fraunhofer IIS/EAS. “On the other hand, the power grid on the interposer is much more complex due to the higher resistance of the metal layers and the fact that the power grid is cut out by signal wires. The routing for the die-to-die interface is more complex due to the routing density.”

Power delivery looks very different. “If you look at a PCB, they put these big metal pour areas embedded in the layers, and they void out areas where things need to go through,” says Park. “You put down a bunch of copper and then you void out the others. We can’t build an interposer that way. We have to deposit the interconnect, so the power and ground structures on a silicon interposer will look more like a digital chip. But the signal will look more like a PCB or laminate package.”

Routing does look more like a PCB than a chip. “You’ll see things like teardrops or fillets where a trace makes a connection to a pad or via, to create better yield,” adds Park. “The routing styles today are more aligned to PCBs than they are to a digital IC, where you just have 90° orthogonal corners and clean routing channels. For interposers, whether silicon or organic, the via is often bigger than the wire, which is a classic PCB problem. The routers, if we’re talking about digital, are again more like small PCB routers than die routers.”

TSVs can create problems, too. “If you’re going to treat them as square, you’re losing a lot of space at the corners,” says Swinnen. “You really want 45° around those objects. Silicon routers are traditionally Manhattan, although there has been a long tradition of RDL routing, which is the top layer where the bumps are connected. That has traditionally used octagonal bumps or round bumps, and then 45° routing. It’s not as flexible as the PCB routing, but they have redistribution layer routers, and also they have some routers that come from the full custom side which have full river routing.”

Related Reading
True 3D Is Much Tougher Than 2.5D
While terms often are used interchangeably, they are very different technologies with different challenges.
Thermal Integrity Challenges Grow In 2.5D
Work is underway to map heat flows in interposer-based designs, but there’s much more to be done.

The post 2.5D Integration: Big Chip Or Small PCB? appeared first on Semiconductor Engineering.

Commercial Chiplet Ecosystem May Be A Decade Away

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make them a reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express-type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplets should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to be able to pick and choose which IP, see what’s been taped out before, and how successful it’s been, so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood, because that’s the way this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs — and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust, traceability, and provenance tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, and how you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller, cheaper ones. So there are now a lot of thoughts about how to go about doing almost like unit tests on individual chiplets first, but then you want to do some form of system test as you go along. That’s definitely something we need to think about. On the business front, the question is who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then it definitely makes sense to think about using chiplets and breaking it up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together? The ones that have been successful so far — the big companies like Intel and AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.

Bhatnagar: From a business perspective, what is really important is the standardization. Inside of the chiplets is fine, but how a chiplet impacts other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing, and that adds extra costs. To really justify a business case for any chiplet, or any sort of IP with the chiplet, standardization is key for the electrical interconnect, packaging, and all other aspects of a system.

Fig. 1: A chiplet design. Source: Cadence.

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

The post Commercial Chiplet Ecosystem May Be A Decade Away appeared first on Semiconductor Engineering.

Accellera Preps New Standard For Clock-Domain Crossing

Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long it will take before users see any benefit.

At the register transfer level (RTL), when a data signal passes between two flip flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving at those adjacent flops. That makes timing sign-off difficult, but at least the clocks are still synchronous.

But if the clocks come from different sources, are at different frequencies, or a design boundary exists between the flip flops — which would happen with the integration of IP blocks — it’s impossible to guarantee that no clock edges will arrive when the data is unstable. That can cause the output to become unknown for a period of time. This phenomenon, known as metastability, cannot be eliminated, and the verification of those boundaries is known as clock-domain crossing (CDC) analysis.
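How often a crossing actually fails is usually estimated with the standard first-order synchronizer model, MTBF = e^(t_r/τ) / (T_w × f_clk × f_data), where t_r is the time the signal has to resolve before the next stage samples it. The sketch below uses illustrative, assumed flop parameters (real values come from silicon characterization); it shows why adding a second synchronizer flop, which buys roughly one more clock period of resolution time, improves MTBF so dramatically.

import math

def synchronizer_mtbf_s(t_resolve_s, tau_s, t_w_s, f_clk_hz, f_data_hz):
    # Classic first-order metastability MTBF model for a synchronizer.
    return math.exp(t_resolve_s / tau_s) / (t_w_s * f_clk_hz * f_data_hz)

# Illustrative (assumed) parameters: tau = 20 ps, capture window = 30 ps,
# 1 GHz receiving clock, 100 MHz data toggle rate.
tau, t_w, f_clk, f_data = 20e-12, 30e-12, 1e9, 100e6

for stages, t_resolve in ((1, 1e-9), (2, 2e-9)):  # ~1 ns of settling per stage
    years = synchronizer_mtbf_s(t_resolve, tau, t_w, f_clk, f_data) / 3.15e7
    print(f"{stages}-flop synchronizer: MTBF ~ {years:.2e} years")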

Special care is required on those boundaries. “You have to compensate for metastability by ensuring that the CDC crossings follow a specific set of logic design principles,” says Prakash Narain, president and CEO of Real Intent. “The general process in use today follows a hierarchical approach and requires that the clock-domain crossing internal to an IP is protected and safe. At the interface of the IP, where the system connects with the IP, two different teams share the problem. An IP provider may recommend an integration methodology, which often is captured in an abstraction model. That abstraction model enables the integration boundary to be verified while the internals of it will not be checked for CDC. That has already been verified.”

In the past, those abstract models differentiated the CDC solutions from various vendors. That’s no longer the case. Every IP and tool vendor has a different format, making it costly for everyone. “I don’t know that there’s really anything new or differentiating coming down the pipe for hierarchical modeling,” says Kevin Campbell, technical product manager at Siemens Digital Industries Software. “The creation of the standard will basically deliver much faster results with no loss of quality. I don’t know how much more you can differentiate in that space other than just with performance increases.”

While this has been a problem for the whole industry for quite some time, Intel decided it was time for a solution. The company pushed Accellera to take up the issue, and helped facilitate the creation of the standard by chairing the committee. “I’m going to describe three methods of building a product,” says Iredamola “Dammy” Olopade, chair of the Accellera working group, and a principal engineer at Intel. “Method number one is where you build everything in a monolithic fashion. You own every line of code, you know the architecture, you use the tool of your choice. That is a thing of the past. The second method uses some IP. It leverages reuse and enables the quick turnaround of new SoCs. There used to be a time when all IPs came from the same source, and those were integrated into a product. You could agree upon the tools. We are quickly moving to a world where I need to source IPs wherever I can get them. They don’t use the same tools as I do. In that world, common standards are critical to integrating quickly.”

In some cases, there is a hierarchy of IP. “Clock-domain crossings are a central part of our business,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “A network-on-chip (NoC) can be considered as ‘CDC central’ because most blocks connected to the NoC have different clocks. Also, our SoC integration tools see all of the blocks to be integrated, and those touch various clock domains and therefore need to deal with the CDC code that is inserted.”

This whole thing can become very messy. “While every solution supports hierarchical modeling, every tool has its own model solution and its own model representation,” says Siemens’ Campbell. “Vendors, or users, are stuck with a CDC solution, because the models were created within a certain solution. There’s no real transportability between any of the hierarchical modeling solutions unless they want to go regenerate models for another solution.”

That creates a lot of extra work. “Today, when dealing with customer CDC issues, we have to consider the customer’s specific environment, and for CDC, a potential mix of in-house flows and commercial tools from various vendors,” says Arteris’ Schirrmeister. “The compatibility matrix becomes very complex, very fast. If adopted, the new Accellera CDC standard bears the potential to make it easier for IP vendors, like us, to ensure compatibility and reduce the effort required to validate IP across multiple customer toolsets. The intent, as specified in the requirements, is that ‘every IP provider can run its tool of choice to verify and produce collateral and generate the standard format for SoCs that use a different tool.'”

Everyone benefits. “IP providers will not need to provide extra documentation of clock domains for the SoC integrator to use in their CDC analysis,” says Ahmed Nasr, digital design manager at Mixel. “The standard CDC attributes generated by the EDA tool will be self-contained.”

The use model is relatively simple. “An IP developer signs off on CDC and then exports the abstract model,” says Real Intent’s Narain. “It is likely they will write this out in both the Accellera format and the native format to provide backward compatibility. At the next level of hierarchy, you read in the abstract model instead of reading in the full view of the design. They have various views of the IP, including the CDC view of the IP, which today is on the basis of whatever tool they use for CDC sign-off.”
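As a rough illustration of that use model, the sketch below shows how an SoC-level flow might pick up each IP’s abstract CDC view, preferring the standard format and falling back to a tool-native one for older deliveries. The file names and parser functions are hypothetical, not part of the Accellera work.

from pathlib import Path

def parse_accellera(path: Path) -> dict:
    # Stub: a real flow would parse the standard-format CDC view here.
    return {"format": "accellera", "source": str(path)}

def parse_native(path: Path) -> dict:
    # Stub: fallback parser for one tool's native abstract model.
    return {"format": "native", "source": str(path)}

def load_cdc_view(ip_dir: Path) -> dict:
    # Prefer the standard-format model when the IP vendor ships both,
    # as Narain expects vendors to do for backward compatibility.
    std = ip_dir / "cdc_abstract.std"        # hypothetical file name
    native = ip_dir / "cdc_abstract.native"  # hypothetical file name
    return parse_accellera(std) if std.exists() else parse_native(native)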

The potential is significant. “If done right and adopted, the industry may arrive at a common language to describe CDC aspects that can streamline the validation process across various tools and environments used by different users,” says Schirrmeister. “As a result, companies will be able to integrate and validate IP more efficiently than before, accelerating development cycles and reducing the complexity associated with SoC integration.”

The standard
Intel’s Olopade describes the approach that was taken during the creation of the standard. “You take the most complex situations you are likely to find, you box them, and you co-design them in order to reduce the risk of bugs,” he says. “The boundaries you create are supposed to be simple boundaries. We took that concept, and we brought it into our definition to say the following: ‘We will look at all kinds of crossings, we will figure out the simple common uses, and we will cover that first.’ That is expected to cover 95% to 98% of the community. We are not trying to handle 700 different exceptions. It is common. It is simple. It is what guarantees production quality, not just from a CDC standpoint, but just from a divide-and-conquer standpoint.”

That was the starting point. “Then we added elements to our design document that says, ‘This is how we will evaluate complexity, and this is how we’ll determine what we cover first,'” he says. “We broke things down into three steps. Step one is clock-domain crossing. Everyone suffers from this problem. Step two is reset-domain crossing (RDC). As low power is getting into more designs, there are a lot more reset domains, and there is risk between these reset domains. Some companies care, but many companies don’t because they are not in a power-aware environment. It became a secondary consideration. Beyond the basic CDC in phase one, and RDC in phase two, all other interesting, small usage complexities will be handled in phase three as extensions to the standard. We are not going to get bogged down supporting everything under the sun.”

Within the standards group there are two sub-groups — a mapping team and a format team. Common standards, such as AMBA, UCIe, and PCIe, have been examined to make sure they are fully covered by the standard. That means the concepts should be useful for future markets.

“The concepts contained in the standard are extensible to hardened chiplets,” says Mixel’s Nasr. “By providing an accurate standard CDC view for the chiplet, it will enable integration with other chiplets.”

Some of those issues have yet to be fully explored. “The standard’s current documentation primarily focuses on clock-domain crossing within an SoC itself,” says Schirrmeister. “Its direct applicability to the area of chiplets would depend on further developments. The interfaces between fully hardened IP blocks on chiplets would communicate through standard interfaces like UCIe, BoW, or XSR, so the synchronization issues between chiplets on substrates would appear to be elevated to the protocol levels.”

Reset-domain crossings have yet to appear in the standard. “The genesis of CDC is asynchronous clocks,” says Narain. “But the genesis for reset-domain crossing is asynchronous resets. While the destination is due to the clock, the source of the problem is somewhere else. And as a result, the nature of the problem, the methodology that people use to manage that problem, are very different. The kind of information that you need to retain, and the kind of information that you can throw away, is different for every problem. Hence, abstractions are actually very customized for the application.”

Does the standard cover enough ground? That is part of the purpose of the review period that was used to collect information. “I can see some room for future improvement — for example, making some attributes mandatory like logic, associated_clocks, clock_period for clock ports,” says Nasr. “Another proposed improvement is adding reconvergence information, to be able to detect reconverging outputs of parallel synchronizers.”
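To make that proposal concrete, the sketch below models what a per-port entry in an abstract CDC view might carry. The attribute names (logic, associated_clocks, clock_period) and the reconvergence field come from Nasr’s comments above; the record layout itself is a hypothetical illustration, not the standard’s actual schema (Python 3.10+ for the union syntax).

from dataclasses import dataclass, field

@dataclass
class CdcPortEntry:
    name: str
    direction: str                          # "input" or "output"
    associated_clocks: list[str] = field(default_factory=list)
    clock_period: float | None = None       # ns, meaningful for clock ports
    logic: str | None = None                # e.g. "2ff_sync", "async_fifo"
    reconvergence_group: str | None = None  # proposed improvement above

# The boundary of a hypothetical IP with one clock and one output:
ip_boundary = [
    CdcPortEntry("pclk", "input", clock_period=2.0),
    CdcPortEntry("irq_out", "output",
                 associated_clocks=["pclk"], logic="2ff_sync"),
]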

The impact of all of this, if realized, is enormous. “If you truly run a collaborative, inclusive, development cycle, two things will happen,” says Olopade. “One, you are going to be able to find multiple ways to solve each problem. You need to understand the pros and cons against the real problems you are trying to solve and agree on the best way we should do it together. For each of those, we record the options, the pros and cons, and the reason one was selected. In a public review, those that couldn’t be part of that discussion get to weigh in. We weigh it against what they are suggesting versus why did we choose it. In the cases where it is part of what we addressed, and we justified it, we just respond, and we do not make a change. If you’re truly inclusive, you do allow that feedback to cause you to change your mind. We received feedback on about three items that we had debated, where the feedback challenged the decisions and got us to rehash things.”

The big challenge
Still, the creation of a standard is just the first step. Unless a standard is fully adopted, its value becomes diminished. “It’s a commendable objective and a worthy endeavor,” says Schirrmeister. “It will make interoperability easier and eventually allow us, and the whole industry, to reduce the compatibility matrix we maintain to deal with vendor tools individually. It all will depend on adoption by the vendors, though.”

It is off to a good start. “As with any standard, good intentions sometimes get severed by reality,” says Campbell. “There has been significant collaboration and agreements on how the standard is being pushed forward. We did not see self-submarining, or some parties playing nice just to see what’s going on but not really supporting it. This does seem like good collaboration and good decision making across the board.”

Implementation is another hurdle. “Will it actually provide the benefit that it is supposed to provide?” asks Narain. “That will depend upon how completely and how quickly EDA tool vendors provide support for the standard. From our perception, the engineering challenge for implementing this is not that large. When this is standardized, we will provide support for it as soon as we can.”

Even then, adoption isn’t a slam dunk. “There are short- and long-term problems,” warns Campbell. “IP vendors already have to support multiple formats, but now you have to add Accellera on top of that. There’s going to be some pain both for the IP vendors and for EDA vendors. We are going to have to be backward-compatible and some programs go on for decades. There’s a chance that some of these models will be around for a very long time. That’s the short-term pain. But the biggest hurdle to overcome for a third-party IP vendor, and EDA vendor, is quality assurance. The whole point of a hierarchical development methodology is faster CDC closure with no loss in quality. The QA load here is going to be big, because no customer is going to want to take the risk if they’ve got a solution that is already working well.”

Some of those issues and fears are expected to be addressed at the upcoming DVCon conference. “We will be providing a tutorial on CDC,” says Olopade. “The first 30 minutes covers the basics of CDC for those who haven’t been doing this for the last 10 years. The next hour will talk about the Accellera solution. It will concentrate on those topics which were hotly debated, and we need to help people understand, or carry people along with what we recommend. Then it may become more acceptable and more adoptive.”

Related Reading
Design And Verification Methodologies Breaking Down
As chips become more complex, existing tools and methodologies are stretched to the breaking point.

The post Accellera Preps New Standard For Clock-Domain Crossing appeared first on Semiconductor Engineering.

Blog Review: Feb. 21

Siemens’ John McMillan digs into physical verification maturity for high-density advanced packaging (HDAP) designs and major differences in the LVS verification flow compared to the well-established process for SoCs.

Synopsys’ Varun Shah identifies why a cloud adoption framework is key to getting the most out of deploying EDA tools in the cloud, including by ensuring that different types of necessary compute are accessible for all stages of the design cycle.

Cadence’s Reela Samuel suggests that a chiplet-based approach will provide improved performance and reduced complexity for the automotive sector, enabling OEMs to construct a robust yet flexible electronic architecture.

Keysight’s Emily Yan finds that today’s chip design landscape is facing challenges reminiscent of those encountered by the Large Hadron Collider in managing data volume, version control, and global collaboration.

Ansys’ Raha Vafaei explains why the finite-difference time-domain (FDTD) method, an algorithmic approach to solving Maxwell’s equations, is key for modeling nanophotonic devices, processes, and materials.

Arm’s Ed Player explains the different components of the Common Microcontroller Software Interface Standard (CMSIS) to help identify which are useful for particular Arm-based microcontroller projects.

SEMI’s Mark da Silva, Nishita Rao, and Karim Somani check out the state of digital twins in semiconductor manufacturing and challenges such as the need for standardization and communication between different digital twins.

Plus, check out the blogs featured in the latest Low Power-High Performance newsletter:

Rambus’ Lou Ternullo looks at why performance demands of generative AI and other advanced workloads will require new architectural solutions enabled by CXL.

Ansys’ Raha Vafaei shines a light on how the evolution of photonics engineering will encompass novel materials and cutting-edge techniques.

Siemens’ Keith Felton explains why embracing emerging approaches is essential for crafting IC packages that address the evolving demands of sustainability, technology, and consumer preferences.

Cadence’s Mark Seymour lays out how CFD simulation software can predict time-dependent aspects and various failure scenarios for data center managers.

Arm’s Adnan Al-Sinan and Gian Marco Iodice point out that LLMs already run well on small devices, and that will only improve as models become smaller and more sophisticated.

Keysight’s Roberto Piacentini Filho shows how a modular approach can improve yield, reduce cost, and improve PPA/C.

Quadric’s Steve Roddy finds that smart local memory in an AI/ML subsystem solves SoC bottlenecks.

Synopsys’ Ian Land, Kenneth Larsen, and Rob Aitken detail why the traditional approach using monolithic system-on-chips (SoCs) falls short when addressing the complex needs of modern systems.

The post Blog Review: Feb. 21 appeared first on Semiconductor Engineering.

Why Chiplets Are So Critical In Automotive

Od: John Koon

Chiplets are gaining renewed attention in the automotive market, where increasing electrification and intense competition are forcing companies to accelerate their design and production schedules.

Electrification has lit a fire under some of the biggest and best-known carmakers, which are struggling to remain competitive in the face of very short market windows and constantly changing requirements. Unlike in the past, when carmakers typically ran on five- to seven-year design cycles, the latest technology in vehicles today may well be considered dated within several years. And if they cannot keep up, a whole new crop of startups is producing cheap vehicles whose features can be updated or swapped out as quickly as a software update.

But software has speed, security, and reliability limitations, and being able to customize the hardware is where many automakers are now putting their efforts. This is where chiplets fit in, and the focus now is on how to build enough interoperability across large ecosystems to make this a plug-and-play market. The key factors to enable automotive chiplet interoperability include standardization, interconnect technologies, communication protocols, power and thermal management, security, testing, and ecosystem collaboration.

Similar to non-automotive applications at the board level, many design efforts are focusing on a die-to-die approach, which is driving a number of novel design considerations and tradeoffs. At the chip level, the interconnects between various processors, chips, memory, and I/O are becoming more complex due to increased design performance requirements, spurring a flurry of standards activities. Different interconnect and interface types have been proposed to serve varying purposes, while emerging chiplet technologies for dedicated functions — processors, memories, and I/Os, to name a few — are changing the approach to chip design.

“There is a realization by automotive OEMs that to control their own destiny, they’re going to have to control their own SoCs,” said David Fritz, vice president of virtual and hybrid systems at Siemens EDA. “However, they don’t understand how far along EDA has come since they were in college in 1982. Also, they believe they need to go to the latest process node, where a mask set is going to cost $100 million. They can’t afford that. They also don’t have access to talent because the talent pool is fairly small. With all that together comes the realization by the OEMs that to control their destiny, they need a technology that’s developed by others, but which can be combined however needed to have a unique differentiated product they are confident is future-proof for at least a few model years. Then it becomes economically viable. The only thing that fits the bill is chiplets.”

Chiplets can be optimized for specific functions, which can help automakers meet reliability, safety, and security requirements with technology that has been proven across multiple vehicle designs. In addition, they can shorten time to market and ultimately reduce the cost of different features and functions.

Demand for chips has been on the rise for the past decade. According to Allied Market Research, global automotive chip demand will grow from $49.8 billion in 2021 to $121.3 billion by 2031. That growth will attract even more automotive chip innovation and investment, and chiplets are expected to be a big beneficiary.
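Those two endpoints imply compound annual growth of roughly 9%; a quick worked check of the cited figures:

# Implied CAGR from the Allied Market Research figures cited above
# ($49.8B in 2021 growing to $121.3B by 2031).
cagr = (121.3 / 49.8) ** (1 / 10) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~9.3% per year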

But the marketplace for chiplets will take time to mature, and it will likely roll out in phases. Initially, a vendor will provide different flavors of proprietary dies. Then, partners will work together to supply chiplets to support each other, as has already happened with some vendors. The final stage will be universally interoperable chiplets, as supported by UCIe or some other interconnect scheme.

Getting to the final stage will be the hardest, and it will require significant changes. To ensure interoperability, large enough portions of the automotive ecosystem and supply chain must come together, including hardware and software developers, foundries, OSATs, and material and equipment suppliers.

Momentum is building
On the plus side, not all of this is starting from scratch. At the board level, modules and sub-systems always have used onboard chip-to-chip interfaces, and they will continue to do so. Various chip and IP providers, including Cadence, Diode, Microchip, NXP, Renesas, Rambus, Infineon, Arm, and Synopsys, provide off-the-shelf interface chips or IP to create the interface silicon.

The Universal Chiplet Interconnect Express (UCIe) Consortium is the driving force behind the die-to-die, open interconnect standard. The group released its latest UCIe 1.1 specification in August 2023. Board members include Alibaba, AMD, Arm, ASE, Google Cloud, Intel, Meta, Microsoft, NVIDIA, Qualcomm, Samsung, and others. Industry partners are showing widespread support. AIB and Bunch of Wires (BoW) also have been proposed. In addition, Arm just released its own Chiplet System Architecture, along with an updated AMBA spec to standardize protocols for chiplets.

“Chiplets are already here, driven by necessity,” said Arif Khan, senior product marketing group director for design IP at Cadence. “The growing processor and SoC sizes are hitting the reticle limit and the diseconomies of scale. Incremental gains from process technology advances are lower than rising cost per transistor and design. The advances in packaging technology (2.5D/3D) and interface standardization at a die-to-die level, such as UCIe, will facilitate chiplet development.”

Nearly all of the chiplets used today are developed in-house by big chipmakers such as Intel, AMD, and Marvell, because they can tightly control the characteristics and behavior of those chiplets. But there is work underway at every level to open this market to more players. When that happens, smaller companies can begin capitalizing on what the high-profile trailblazers have accomplished so far, and innovating around those developments.

“Many of us believe the dream of having an off-the-shelf, interoperable chiplet portfolio will likely take years before becoming a reality,” said Guillaume Boillet, senior director of strategic marketing at Arteris, adding that interoperability will emerge from groups of partners who are addressing the risk of incomplete specifications.

This also is raising the attractiveness of FPGAs and eFPGAs, which can provide a level of customization and updates for hardware in the field. “Chiplets are a real thing,” said Geoff Tate, CEO of Flex Logix. “Right now, a company building two or more chiplets can operate much more economically than a company building near-reticle-size die with almost no yield. Chiplet standardization still appears to be far away. Even UCIe is not a fixed standard yet. Not all agree on UCIe, bare die testing, and who owns the problem when the integrated package doesn’t work, etc. We do have some customers who use or are evaluating eFPGA for interfaces where standards are in flux like UCIe. They can implement silicon now and use the eFPGA to conform to standards changes later.”

There are other efforts supporting chiplets, as well, although for somewhat different reasons — notably, the rising cost of device scaling and the need to incorporate more features into chips, which are reticle-constrained at the most advanced nodes. But those efforts also pave the way for chiplets in automotive, and there is strong industry backing to make this all work. For example, under the sponsorship of SEMI, ASME, and three IEEE Societies, the new Heterogeneous Integration Roadmap (HIR) examines various microelectronics design, materials, and packaging issues to chart a course for the semiconductor industry. Its current focus includes 2.5D, 3D-ICs, wafer-level packaging, integrated photonics, MEMS and sensors, system-in-package (SiP), aerospace, automotive, and more.

At the recent Heterogeneous Integration Global Summit 2023, representatives from AMD, Applied Materials, ASE, Lam Research, MediaTek, Micron, Onto Innovation, TSMC, and others demonstrated strong support for chiplets. Another group that supports chiplets is the Chiplet Design Exchange (CDX) working group, which is part of the Open Domain Specific Architecture (ODSA) and the Open Compute Project Foundation (OCP). The CDX charter focuses on the various characteristics of chiplets and chiplet integration, including electrical, mechanical, and thermal design exchange standards for 2.5D-stacked and 3D integrated circuits (3D-ICs). Its representatives include Ansys, Applied Materials, Arm, Ayar Labs, Broadcom, Cadence, Intel, Macom, Marvell, Microsemi, NXP, Siemens EDA, Synopsys, and others.

“The things that automotive companies want in terms of what each chiplet does in terms of functionality is still in an upheaval mode,” Siemens’ Fritz noted. “One extreme has these problems, the other extreme has those problems. This is the sweet spot. This is what’s needed. And these are the types of companies that can go off and do that sort of work, and then you could put them together. Then this interoperability thing is not a big deal. The OEM can make it too complex by saying, ‘I have to handle that whole spectrum of possibilities.’ The alternative is that they could say, ‘It’s just like a high speed PCIe. If I want to communicate from one to the other, I already know how to do that. I’ve got drivers that are running my operating system.’ That would solve an awful lot of problems, and that’s where I believe it’s going to end up.”

One path to universal chiplet development?

Moving forward, chiplets are a focal point for both the automotive and chip industries, and that will involve everything from chiplet IP to memory interconnects and customization options and limitations.

For example, Renesas Electronics announced in November 2023 plans for its next-generation SoCs and MCUs. The company is targeting all major applications across the automotive digital domain, including advance information about its fifth-generation R-Car SoC for high-performance applications with advanced in-package chiplet integration technology, which is meant to provide automotive engineers greater flexibility to customize their designs.

Renesas noted that if more AI performance is required in Advanced Driver Assistance Systems (ADAS), engineers will have the capability to integrate AI accelerators into a single chip. The company said this roadmap comes after years of collaboration and discussions with Tier 1 and OEM customers, which have been clamoring for a way to accelerate development without compromising quality, including designing and verifying the software even before the hardware is available.

“Due to the ever increasing needs to increase compute on demand, and the increasing need for higher levels of autonomy in the cars of tomorrow, we see challenges in monolithic solutions scaling and providing the performance needs of the market in the upcoming years,” said Vasanth Waran, senior director for SoC Business & Strategies at Renesas. “Chiplets allows for the compute solutions to scale above and beyond the needs of the market.”

Renesas announced plans to create a chiplet-based product family specifically targeted at the automotive market starting in 2025.

Standard interfaces allow for SoC customization
It is not entirely clear how much overlap there will be between standard processors, which is where most chiplets are used today, and chiplets developed for automotive applications. But the underlying technologies and developments certainly will build off each other as this technology shifts into new markets.

“Whether it is an AI accelerator or ADAS automotive application, customers need standard interface IP blocks,” noted David Ridgeway, senior product manager, IP accelerated solutions group at Synopsys. “It is important to provide fully verified IP subsystems around their IP customization requirements to support the subsystem components used in the customers’ SoCs. When I say customization, you might not realize how customizable IP has become over the course of the last 10 to 20 years, on the PHY side as well as the controller side. For example, PCI Express has gone from PCIe Gen 3 to Gen 4 to Gen 5 and now Gen 6. The controller can be configured to support multiple bifurcation modes of smaller link widths, including one x16, two x8, or four x4. Our subsystem IP team works with customers to ensure all the customization requirements are met. For AI applications, signal and power integrity is extremely important to meet their performance requirements. Almost all our customers are seeking to push the envelope to achieve the highest memory bandwidth speeds possible so that their TPU can process many more transactions per second. Whenever the applications are cloud computing or artificial intelligence, customers want the fastest response rate possible.”
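The lane accounting behind those bifurcation modes is easy to verify. The checker below is purely illustrative (it is not any vendor’s configuration API): a mode is treated as valid when every sub-link width is a power of two and the widths together consume exactly the 16 available lanes.

# Illustrative check of PCIe x16 bifurcation modes (1x16, 2x8, 4x4).
TOTAL_LANES = 16

def valid_bifurcation(widths):
    power_of_two = all(w > 0 and w & (w - 1) == 0 for w in widths)
    return power_of_two and sum(widths) == TOTAL_LANES

for mode in ([16], [8, 8], [4, 4, 4, 4], [8, 4, 2]):
    print(mode, "->", "ok" if valid_bifurcation(mode) else "invalid")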

Fig 1: IP blocks including processor, digital, PHY, and verification help developers implement the entire SoC. Source: Synopsys

Optimizing PPA serves the ultimate goal of increasing efficiency, and this makes chiplets particularly attractive in automotive applications. When UCIe matures, it is expected to improve overall performance dramatically. For example, UCIe can deliver a shoreline bandwidth of 28 to 224 GB/s/mm in a standard package, and 165 to 1317 GB/s/mm in an advanced package. This represents a performance improvement of 20- to 100-fold. Bringing latency down from 20ns to 2ns represents a 10-fold improvement. Around 10 times greater power efficiency, at 0.5 pJ/b (standard package) and 0.25 pJ/b (advanced package), is another plus. The key is shortening the interface distance whenever possible.
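The ratios that follow directly from those figures are easy to check; a quick worked example, using only the numbers quoted above:

# Worked check of the UCIe figures cited above.
standard_bw = (28, 224)      # GB/s per mm of shoreline, standard package
advanced_bw = (165, 1317)    # GB/s per mm, advanced package
latency_old_ns, latency_new_ns = 20, 2
energy_pj_per_bit = {"standard": 0.5, "advanced": 0.25}

print("latency improvement:", latency_old_ns / latency_new_ns, "x")  # 10x
print("advanced vs. standard shoreline bandwidth:",
      round(advanced_bw[0] / standard_bw[0], 1), "to",
      round(advanced_bw[1] / standard_bw[1], 1), "x")  # ~5.9x at both ends
print("energy per bit (pJ/b):", energy_pj_per_bit)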

To optimize chiplet designs, the UCIe Consortium provides some suggestions:

  • Careful planning of architectural cut-lines (i.e., chiplet boundaries), optimizing for power, latency, silicon area, and IP reuse. For example, customizing one chiplet that needs a leading-edge process node while re-using other chiplets on older nodes may impact cost and time.
  • Thermal and mechanical packaging constraints need to be planned out for package thermal envelopes, hot spots, chiplet placements, and I/O routing and breakouts.
  • Process nodes need to be carefully selected, particularly in the context of the associated power delivery scheme.
  • The test strategy for chiplets and packaged/assembled parts needs to be developed up front to ensure silicon issues are caught at the chiplet-level testing phase rather than after they are assembled into a package.

Conclusion
The idea of standardizing die-to-die interfaces is catching on quickly, but the path to get there will take time, effort, and a lot of collaboration among companies that rarely talk with each other. Building a vehicle takes one determined carmaker. Building a vehicle with chiplets requires an entire ecosystem of developers, foundries, OSATs, and material and equipment suppliers working together.

Automotive OEMs are experts at putting systems together and at finding innovative ways to cut costs. But it remains to be seen how quickly and effectively they can build and leverage an ecosystem of interoperable chiplets to shrink design cycles, improve customization, and adapt to a world in which leading-edge technology may be outdated by the time it is fully designed, tested, and available to consumers.

— Ann Mutschler contributed to this report.

Related Reading
Automotive Relationships Shifting With Chiplets
As the automotive ecosystem balances the best approaches for designing in increasingly advanced features, how companies interact is still evolving.

The post Why Chiplets Are So Critical In Automotive appeared first on Semiconductor Engineering.
