FreshRSS

  • ✇Semiconductor Engineering
  • Blog Review: Aug. 21 – Jesse Allen

Blog Review: Aug. 21

August 21, 2024, 09:01

Cadence’s Reela Samuel explores the critical role of PCIe 6.0 equalization in maintaining signal integrity and solutions to mitigate verification challenges, such as creating checkers to verify all symbols of TS0, ensuring the correct functioning of scrambling, and monitoring phase and LTSSM state transitions.

Siemens’ John McMillan introduces an advanced packaging flow for Intel’s Embedded Multi-die Interconnect Bridge (EMIB) technology, including technical challenges, design methodologies, and the integration of EMIBs in system-level package designs.

Synopsys’ Dustin Todd checks out what’s next for the U.S. CHIPS and Science Act, including the establishment of the National Semiconductor Technology Center and the allocation of $13 billion for research and development efforts.

Keysight’s Roberto Piacentini Filho explores the challenges of managing the large design files and massive volumes of data involved in a modern chip design project, which can take up as much as a terabyte of disk space and involve hundreds of thousands of files.

Arm’s Sandeep Mistry shows how ML models developed for mobile computer vision applications and requiring tens to hundreds of millions of multiply-accumulate (MACs) operations per inference can be deployed to a modern microcontroller.

Ansys’ Aliyah Mallak explores an effort to manufacture biotech products in microgravity and how simulation helps ensure payloads containing delicate, temperature-sensitive spore samples and bioreactors make it safely to the International Space Station or low Earth orbit.

Micron Technology’s Amit Srivastava, ULVAC’s Brian Coppa, and SEMI’s Mark da Silva suggest tackling corporate sustainability goals with a bottom-up approach that leverages various sensing technologies, at the cleanroom, sub-fab, and facilities levels for both greenfield and brownfield device-making facilities, to enable predictive analytics.

And don’t miss the blogs featured in the latest Manufacturing, Packaging & Materials newsletter:

Amkor’s JeongMin Ju shows how to prevent critical failures in copper RDLs caused by overcurrent-induced fusing.

Synopsys’ Al Blais discusses curvilinear checking and fracture requirements for the MULTIGON era.

Lam Research’s Dempsey Deng compares the parasitic capacitance of a 6F2 honeycomb DRAM device to a 4F2 VCAT DRAM structure.

Brewer Science’s Jessica Albright covers debonding methods, thermal, topography, adhesion, and thickness variation.

SEMI’s John Cooney reviews a fireside chat between the President of SEMI Americas and the U.S. Under Secretary of State for Economic Growth, Energy, and the Environment on securing supply chains.


  • ✇Semiconductor Engineering
  • U.S. Proposes Restrictions On Tech Investments In China – Linda Christensen

U.S. Proposes Restrictions On Tech Investments In China

June 24, 2024, 09:01

The U.S. proposed new regulations to curtail American investments in Chinese technologies that pose a national security threat, specifically calling out semiconductors and microelectronics, quantum information technologies, and AI.

The draft regulations come nearly a year after the Biden administration issued an executive order prohibiting investments in sensitive technologies used to accelerate China’s military technologies. “This proposed rule advances our national security by preventing the many benefits certain U.S. investments provide — beyond just capital — from supporting the development of sensitive technologies in countries that may use them to threaten our national security,” said Paul Rosen, assistant secretary of the Treasury for investment security, in a release.

Prohibited Semiconductor Transactions

The 165-page proposal defines prohibited semiconductor transactions (pages 133-134), which include:

  • EDA software for the design of ICs or advanced packaging;
  • Front-end semiconductor fab equipment designed for performing the volume fabrication of ICs, including equipment used in the production stages from a blank wafer or substrate to a completed wafer or substrate;
  • Equipment for performing volume advanced packaging;
  • Commodity, material, software, or technology designed exclusively for use in or with EUV lithography equipment;
  • Design of ICs for operation at or below 4.5 Kelvin, and ICs that meet or exceed performance criteria in Export Control Classification Number 3A090;
  • Fabrication of logic ICs using a non-planar transistor architecture, or with a production technology node of 16/14 nanometers or less, including FD-SOI ICs;
  • Fabrication of NAND with 128 layers or more, or DRAM ICs at 18nm half-pitch or less;
  • Fabrication of gallium-based compound semiconductors, or ICs using graphene transistors or carbon nanotubes, and
  • Any IC using advanced packaging techniques.

 

Prohibited AI And Supercomputing Transactions

Prohibited transactions for supercomputers and artificial intelligence (pages 134-135) include:

  • Any supercomputer enabled by advanced ICs that can provide a theoretical compute capacity of 100 or more double-precision (64-bit) petaflops, or 200 or more single-precision (32-bit) petaflops of processing power within a 41,600 cubic foot or smaller envelope;
  • Any quantum computer, or production of any of the critical components required to produce a quantum computer, such as a dilution refrigerator or two-stage pulse tube cryocooler;
  • Quantum sensing platforms for military, government or mass-surveillance end use;
  • Quantum network or quantum communication system, and
  • AI for military end use, weapons targeting, target identification, and military decision-making.

Written comments are due by Aug. 4, 2024. The regulations are expected to be finalized this calendar year.

More Reading:
Chip Industry Week In Review
GF, BAE team up; $2B Czech SiC plant; SEMI’s capacity report; imec’s CFETs, beamforming transmitters; Germany chip plant postponed; EUV patterning advances; interconnect metals; plasma etch.


  • ✇Semiconductor Engineering
  • DAC Panel Could Spark Fireworks – Brian Bailey

DAC Panel Could Spark Fireworks

May 30, 2024, 09:08

Panels can often become love fests. While a title may sound controversial, it turns out that everyone quickly finds that all the panelists agree on the major points. This is sometimes the result of how the panel was put together – the proposal came from one company, and they wanted to get their customers or clients onto the panel. They are unlikely to ask a major competitor to be part of the event.

These panels can become livelier if they have a moderator who opens the panel up to audience questions and the audience decides to throw a spanner in the works. This tends to happen a lot more in the technical panels, because each researcher, who may have taken a different approach to a problem, wants to introduce the audience to their alternative solution. But the pavilion panels tend to be a little more sedate – in part because nobody wants to burn bridges within such a tight industry.

It is quite common for me to moderate a panel each DAC, and this year is no exception. I will be moderating a technical panel whose title is directly confrontational: “Why Is EDA Playing Catchup to Disruptive Technologies Like AI? What Can We Do to Change This?”

The abstract for the panel talks about EDA having a closed mindset, consistently missing disruptive changes by choosing incremental approaches. I know that when I first read it – when I was invited to be the chair for it – I was immediately up in arms.

Twenty years ago, while working at an EDA company, I attempted to drive such disruptive changes in the verification industry. Several times a year, I would go out and talk to our customers and exchange ideas with them about the problems they were facing. We would present ideas about both incremental and disruptive developments we had underway. The message was always the same. “Can we have the incremental changes yesterday? And we don’t have time to think about the longer-term ideas.” It reminded me of the cartoon where a stone-age person is pulling a cart with square wheels and doesn’t have time to listen to the person offering him round ones.

Even so, we did go ahead and develop some of them, and a few of them did achieve an element of success. But to go from first adopters to more mainstream interests often took 10 years. Even today, many of those are still niche tools, and probably money sinks for the companies that developed them. Examples are high-level synthesis and virtual prototypes, the only two pieces of the whole ESL movement that survived. Still, they believe that long term, the industry will need them. Many other pieces completely fell by the wayside, such as hardware/software co-design. That, however, may start to resurface thanks to RISC-V.

Many of the tools associated with ESL were direct collaborations between EDA companies and researchers. I established a research collaboration program with the University of Washington that looked at multi-abstraction simulation and protocol checking, and had elements of system synthesis. The only thing that came out of that was hardware/software co-verification. Protocol checking, in the form of VIP, also has become popular, although not directly because of this program. Co-verification had a useful life of about five years before SystemC made the solution obsolete.

Many disruptive innovations actually have come from industry, then were commercialized by EDA companies. SystemC is one example of that. Constrained random verification is another. Portable Stimulus, while still nascent, also was developed within industry. These solutions have an advantage in that they were developed to solve a significant enough problem within the industry that they have broader appeal. There is little that has actually come from academia in recent decades.

The panel title also talks specifically about AI and accuses EDA of being behind already. It is not clear that they are. Thirty years ago, you could go to DAC and see all the new tools and flows that EDA companies were working on. Many of them might be ready within a year or two. But today, EDA companies will make no announcements until at least a few of their customers, that they chose as development partners, have had silicon success.

A typical chip cycle is 18 months. Given that we are beginning to hear about some of these tools today, they may have been in use for a good part of those 18 months. Plus, development of those tools must have started about a year before that. Let’s remember that ChatGPT only came to the fore 18 months ago, and it should be quite obvious why few generative AI products have yet been announced. The fact that there are so many EDA AI announcements would make me think that EDA companies were very quick off the starting blocks.

The panelists are Prith Banerjee – Ansys, who has written a book about disruption; Jan Rabaey – professor in the Graduate School in the Electrical Engineering and Computer Sciences department at the University of California, Berkeley, who also serves as the CTO of the Systems Technology Co-Optimization division at imec; Samir Mittal, corporate VP for Silicon Systems AI at Micron Technology; James Scapa, founder and CEO of Altair; and Charles Alpert, fellow at Cadence Design Systems.

If you are going to be at DAC and have access to the technical program, this 90-minute panel may be worth your time. It takes place Wednesday, June 26, at 10:30am. Come ready with your questions, because I will certainly be opening this panel up to the audience very quickly. While sparks may fly, please try to keep your cool and be respectful.


  • ✇Semiconductor Engineering
  • Vision Is Why LLMs Matter On The Edge – Ben Gomes

Vision Is Why LLMs Matter On The Edge

By: Ben Gomes
May 30, 2024, 09:05

Large Language Models (LLMs) have taken the world by storm since the 2017 Transformers paper, but pushing them to the edge has proved problematic. Just this year, Google had to revise its plans to roll out Gemini Nano on all new Pixel models — the down-spec’d hardware options proved unable to host the model as part of a positive user experience. But the implementation of language-focused models at the edge is perhaps the wrong metric to look at. If you are forced to host a language-focused model for your phone or car in the cloud, that may be acceptable as an intermediate step in development. Vision applications of AI, on the other hand, are not so flexible: many of them rely on low latency and high dependability. If a vehicle relies on AI to identify that it should not hit the obstacle in front of it, a blip in contacting the server can be fatal. Accordingly, the most important LLMs to fit on the edge are vision models — the models whose purpose is most undermined by the reliance on remote resources.

“Large Language Models” can be an imprecise term, so it is worth defining. The original 2017 Transformer LLM that many see as kickstarting the AI rush was 215 million parameters. BERT was giant for its time (2018) at 335 million parameters. Both of these models might be relabeled as “Small Language Models” by some today to distinguish from models like GPT4 and Gemini Ultra with as much as 1.7 trillion parameters, but for the purposes here, all fall under the LLM category. All of these are language models though, so why does it matter for vision? The trick here is that language is an abstract system of deriving meaning from a structured ordering of arbitrary objects. There is no “correct” association of meaning and form in language which we could base these models on. Accordingly, these arbitrary units are substitutable — nothing forces architecture developed for language to only be applied to language, and all the language objects are converted to multidimensional vectors anyway. LLM architecture is thus highly generalizable, and typically retains the core strength from having been developed for language: a strong ability to carry through semantic information. Thus, when we talk about LLMs at the edge, it can be a language model cross-trained on image data, or it might be a vision-only model which is built on the foundation of technology designed for language. At the software and hardware levels, for bringing models to the edge, this distinction makes little difference.

Vision LLMs on the edge flexibly apply across many different use cases, but key applications where they show the greatest advantages are: embodied agents (an especially striking example of the benefits of cross-training embodied agents on language data can be seen with Dynalang’s advantages over DreamerV3 in interpreting the world due to superior semantic parsing), inpainting (as seen with the latent diffusion models), LINGO-2’s decision-making abilities in self-driving vehicles, context-aware security (such as ViViT), information extraction (Gemini’s ability to find and report data from video), and user assistance (physician aids, driver assist, etc). Specifically notable and exciting here is the ability for Vision LLMs to leverage language as a lossy storage and abstraction of visual data for decision-making algorithms to then interact with — especially as seen in LINGO-2 and Dynalang. Many of these vision-oriented LLMs depend on edge deployment to realize their value, and they benefit from the work that has already been done for optimizing language-oriented LLMs. Despite this, vision LLMs are still struggling for edge deployment just as the language-oriented models are. The improvements for edge deployments come in three classes: model architecture, system resource utilization, and hardware optimization. We will briefly review the first two and look more closely at the third since it often gets the least attention.

Model architecture optimizations include the optimizations that must be made at the model level: “distilling” models to create leaner imitators, restructuring where models spend their resource budget (such as the redistribution of transformer modules in Stable Diffusion XL) and pursuing alternate architectures (state-space models, H3 modules, etc.) to escape the quadratically scaling costs of transformers.
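To make the first of these concrete, here is a minimal sketch in Python of the classic knowledge-distillation objective (an illustrative example of the general technique, not code from any of the projects mentioned): a smaller student is trained against the softened output distribution of a larger teacher, blended with the ordinary hard-label loss.

    import numpy as np

    def softmax(logits, temperature=1.0):
        z = logits / temperature
        z = z - z.max(axis=-1, keepdims=True)   # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        """Blend of soft-target KL divergence (scaled by T^2) and hard-label cross-entropy."""
        p_t = softmax(teacher_logits, T)
        p_s = softmax(student_logits, T)
        kl = np.sum(p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9)), axis=-1)
        ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-9)
        return float(np.mean(alpha * T * T * kl + (1 - alpha) * ce))

    # Toy batch: 2 examples, 4 classes, made-up logits.
    teacher = np.array([[4.0, 1.0, 0.5, 0.2], [0.1, 3.5, 0.3, 0.4]])
    student = np.array([[2.0, 1.2, 0.8, 0.5], [0.2, 2.0, 0.6, 0.9]])
    print(f"distillation loss: {distillation_loss(student, teacher, np.array([0, 1])):.3f}")

The student only has to match the teacher's output behavior, which is what lets a much leaner model imitate a larger one.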

System resource optimizations are all the things that can be done in software to an already complete model. Quantization (to INT8, INT4, or even INT2) is a common focus here for both latency and memory burden, but of course compromises accuracy. Speculative decoding can improve utilization and latency. And of course, tiling, such as seen with FlashAttention, has become near-ubiquitous for improving utilization and latency.
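To illustrate the quantization step, the short sketch below shows symmetric per-tensor INT8 quantization in NumPy (a hedged, generic example rather than any particular toolchain's implementation), along with the 4x memory saving and the accuracy cost it introduces.

    import numpy as np

    def quantize_int8(weights):
        """Symmetric per-tensor INT8 quantization: w ~= scale * q."""
        scale = float(np.max(np.abs(weights))) / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(256, 256).astype(np.float32)      # toy weight matrix
    q, scale = quantize_int8(w)
    err = float(np.abs(w - dequantize(q, scale)).mean())
    print(f"{w.nbytes} -> {q.nbytes} bytes (4x smaller), mean abs error {err:.5f}")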

Finally, there are hardware optimizations. The first option here is a general-purpose GPU, TPU, NPU or similar, but those tend to be best suited for settings where capability is needed without demanding streamlined optimization such as might be the case on a home computer. Custom hardware, such as purpose-built NPUs, generally has the advantage when the application is especially sensitive to latency or resource consumption, and this covers much of the applications for vision LLMs.

Exploring this trade-off further: Stable Diffusion’s architecture and resource demands have been discussed here before, but it is worth circling back to it as an example of why hardware solutions are so important in this space. Using Stable Diffusion 1.5 for simplicity, let us focus specifically on the U-Net component of the model. The model has a rough U-shaped construction: it downsamples repeatedly on the left until it hits the bottom of the U, and then upsamples up the right side, bringing back in residual connections from the left at each stage.

This U-Net implementation has 865 million parameters and entails 750 billion operations. The parameters are a fair proxy for the memory burden, and the operations are a direct representation of the compute demands. The distribution of these burdens on resources is not even, however. If we plot the parameters and operations for each layer, a clear picture emerges.

These graphs show a model that is destined for gross inefficiencies at every step. Most of the memory burden peaks in the center, whereas the compute is heavily taxed at the two tails but underutilized in the center. These inefficiencies come with costs. The memory peak can overwhelm on-chip storage, thus incurring I/O operations, or else requiring a large excess of unused memory for most of the graph. Similarly, storing residuals for later incurs I/O latency and higher power draws. The underutilization of the compute power at the center of the graph means that the processor will have wasteful power draw as it cannot use the tail of the power curve as it does sparser operations. While software interventions can also help here, this is exactly the kind of problem that custom hardware solutions are meant to address. Custom silicon tailored to the model can let you offload some of that memory burden into additional compute cycles at the center of the graph without incurring extra I/O operations by recomputing the residual connections instead of kicking them out to memory. In doing so, the total required memory drops, and the processor can remain at full utilization. Rightsizing the resource allotment and finding ways to redistribute the burdens are key components to how these models can be best deployed at the edge.
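As a toy illustration of that trade-off, the sketch below (our own, with hypothetical per-stage numbers rather than Stable Diffusion's real figures) plans which residual connections to keep in on-chip SRAM and which to recompute so that the resident footprint fits a given budget.

    # Toy illustration (hypothetical numbers): trade residual storage for
    # recomputation so the peak on-chip footprint fits an SRAM budget.

    def plan_residuals(residuals, sram_budget):
        """residuals: list of (name, bytes_if_stored, ops_to_recompute).
        Greedily recompute the largest residuals until the rest fit on-chip."""
        keep = sorted(residuals, key=lambda r: r[1], reverse=True)  # largest first
        recompute, extra_ops = [], 0
        stored_bytes = sum(r[1] for r in keep)
        while stored_bytes > sram_budget and keep:
            name, size, ops = keep.pop(0)   # stop storing the biggest consumer
            recompute.append(name)
            stored_bytes -= size
            extra_ops += ops
        return recompute, stored_bytes, extra_ops

    # Hypothetical per-stage residuals: (name, bytes if stored, ops to recompute).
    stages = [("enc1", 40_000_000, 2_000_000_000),
              ("enc2", 20_000_000, 4_000_000_000),
              ("enc3", 10_000_000, 6_000_000_000),
              ("mid",   5_000_000, 8_000_000_000)]
    recomputed, resident, extra = plan_residuals(stages, sram_budget=32_000_000)
    print(f"recompute {recomputed}, {resident / 1e6:.0f} MB resident, +{extra / 1e9:.0f}B extra ops")

The point is not the specific heuristic but the exchange rate it exposes: memory pressure at the peak of the U can be converted into extra compute exactly where the compute units would otherwise sit idle.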

Despite their name, LLMs are important to the vision domain for their flexibility in handling different inputs and their strength at interpreting meaning in images. Whether used for embodied agents, context-aware security, or user assistance, their use at the edge requires a dependable low latency which precludes cloud-based solutions, in contrast to other AI applications on edge devices. Bringing them successfully to the edge asks for optimizations at every level, and we have seen already some of the possibilities at the hardware level. Conveniently, the common architecture with language-oriented LLMs means that many of the solutions needed to bring these most essential models to the edge in turn may also generalize back to the language-oriented models which donated the architecture in the first place.


  • ✇Semiconductor Engineering
  • RISC-V Heralds New Era Of Cooperation – Brian Bailey

RISC-V Heralds New Era Of Cooperation

May 30, 2024, 09:05

RISC-V is paving the way for open source to become accepted within the hardware community, creating a level of industry collaboration never seen in the past, while revitalizing the connection between academia and industry.

The big question is whether this arrangement is just a placeholder while the industry re-learns how to develop processors, or whether this processor architecture is something very different. In either case, there is a clear and pressing need for more flexible processor architectures, and at least for now, RISC-V has filled a void.

“RISC-V was born out of academia and has had strong collaboration within universities from day one,” says Loren Hobbs, vice president of product and business development at Bluespec. “This collaboration continues today, with many of the most popular open-source RISC-V processors having come from universities. Organizations such as OpenHW Group and CHIPS Alliance serve a central and critical role in driving the collaboration, which is bi-directional between the academic community and industry.”

Collaboration of this type has not existed with the industrial community in the past. “We are learning from each other,” says Florian Wohlrab, CEO at OpenHW. “We are learning best practices for verification. At the same time, we are learning what things to avoid. It is growing where people say, ‘Yes, I really get benefit from sharing ideas.'”

The need for processor flexibility exists within industry as well as academia. “There is a need within the industry for diversification on the processor front,” says Neil Hand, director of marketing at Siemens EDA. “In the past, this led to a fragmented set of companies that couldn’t work together. They didn’t see the value of working together. But RISC-V has a cohesive central organization where anyone who wants to get into the processor space can collaborate. They don’t have to expose their secret sauce, but they get to benefit from each other. A rising tide lifts all boats, and that’s really the situation we’re at with RISC-V.”

Longevity
Whether the industry can build upon this success, or whether it fizzles out over time, remains to be seen. But at least for now, RISC-V’s momentum is growing. “We are at the beginning of a revolution in hardware design,” says OpenHW’s Wohlrab. “We saw the same thing for software when Linux came out 20 or so years ago. No one was really thinking about sharing software or collaboratively developing software. There were some small open-source ventures, but working together on a big project took a long time to develop. Now we are all sharing software, all co-working. But for hardware, we’re just at the beginning of this new concept, and a lot of people need to understand that we can do the same for hardware as we did for software.”

Underlying RISC-V’s success is widespread collaboration. “One of the pillars sustaining the success of RISC-V is customization that works with the ecosystem and leverages a well-defined process,” says Sergio Marchese, vice president of application engineering at SmartDV. “RISC-V vendors face the challenge of showing how their processor customization capabilities serve an application and demonstrating the complete process on real hardware. Without strategic partnerships, RISC-V vendors must walk a much more challenging, time-consuming, and resource-intensive road.”

That framework is what makes it unique. “RISC-V has formed this framework for collaboration, and it fixes everything,” says Siemens’ Hand. “Now, when a university has a really cool idea for memory tagging in a processor design, they don’t have to build the compilers, they don’t have to build the reference platform. They already exist. Maybe a compiler optimization startup has this great idea for handling code optimization. They don’t have to build the rest of the ecosystem. When a processor IP company has this great idea, they can become focused within this bigger picture. That’s the unique nature of it. It’s not just a processor specification.”

Historically, one of the problems associated with open-source hardware was quality, because finding bugs in silicon is expensive. OpenHW is an important piece of the puzzle. “Why should everyone reinvent the wheel by themselves?” asks Wohlrab. “Why can’t we get the basic building blocks, some basic chips, take some design from academia, which has reasonably good quality, and build on them, verify them together. We are verifying with different tools, making sure we get a high coverage, and then everyone can go off and use them in their own chips for mass production, for volume shipment.”

This benefits companies both large and small. “There are several processor vendors that have switched to RISC-V,” says Hand. “Synopsys has moved to RISC-V. Andes has moved to RISC-V. MIPS has moved to RISC-V. Why? Because they can leverage the whole ecosystem. The downside of it is commoditization, which as a customer is really beneficial because you can delay choosing a processor till later in the design flow. Your early decision is to use the Arm ecosystem or RISC-V, and then you can work through it. That creates an interesting set of dynamics. You can start to create new opportunities for companies that develop and deliver IP, because you can benchmark them, swap them in and out, and see which one works. On the flip side, it makes it awful from a lock-in perspective once you’re in that socket.”

Fragmentation
Of course, there will be some friction in the system. “In the early days of RISC-V there was nearly a 1:1 balance between contributors and consumers of the technology,” says Geir Eide, director, product management for Siemens EDA. “Today there are thousands of RISC-V consumers, but only a small percentage of those will be contributors. There is a risk that there will be a disconnect between them. If, for instance, a particular market or regional segment is growing at a higher pace than others, or other market segments and regions are more conservative, they tend to stick to established solutions longer. That increases the risk that it could lead to fragmentation.”

Is that likely to impact development long term? “We do not believe that RISC-V will become regionally concentrated, although there may be regional concentrations of focus within the broad set of implementation choices provided by RISC-V,” says Bluespec’s Hobbs. “A prime example of this is the Barcelona Supercomputing Center, creating a regional focus area for high-performance computing using RISC-V. However, while there may be regional focus areas, this does not mean that the RISC-V standard is, or will become, fragmented. In fact, one of the key tenets of the creation and foundation of RISC-V was preventing fragmentation of the ISA, and it continues to be a key function of RISC-V International.”

China may be a different story. “A lot of companies in China are creating RISC-V cores for internal consumption — for political reasons mostly,” says John Min, vice president of customer service at Arteris. “I think China will go 100% RISC-V for embedded, but it’s a one-way street. They will keep leveraging what the Western companies do and enhance it. China will continue sucking all advancements, such as vectorization, or the special domain-specific acceleration enhancements. They will create their own and make it their own internally, but they will give nothing back.”

Such splits have occurred in the past. “Design languages are the most recent example of that,” says Hand. “There was a regional split, and you had Europe focus on VHDL while America went with Verilog. With RISC-V, there will be that regional split where people will go off and do their things regionally. Europe has focused projects, India has theirs, but they’re still doing it within this framework. It’s this realization that everyone benefits. They’re not doing it to benefit the other people. They’re doing it ultimately to save themselves effort, to save themselves cost, but they realize that by doing it in that framework it is a net benefit to everyone.”

Bi-directionality
An important element is that everyone benefits, and that has to stretch across the academic/commercial boundary. “RISC-V has propelled a new degree of collaboration between academia and commercial organizations,” says Dave Kelf, CEO at Breker. “It’s noticeable that institutions such as Harvey Mudd College in Claremont, California, and ETH in Zurich, Switzerland, to name two, have produced advanced processor designs as a teaching aid, and have collaborated with multiple companies on their verification and design. This has been further advanced by OpenHW Group, which has made these designs accessible to the industry. This bi-directional collaboration benefits the tool providers to further enhance their offerings working on advanced, open devices, while also enabling academia to improve their designs to a commercial quality level. The virtuous circle created is essential if we are to see RISC-V established as a mainstream, industry-wide capability.”

Academia has a lot to offer in hardware advancement. “Researchers in universities are developing innovative new software and hardware to push the limits of RISC-V innovation,” says Dave Miller, head of corporate communications at SiFive. “Many of the RISC-V projects in academia are focused on optimizing performance and energy efficiency for AI workloads, and are open source so the entire ecosystem can benefit. Researchers are also actively contributing to RISC-V working groups to share their knowledge and collaborate with industry participants. These working groups are split evenly between representatives from APAC, Europe, and North America, all working together towards common goals.”

In many cases, industry is willing to fund such projects. “It makes it easy to have research topics that don’t need to boil the ocean,” says Hand. “If you’re a PhD student and you have a great idea, you can go do it. It’s easy for an industry partner to say, ‘I’ll sponsor that. That’s an interesting thing, and I am not required to allocate ridiculous amounts of money into an open-ended project. It’s like I can see the connection of how that research will go into a commercial product later.'”

This feeds back into academia. “The academics have been jumping on board with OpenHW,” says Wohlrab. “By taking their cores and productizing them, they get a chip back that could be shipped in high volume. Then they can do their research on a real commercial product and can see if their idea would fly in real life. They get real numbers and can see real figures for the benefits of a new branch predictor.”

It can also have a long-term benefit for tools. “There are areas where they want to collaborate with us, especially around security,” says Kiran Vittal, executive director for alliances marketing management at Synopsys. “They are building RISC-V based sub-systems using open-source RISC-V processors, and then academia wants to look at not only the AI part, but the security part. There are post-doc students or PhD students looking into using our tools to verify or to implement whatever they’re doing on security.”

That provides an incentive for EDA to offer better tools for use in universities. “Although there has always been collaboration between universities and the industry, where industry provides the universities with access to EDA tools, IP cores, etc., there’s often a bit of a lag,” says Siemens’ Eide. “In many situations (especially outside of the core area of a particular project), universities have access to older versions of the commercial solutions. If you for instance look at a new grad’s resume, where you in the past would see references to old tech, now you see a lot of references to relatively sophisticated use of RISC-V.”

Moving Forward
This collaboration needs to keep pushing forward. “We had an initiative to create a standardized interface for accelerators,” says Wohlrab. “RISC-V International standardized how to add custom instructions in the ISA, but there was no standard for the hardware interface. So we built this. It was a cool discussion. There were people from Silicon Labs, people from NXP, people from Thales, plus several startups. They all came together and asked, ‘How can we make it future proof and put the accelerators inside?'”

The application space for RISC-V is changing. “The big inflection point is Linux and Android,” says Arteris’ Min. “Android already has some support, but when both Android and Linux are really supported, it will change the mobile apps processor game. The number of designs will proliferate. The number of high-end designs will explode. It will take the whole industry to enable that because RISC-V companies are not big enough to create this by themselves. All the RISC-V companies are partners, because we enable this high-end design at the processor levels.”

That would deepen the software community’s engagement. “An embedded software developer needs to understand the underlying hardware if they want to run Linux on a RISC-V processor that uses custom instructions/accelerators,” says Bluespec’s Hobbs. “To develop complex embedded hardware/software systems, both embedded software developers and embedded hardware developers must possess contextual understanding of the interoperability of hardware and software. The developer must understand how the customized processor is leveraging the custom instructions in hardware for Linux to efficiently manage and execute the accelerated workloads.”

This collaboration could reinvigorate research into EDA, as well. “With AI you can build predictive models,” says Hand. “Could that be used to identify the change effects from making an extension? What does that mean? There’s a cloud of influence — not directly gate-wise, because that immediately explodes — but perhaps based on test suites. ‘I know that something that touches that logic touches this downstream, which touches the rest of the design.’ That’s where AI plays a big role, and it is one of the interesting areas because in verification there are so many unknowns. When AI comes along, any guidance or any visibility that you can give is incredibly powerful. Even if it is not right 100% of the time, that’s okay, as long as it generates false negatives and not false positives.”

There is a great opportunity for EDA companies. “We collaborate with many of the open-source providers, with OpenHW group, with ETH in Zurich,” says Synopsys’ Vittal. “We want to promote our solutions when it comes to any processor design and you need standard tools like synthesis, place and route, simulation. But then there are also other kinds of unique solutions because RISC-V is so customizable, you can build your own custom instructions. You need something specific to verify these custom instructions and that’s why the Imperas golden models are important. We also collaborated with Bluespec to develop a verification methodology to take you through functional verification and debug.”

There are still some wrinkles to be worked out for customizations. “RISC-V gives us predictability,” says Hand. “We can create a compliance test suite, give you a processor optimization package if you’re on the implementation side. We can create analytics and testing solutions because we know what it’s going to look like. But for non-standard processors, it is effectively as a service, because everyone’s processor is a little bit different. The reason you see a lot of focus on the verification, from platform architecture exploration all the way through, is because if you change one little thing, such as an addressing mode, it impacts pretty much 100% of your processor verification. You’ve got to retest the whole processor. Most people aren’t set up like an Arm or an Intel with huge processor verification teams and the infrastructure, and so they need automation to do it for them.”

Conclusion
RISC-V has enabled the industry to create a framework for collaboration, which enables everyone to work together for selfish reasons. It is a symbiotic relationship that continues to build, and it is creating a wider sphere of influence over time.

“It’s unique in the modern era of semiconductor,” says Hand. “You have such a wide degree of collaboration, where you have processor manufacturers, the software industry leaders, EDA companies, all working on a common infrastructure.”

Related Reading
RISC-V Micro-Architectural Verification
Verifying a processor is much more than making sure the instructions work, but the industry is building from a limited knowledge base and few dedicated tools.
RISC-V Wants All Your Cores
It is not enough to want to dominate the world of CPUs. RISC-V has every core in its sights, and it’s starting to take steps to get there.


  • ✇Semiconductor Engineering
  • Turbocharging Cost-Conscious SoCs With Cache – John Min

Turbocharging Cost-Conscious SoCs With Cache

By: John Min
May 30, 2024, 09:04

Some design teams creating system-on-chip (SoC) devices are fortunate to work with the latest and greatest technology nodes coupled with a largely unconstrained budget for acquiring intellectual property (IP) blocks from trusted third-party vendors. However, many engineers are not so privileged. For every “spare no expense” project, there are a thousand “do the best you can with a limited budget” counterparts.

One way to squeeze the most performance out of lower-cost, earlier generation, mid-range processor and accelerator cores is to employ the judicious application of caches.

Cutting costs

A simplified example of a typical cost-conscious SoC scenario is illustrated in figure 1. Although the SoC may be composed of many IPs, only three are shown here for clarity.

Fig. 1: Portion of a cost-conscious, non-cache-coherent SoC. (Source: Arteris)

The predominant technology for connecting the IPs inside an SoC is network-on-chip (NoC) interconnect IP. This may be thought of as an IP that spans the entire device. The example shown in figure 1 may be assumed to reflect a non-cache-coherent scenario. In this case, any coherency requirements will be handled in software.

Let’s assume the SoC’s clock is running at 1GHz, and that a central processing unit (CPU) based on a reduced instruction set computer (RISC) architecture consumes a single clock cycle for a typical instruction. However, access to external DRAM memory can take anywhere between 100 and 200 processor clock cycles (we’ll average this out to 150 cycles for the purposes of this article). This means that if the CPU lacked a Level 1 (L1) cache and was connected directly to the DRAM via the NoC and DDR memory controller, each instruction would consume 150 processor clock cycles, resulting in a CPU utilization of only 1/150 = 0.67%.

This is why CPUs, along with some accelerators and other IPs, employ cache memories to increase processor utilization and application performance. The underlying premise upon which the cache concept is based is the principle of locality. The idea is that only a small amount of the main memory is being employed at any given time and that locations in that space are being accessed multiple times. Mainly due to loops, nested loops and subroutines, instructions and their associated data experience temporal, spatial and sequential locality. This means that once a block of instructions and data have been copied from the main memory into an IP’s cache, the IP will typically access them repeatedly.
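To see the principle of locality in action, consider the toy direct-mapped cache simulation below (a sketch with illustrative sizes, not the parameters of any particular CPU): once a loop's working set has been pulled into the cache, nearly every subsequent access is a hit.

    # Toy direct-mapped cache model (illustrative sizes only): a looping access
    # pattern yields a high hit rate once its working set is resident.

    def simulate_direct_mapped(addresses, num_lines=64, line_bytes=64):
        tags = [None] * num_lines            # one tag per cache line
        hits = 0
        for addr in addresses:
            block = addr // line_bytes       # which memory block the address is in
            line = block % num_lines         # which cache line that block maps to
            if tags[line] == block:
                hits += 1                    # hit: the block is already resident
            else:
                tags[line] = block           # miss: fetch the block into the line
        return hits / len(addresses)

    # A tight loop over a 2KB working set, executed 100 times (word-sized accesses).
    loop_body = list(range(0, 2048, 4))
    trace = loop_body * 100
    print(f"hit rate: {simulate_direct_mapped(trace):.2%}")   # ~99.9% after warm-up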

Today’s high-end CPU IPs usually have a minimum of a Level 1 (L1) and Level 2 (L2) cache, and they often have a Level 3 (L3) cache. Also, some accelerator IPs, like graphics processing units (GPUs) often have their own internal caches. However, these latest-generation high-end IPs often have a 5X to 10X price compared to their previous-generation mid-range counterparts. As a result, as illustrated in figure 1, the CPU in a cost-conscious SoC may come equipped with only an L1 cache.

Let’s consider the CPU and its L1 cache in a little more depth. When the CPU requests something in its cache, the result is called a cache hit. Since the L1 cache typically runs at the same speed as the processor core, a cache hit will be processed in a single processor clock cycle. By comparison, if the requested data is not in the cache, the result, called a cache miss, will require access to the main memory, which will consume 150 processor clock cycles.

Now consider running 1,000,000 instructions. If the cache were large enough to contain the whole program, then this would consume only 1,000,000 clock cycles, resulting in a CPU efficiency of 1,000,000 instructions/1,000,000 clock cycles = 100%.

Unfortunately, the L1 cache in a mid-range CPU will typically be only 16KB to 64KB in size. If we assume a 95% cache hit rate, then 950,000 of our 1,000,000 instructions will take one processor clock cycle. The remaining 50,000 instructions will each consume 150 clock cycles. Thus, the CPU efficiency in this case can be calculated as 1,000,000/((950,000 * 1) + (50,000 * 150)) = ~12%.
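The arithmetic above can be wrapped in a small helper (a sketch using the cycle counts assumed in this article: one cycle per L1 hit, 150 cycles per DRAM access; the function name is illustrative):

    def cpu_efficiency(instructions, hit_rate, hit_cycles=1, miss_cycles=150):
        """Instructions-per-cycle efficiency for a CPU with a single cache level."""
        hits = instructions * hit_rate
        misses = instructions - hits
        total_cycles = hits * hit_cycles + misses * miss_cycles
        return instructions / total_cycles

    N = 1_000_000
    print(f"no cache:        {cpu_efficiency(N, 0.00):.2%}")   # ~0.67%
    print(f"perfect L1:      {cpu_efficiency(N, 1.00):.2%}")   # 100%
    print(f"95% L1 hit rate: {cpu_efficiency(N, 0.95):.2%}")   # ~12%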

Turbocharging performance

A cost-effective way of turbocharging the performance of a cost-conscious SoC is to add cache IPs. For example, CodaCache from Arteris is a configurable, standalone non-coherent cache IP. Each CodaCache instance can be up to 8MB in size, and multiple copies can be instantiated in the same SoC, as demonstrated in figure 2.

Fig. 2: Portion of a turbocharged, non-cache-coherent SoC. (Source: Arteris)

It is not the intention of this article to suggest that every IP should be equipped with a CodaCache. Figure 2 is intended only to provide examples of potential CodaCache deployments.

If a CodaCache instance is associated with an IP, it’s known as a dedicated cache (DC). Alternatively, if a CodaCache instance is associated with a DDR memory controller, it’s referred to as a last-level cache (LLC). A DC will accelerate the performance of the IP with which it is associated, while an LLC will enhance the performance of the entire SoC.

As an example of the type of performance boost we might expect, consider the CPU shown in figure 2. Let’s assume the CodaCache DC instance associated with this IP is running at half the processor speed and that any accesses to this cache consume 20 processor clock cycles. If we also assume a 95% cache hit rate for this DC, then—for 1,000,000 instructions—our overall CPU+L1+DC efficiency can be calculated as 1,000,000/((950,000 * 1) + (47,500 * 20) + (2,500 * 150)) = ~44%. That’s a performance boost of ~273%!
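Extending the same back-of-the-envelope model to the dedicated cache case (again a sketch under this article's assumptions: a 95% L1 hit rate, a 95% DC hit rate on L1 misses, 20 cycles per DC hit, and 150 cycles to DRAM) roughly reproduces these figures, with small differences due to rounding:

    def cpu_efficiency_two_level(instructions, l1_hit_rate, dc_hit_rate,
                                 l1_cycles=1, dc_cycles=20, dram_cycles=150):
        """Efficiency with an L1 cache backed by a dedicated CodaCache-style cache."""
        l1_hits = instructions * l1_hit_rate
        l1_misses = instructions - l1_hits
        dc_hits = l1_misses * dc_hit_rate
        dram_accesses = l1_misses - dc_hits
        total_cycles = (l1_hits * l1_cycles + dc_hits * dc_cycles
                        + dram_accesses * dram_cycles)
        return instructions / total_cycles

    N = 1_000_000
    l1_only = N / (950_000 * 1 + 50_000 * 150)            # ~12%, as computed earlier
    l1_plus_dc = cpu_efficiency_two_level(N, 0.95, 0.95)  # ~44%
    print(f"L1 only: {l1_only:.1%}, L1 + DC: {l1_plus_dc:.1%}, "
          f"boost: {l1_plus_dc / l1_only - 1:.0%}")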

Conclusion

In the past, embedded programmers relished the challenge of squeezing the highest performance possible out of small processors with low clock speeds and limited memory resources. In fact, it was common for computer magazines to issue challenges to their readers along the lines of, “Who can perform task X on processor Y in the minimum number of clock cycles using the smallest amount of memory?”

Today, many SoC developers enjoy the challenge of squeezing the highest performance possible out of their designs, especially if they are constrained to use lower-performing mid-range IPs. Deploying CodaCache IPs as dedicated and last-level caches provides an affordable way for engineers to turbocharge their cost-conscious SoCs. To learn more about CodaCache from Arteris, visit arteris.com.


  • ✇Semiconductor Engineering
  • AI-Powered Data Analytics To Revolutionize The Semiconductor Industry – Reela Samuel

AI-Powered Data Analytics To Revolutionize The Semiconductor Industry

May 30, 2024, 09:03

In the age where data reigns supreme, the semiconductor industry stands on the cusp of revolutionary change, redefining complexity and productivity through a lens crafted by artificial intelligence (AI). The intersection of AI and the semiconductor industry is not merely an emerging trend—it is the fulcrum upon which the next generation of technological innovation balances. Semiconductor companies are facing a critical juncture where the burgeoning complexity of chip designs is outpacing the growth of skilled human resources. This is where the infusion of AI-powered data analytics catalyzes a seismic shift in the industry’s approach to efficiency and productivity.

AI in semiconductor design: A revolution beckons

With technological leaps like 5G, AI, and autonomous vehicles driving chip demand, the status quo for semiconductor design is no longer sustainable. Traditional design methodologies fall short in addressing the challenges presented by these new technologies, and the need for a new approach is non-negotiable. AI, with its capacity to process massive datasets and learn from patterns, offers a revolutionary solution. Fed with vast amounts of electronic design automation (EDA) data, machine learning algorithms can pave an efficient path through design complexities.

Navigating the complexity of EDA data with AI

The core of semiconductor design lies in the complexity of EDA data, which is often disparate, unstructured, and immensely intricate, existing in various formats ranging from simple text to sophisticated binary machine-readable data. AI presents a beacon of hope in taming this beast by enabling the industry to store, process, and analyze data with unprecedented efficiency.

AI-enabled data analytics offer a path through the labyrinth of EDA complexity, providing a scalable and sophisticated data storage and processing solution. By harnessing AI’s capabilities, the semiconductor industry can dissect, organize, and distill data into actionable insights, elevating the efficacy of chip design processes.

Leveraging AI in design excellence

Informed decisions are the cornerstone of successful chip design, and the fusion of AI-driven analytics with semiconductor engineering marks a watershed moment in the industry. AI’s ability to comprehend and process unstructured data at scale enables a deeper understanding of design challenges, yielding solutions that optimize SoCs’ power, performance, and area (PPA).

AI models, fed by fragmented data points from EDA compilation, can predict bottlenecks, performance constraints, or power inefficiencies before they impede the design process. This foresight empowers engineers with informed design decisions, fostering an efficient and anticipatory design culture.

Reimagining engineering team efficiency

One of the most significant roadblocks in the semiconductor industry has been aligning designer resources with the exponential growth of chip demand. As designs become complex, they evolve into multifaceted systems on chips (SoCs) housing myriad hierarchical blocks that accumulate vast amounts of data throughout the iterative development cycle. When harnessed effectively, this data possesses untapped potential to elevate the efficiency of engineering teams.

Consolidating data review into a systematic, knowledge-driven process paves the way for accelerated design closure and seamless knowledge transfer between projects. This refined approach can significantly enhance the productivity of engineering teams, a crucial factor if the semiconductor industry is to meet the burgeoning chip demand without exponentially expanding design teams.

Ensuring a systemic AI integration

For the full potential of AI to be realized, a systemic integration across the semiconductor ecosystem is paramount. This integration spans the collection and storage of data and the development of AI models attuned to the industry’s specific needs. Robust AI infrastructure, equipped to handle the diverse data formats, is the cornerstone of this integration. AI models must complement it and be fine-tuned to the peculiarities of semiconductor design, ensuring that the insights they produce are accurate and actionable.

Cultivating AI competencies within engineering teams

As AI plays a central role in the semiconductor industry, it highlights the need for AI competencies within engineering teams. Upskilling the workforce to leverage AI tools and platforms is a critical step toward a harmonized AI ecosystem. This journey toward proficiency entails familiarization with AI concepts and a collaborative approach that blends domain expertise with AI acumen. Engineering teams adept at harnessing AI can unlock its full potential and become pioneers of innovation in the semiconductor landscape.

Intelligent system design

At Cadence, the conception of technological ecosystems is encapsulated within a framework of three concentric circles—a model neatly epitomized by the sophistication of an electric vehicle. The first circle represents the data used by the car; the second circle represents the physical car, including the mechanical, electrical, hardware, and software components. The third circle represents the silicon that powers the entire system.

The Cadence.AI Platform operates at the vanguard of pervasive intelligence, harnessing data and AI-driven analytics to propel system and silicon design to unprecedented levels of excellence. By deploying Cadence.AI, we converge our computational software innovations, from Cadence’s Verisium AI-Driven Verification Platform to the Cadence Cerebrus Intelligent Chip Explorer’s AI-driven implementation.

The AI-driven future of semiconductor innovation

The implications are far-reaching as the semiconductor industry charts its course into an AI-driven era. AI promises to redefine design efficiency, expedite time to market, and pioneer new frontiers in chip innovation. The path forward demands a concerted effort to integrate AI seamlessly into the semiconductor fabric, cultivating an ecosystem primed for the challenges and opportunities ahead.

Semiconductor firms that champion AI adoption will set the standard for the industry’s evolution, carving a niche for themselves as pioneers of a new chip design and production paradigm. The future of semiconductor innovation is undoubtedly AI, and the time to embrace this transformative force is now.

Cadence is already at the forefront of this AI-led revolution. Our Cadence.AI Platform is a testament to AI’s power in redefining systems and silicon design. By enabling the concurrent creation of multiple designs, optimizing team productivity, and pioneering leaner design approaches, Cadence.AI illustrates the true potential of AI in semiconductor innovation.

The harmonized suite of our AI tools equips our customers with the ability to employ AI-driven optimization and debugging, facilitating the concurrent creation of multiple designs while optimizing the productivity of engineering teams. It empowers a leaner workforce to achieve more, elevating their capability to generate a spectrum of designs in parallel with unmatched efficiency and precision. The result is a new frontier in design excellence, where AI acts as a co-pilot to the engineering team, steering the way to unparalleled chip performance. Learn more about the power of AI to forge intelligent designs.


  • ✇Semiconductor Engineering
  • AI For Data Management – Adam Kovac

AI For Data Management

May 30, 2024, 09:03

Data management is becoming a significant new challenge for the chip industry, as well as a brand new opportunity, as the amount of data collected at every step of design through manufacturing continues to grow.

Exacerbating the problem is the rising complexity of designs, many of which are highly customized and domain-specific at the leading edge, as well as increasing demands for reliability and traceability. There also is a growing focus on chiplets developed using different processes, including some from different foundries, and new materials such as glass substrates and ruthenium interconnects. On the design side, EDA and verification tools can generate terabytes of data on a weekly or even a daily basis, unlike in the past when this was largely done on a per-project basis.

While more data can be used to provide insights into processes and enable better designs, it’s an ongoing challenge to manage the current volumes being generated. The entire industry must rethink some well-proven methodologies and processes, as well as invest in a variety of new tools and approaches. At the same time, these changes are generating concern in an industry used to proceeding cautiously, one step at a time, based on silicon- and field-proven strategies. Increasingly, AI/ML is being added into design tools to identify anomalies and patterns in large data sets, and many of those tools are updated regularly as algorithms improve and new features are added, making it difficult to know exactly when and where to invest, which data to focus on, and with whom to share it.
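
As a rough illustration of what such anomaly screening can look like (a generic sketch, not any particular vendor's tool; the metrics, values, and thresholds are invented), an off-the-shelf outlier detector can flag unusual runs from a table of per-run metrics:

    # Hypothetical sketch: flag anomalous EDA runs from per-run metrics.
    # Column meanings and values are invented for illustration only.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Each row: [runtime_hours, peak_mem_gb, worst_negative_slack_ps, drc_violations]
    runs = np.array([
        [4.1, 210, -12.0,  3],
        [4.3, 215, -10.5,  5],
        [4.0, 205, -11.8,  2],
        [9.7, 480, -95.0, 41],   # an outlier run
        [4.2, 212, -12.3,  4],
    ])

    model = IsolationForest(contamination=0.2, random_state=0).fit(runs)
    flags = model.predict(runs)          # -1 = anomaly, 1 = normal
    for row, flag in zip(runs, flags):
        if flag == -1:
            print("anomalous run:", row)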

“Every company has its own design flow, and almost every company has its own methodology around harvesting that data, or best practices about what reports should or should not be written out at what point,” said Rob Knoth, product management director in Cadence’s Digital & Signoff group. “There’s a death by 1,000 cuts that can happen in terms of just generating titanic volumes of data because, in general, disk space is cheap. People don’t think about it a lot, and they’ll just keep generating reports. The problem is that just because you’re generating reports doesn’t mean you’re using them.”

Fig. 1: Rising design complexity is driving increased need for data management. Source: IEEE Rising Stars 2022/Cadence

As with any problem in chip design, there is opportunity in figuring out a path forward. “You can always just not use the data, and then you’re back where you started,” said Tony Chan Carusone, CTO at Alphawave Semi. “The reason it becomes a problem for organizations is because they haven’t architected things from the beginning to be scalable, and therefore, to be able to handle all this data. Now, there’s an opportunity to leverage data, and it’s a different way. So it’s disruptive because you have to tear things apart, from re-architecting systems and processes to how you collect and store data, and organize it in order to take advantage of the opportunity.”

Buckets of data, buckets of problems
The challenges that come with this influx of data can be divided into three buckets, said Jim Schultz, senior staff product manager at Synopsys. The first is figuring out what information is actually critical to keep. “If you make a run, designers tend to save that run because if they need to do a follow-up run, they have some data there, and they may go, ‘Okay, well, what’s the runtime? How long did that run take? Because my manager is going to ask me what I think the runtime is going to be on the next project or the next iteration of the block.’ While that data may not be necessary, designers and engineers have a tendency to hang onto it anyway, just in case.”

The second challenge is that once the data starts to pour in, it doesn’t stop, raising questions about how to manage collection. And third, once the data is collected, how can it be put to best use?

“Data analytics have been around, with other types of companies exploring different types of data analytics, but the difference is those can be very generic solutions,” said Schultz. “What we need for our industry is going to be very specific data analytics. If I have a timing issue, I want you to help me pinpoint what the cause of that timing violation is. That’s very specific to what we do in EDA. When we talk about who is cutting through the noise, we don’t want data that’s just presented. We want the data that is what the designer most cares about.”

Data security
The sheer number of tools being used and companies and people involved along the design pathway raises another challenge — security.

“There’s a lot of thought and investment going into the security aspect of data, and just as much as the problem of what data to save and store is the type of security we have to have without hindering the user day-to-day,” said Simon Rance, director of product management at Keysight. “That’s becoming a bigger challenge. Things like the CHIPS Act and the geopolitical scenarios we have at the moment are compounding that problem because a lot of the companies that used to create all these devices by themselves are having to collaborate, even with companies in different regions of the globe.”

This requires a balancing act. “It’s almost like a recording studio where you have all these knobs and dials to fine tune it, to make sure we have security of the data,” said Rance. “But we’re also able to get the job done as smoothly and as easily as we can.”

Further complicating the security aspect is that designing chips is not a one-man job. As leading-edge chips become increasingly complex and heterogeneous, they can involve hundreds of people in multiple companies.

“An important thing to consider when you’re talking about big data and analytics is what you’re going to share and with whom you’re going to share it,” said Synopsys’ Schultz. “In particular, when you start bringing in and linking data from different sources, if you start bringing in data related to silicon performance, you don’t want everybody to have access to that data. So the whole security protocol is important.”

Even the mundane matters — having a ton of data makes it likely, at some point, that data will be moved.

“The more places the data has to be transferred to, the more delays,” said Rance. “The bigger the data set, the longer it takes to go from A to B. For example, a design team in the U.S. may be designing during the day. Then, another team in Singapore or Japan will pick up on that design in their time zone, but they’re across the world. So you’re going to have to sync the data back and forth between these kinds of design sites. The bigger the data, the harder to sync.”
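
A back-of-the-envelope calculation makes the sync point concrete (the data sizes and link speed below are assumed for illustration, not measurements from any design site):

    # Rough illustration: time to sync a design database between sites.
    # Sizes, link speed, and efficiency are assumed values.
    def sync_hours(size_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
        bits = size_tb * 8e12                          # terabytes -> bits
        seconds = bits / (link_gbps * 1e9 * efficiency)
        return seconds / 3600

    for size in (0.1, 1.0, 10.0):                      # TB
        print(f"{size:>5} TB over a 1 Gb/s WAN: ~{sync_hours(size, 1.0):.1f} h")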

Solutions
The first step toward solving the issue of too much data is figuring out what data is actually needed. Rance said his team has found success using smart algorithms that help figure out which data is essential, which in turn can help optimize storage and transfer times.
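
Rance did not detail those algorithms, but the basic idea can be sketched as a retention filter that scores each run artifact on simple signals such as age, size, and downstream use (all fields and cutoffs here are hypothetical):

    # Hypothetical retention filter for EDA run artifacts.
    # The scoring signals and cutoffs are invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class Artifact:
        path: str
        age_days: int
        size_gb: float
        referenced_downstream: bool   # e.g., cited in a signoff report

    def keep(a: Artifact) -> bool:
        if a.referenced_downstream:
            return True                        # traceability requires keeping it
        if a.age_days > 90 and a.size_gb > 10:
            return False                       # old, large, and unused: drop
        return a.age_days <= 30                # keep anything recent

    artifacts = [
        Artifact("blockA/run_0412/timing.rpt", 12, 0.3, False),
        Artifact("blockA/run_0111/full_dump.db", 140, 85.0, False),
        Artifact("top/run_0220/signoff.rpt", 120, 1.2, True),
    ]
    for a in artifacts:
        print("keep" if keep(a) else "drop", a.path)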

Less technical problems can rear their heads, as well. Gina Jacobs, head of global communications and brand marketing at Arteris, said that engineers who use a set methodology — particularly those who are used to working on a problem by themselves and “brute forcing” a solution — also can find themselves overwhelmed by data.

“Engineers and designers can also switch jobs, taking with them institutional knowledge,” Jacobs said. “But all three problems can be solved with a single solution — having data stored in a standardized way that is easily accessible and sortable. It’s about taking data and requirements and specifications in different forms and then having it in the one place so that the different teams have access to it, and then being able to make changes so there is a single source of truth.”

Here, EDA design and data management tools are increasingly relying on artificial intelligence to help. Schultz forecasted a future where generative AI will touch every facet of chip development. “Along with that is the advanced data analytics that is able to mine all of that data you’ve been collecting, going beyond the simple things that people have been doing, like predicting how long runtime is going to be or getting an idea of what the performance is going to be,” he said. “Tools are going to be able to deal with all of that data and recognize trends much faster.”

Those all-encompassing AI tools, capable of complex analysis, are still years away, however. Cadence’s Knoth said he’s already encountered clients that are reluctant to bring AI into the mix due to fears over the costs involved in disk space, compute resources, and licenses. Others have been a bit more open-minded.

“Initially, AI can use a lot of processors to generate a lot of data because it’s doing a lot of things in parallel when it’s doing the inferencing, but it usually gets to the result faster and more predictably,” he said. So while a machine learning algorithm may generate even more vast amounts of data, on top of the piles currently available, “a good machine learning algorithm could be watching and smartly killing or restarting jobs where needed.”

As for the humans who are still an essential component to chip design, Alphawave’s Carusone said hardware engineers should take a page from lessons learned years ago from their counterparts in the software development world.

These include:

  • Having an organized and automated way to collect data, file it in a repository, and not do anything manually;
  • Developing ways to run verification and lab testing and everything in between in parallel, but with the data organized in a way that can be mined; and
  • Creating methods for rigorously checking in and out of different test cases that you want to consider.

“The big thing is you’ve got all this data collected, but then what is each of those files, each of those collections of data?” said Carusone. “What does that correspond to? What test conditions was that collected in? The software community dealt with that a while ago, and the hardware community also needs to have this under its belt, taking it to the next level and recognizing we really need to be able to do this en masse. We need to be able to have dozens of people work in parallel, collecting data and have it all on there. We can test a big collection of our designs in the lab without anyone having to touch a thing, and then also try refinements of the firmware, scale them out, then have all the data come in and be analyzed. Being able to have all that done in an automated way lets you track down and fix problems a lot more quickly.”
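
One illustrative way to capture those test conditions (a minimal sketch, not a description of any company's actual flow) is to refuse to store a raw measurement file without a machine-readable manifest written next to it:

    # Illustrative sketch: write a metadata manifest alongside every captured data
    # file so results can be mined later. Field names are hypothetical.
    import json, hashlib, pathlib, datetime

    def record_capture(data_file: str, conditions: dict, manifest_dir: str = "manifests"):
        payload = pathlib.Path(data_file).read_bytes()
        manifest = {
            "file": data_file,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "captured_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "conditions": conditions,          # temperature, voltage, firmware rev, ...
        }
        out = pathlib.Path(manifest_dir)
        out.mkdir(exist_ok=True)
        (out / (pathlib.Path(data_file).name + ".json")).write_text(json.dumps(manifest, indent=2))

    # Example usage (hypothetical file and conditions):
    # record_capture("lab/eye_scan_003.csv",
    #                {"temp_c": 85, "vdd": 0.75, "firmware": "rc2", "lot": "A17"})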

Conclusion
The influx of new tools used to analyze and test chip designs has increased productivity, but those designs come with additional considerations. Institutions and individual engineers and designers have never had access to so much data, but that data is of limited value if it’s not used effectively.

Strategies to properly store and order that data are essential. Some powerful tools are already in place to help do that, and the AI revolution promises to make even more powerful resources available to quickly cut down on the time needed to run tests and analyze the results.

For now, handling all that data remains a tricky balance, according to Cadence’s Knoth. “If this was an easy problem, it wouldn’t be a problem. Being able to communicate effectively, hierarchically — not just from a people management perspective, but also hierarchically from a chip and project management perspective — is difficult. The teams that do this well invest resources into that process, specifically the communication of top-down tightening of budgets or top-down floorplan constraints. These are important to think about because every engineer is looking at chip-level timing reports, but the problem that they’re trying to solve might not ever be visible. But if they have a report that says, ‘Here is your view of what your problems are to solve,’ you can make some very effective work.”

Further Reading
EDA Pushes Deeper Into AI
AI is both evolutionary and revolutionary, making it difficult to assess where and how it will be used, and what problems may crop up.
Optimizing EDA Cloud Hardware And Workloads
Algorithms written for GPUs can slice simulation time from weeks to hours, but not everything is optimized or benefits equally.

The post AI For Data Management appeared first on Semiconductor Engineering.

How do you decide to bound the player’s space in a 3D game? That is, when choosing between invisible walls, debris fields, cliffs, or simply an endless featureless plane/ocean, why pick one over the other?

30 May 2024 at 18:00

When we're trying to craft an experience for the player, it's almost never one big element that sets the experience. Instead, it's the combination of many small details and decisions that all pull in the same direction and combine into a single, cohesive experience. For example, let's say we wanted to convey that the player's character is feeling dehydrated while walking through a 3D space. What could we do to make the player feel that sensation? We might...

  • change the walk cycle so that the player is stumbling slowly instead of walking normally
  • have ambient audio of the character with ragged breathing or panting
  • have the environment be very parched - lots of brown colors. Place lots of rocks and sand, with any plant being withered or cracked with no leaves
  • add a heat shimmer post-processing effect to the screen
  • add a lens flare near the sun
  • add specks of dust and sand to any wind effects on screen
  • add a clothing shader to show sand/lightly colored dirt building up on the character's clothing

Now... in such a situation, what kind of bounding volumes would help convey this experience? Walking through the bottom of an empty ravine or canyon with high walls might do the trick. Walking across a narrow land bridge over a desert gorge could do as well. However, any water-related visuals would likely take away from the feeling of dehydration. Walking along the beach, even if it's a desert beach, just doesn't convey the same feeling as walking through a desert with sand as far as the eye can see.

Ultimately, all of the elements of environment art and level design are tools for us to craft the kind of experience we want. The choice of invisible wall, debris field, cliff, crate, railing, or whatever is up to the designers and artists trying to convey a specific experience. Each individual element only moves the needle a little bit, but having all of them together at once moves the needle a lot more.


  • ✇Semiconductor Engineering
  • Blog Review: May 1 (Jesse Allen)

Blog Review: May 1

1 May 2024 at 09:01

Cadence’s Vatsal Patel stresses the importance of having testing and training capabilities for high-bandwidth memory to prevent the entire SoC from becoming useless and points to key HBM DRAM test instructions through IEEE 1500.

In a podcast, Siemens’ Stephen V. Chavez chats with Anaya Vardya of American Standard Circuits about the growing significance of high density interconnect and Ultra HDI technologies, which enable denser component placement and increased signal integrity compared to traditional PCB designs.

Synopsys’ Ian Land and Randy Fish find that silicon lifecycle management is increasingly being used on chips that target the aerospace and government market to ensure system health and longevity.

Arm’s Hristo Belchev looks at how to enable testing of system designs using the Memory Partitioning and Monitoring (MPAM) Arm architecture supplement, which allows privileged software to partition caches, memory controllers and interconnects on the hardware level.

Keysight’s Jonathon Wright considers where generative AI can add value in software testing by proposing a wide range of scenarios and improving communication between different stakeholders.

Ansys’ Laura Carter checks out how simulation is used to reduce the risks to drivers during a crash in stock car racing.

SEMI’s Maria Daniela Perez chats with Owen J. Guy of Swansea University about the challenge of onboarding talent within the microelectronics industry and the importance of ensuring students receive hands-on experience and exposure to real-world applications.

And don’t miss the blogs featured in the latest Systems & Design newsletter:

Technology Editor Brian Bailey suggests that although it is great to see the DAC conference come back to life, EDA companies need to do something about the show floor.

Siemens’ John Ferguson shows how to glean useful information well before all the details of an assembly are known.

Axiomise’ Ashish Darbari explains how formal verification can help improve chips.

Arteris’ Frank Schirrmeister tracks the race to centralized computing in automotive.

Synopsys’ Andrew Appleby explores the co-optimization of foundation IP and design flows for new transistors.

Cadence’s Anika Sunda looks at controlling the access to physical memory addresses.

Keysight’s Ben Coffin digs into how AI will be used in just about every subsystem of 6G networks.

The post Blog Review: May 1 appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Design Tool Think Tank Required (Brian Bailey)

Design Tool Think Tank Required

29 February 2024 at 09:10

When I was in the EDA industry as a technologist, there were three main parts to my role. The first was to tell customers about new technologies being developed and tool extensions that would be appearing in the next release. These were features they might find beneficial both in the projects they were undertaking today, and even more so, would apply to future projects. Second, I would try and find out what new issues they were finding, or where the tools were not delivering the capabilities they required. This would feed into tool development planning. And finally, I would take those features selected by the marketing team for implementation and try to work out how best to implement them if it wasn’t obvious to the development teams.

By far the most difficult task out of the three was getting new requirements from customers. Most engineers have their heads down, concentrating on getting their latest chip out. When you ask them about new features, the only things they offer are their current pain points. These usually involve incremental features, bugs where the workaround is disliked, or insufficient performance.

Thirty years ago, when I first started doing that role, there were dedicated methodology groups within the larger companies whose job it was to develop flows and methodologies for future projects. This would appear to be the ideal people to ask, but in many cases they were so disconnected from the development team that what they asked for would never actually be used by the development team. These groups were idealists who wanted to instill revolutionary changes, whereas the development teams wanted evolutionary tools. The furthest many of those developments went was pilot projects that never became mainstream.

It seems as if the industry needs a better path to get requirements into the EDA companies. This used to be defined by the ITRS, which would look forward and project the new capabilities that would be required and the timeframes for them. That no longer exists. Today, standards are being driven by semiconductor companies. This is a change from the past, where we used to see the EDA companies driving the developments done within groups like Accellera. When I look at their recent undertakings, most of them are driven by end users.

Getting a standards group started today happens fairly late in the process. It implies an immediate need, but does not really allow solutions to be developed ahead of time. It appears that a think tank is needed, where the industry can discuss issues and problems for which new tool development is required. That can then be built into the EDA roadmaps so that the technology becomes available when it is needed.

One such area is power analysis. I have been writing stories about how important power and energy is becoming and may indeed soon become the limiter for many of the most complex designs. Some of the questions I always ask are:

  • What tools are being developed for doing power analysis of software?
  • How can you calculate the energy consumed for a given function?
  • How can users optimize a design for power or energy?

I rarely get straight answers to any of these questions. Instead, I’m often given vague ideas about how a user could do this in a manual fashion given the tools currently available.
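
There is no standard flow behind these questions today, but the underlying arithmetic is simple. A first-order estimate multiplies activity counts by assumed per-operation energies (the numbers below are placeholders, not characterized silicon data):

    # First-order sketch: estimate energy for a software function from operation
    # counts and per-operation energy. All energy numbers are placeholders.
    PJ_PER_OP = {              # picojoules per operation (assumed values)
        "int_alu":     1.0,
        "fp_mul":      4.0,
        "l1_access":   5.0,
        "dram_access": 640.0,
    }

    def function_energy_uj(op_counts: dict) -> float:
        """Energy in microjoules: sum over ops of count * pJ/op."""
        pj = sum(PJ_PER_OP[op] * n for op, n in op_counts.items())
        return pj * 1e-6

    # Example: a hypothetical filter kernel's profile
    profile = {"int_alu": 2_000_000, "fp_mul": 1_000_000,
               "l1_access": 1_500_000, "dram_access": 20_000}
    print(f"~{function_energy_uj(profile):.1f} uJ per call")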

I was beginning to think I was barking up the wrong tree and perhaps these were not legitimate concerns. My sanity was restored by a comment on one of my recent power related stories. Allan Cantle, OCP HPC Sub-Project Leader at Open Compute Project Foundation, wrote: “While it’s great to see articles like this highlight the need for us all to focus on energy centric computing, the sad news is that our tools don’t report energy in any obvious way to show the stupid architectural mistakes we often make from an energy consumption perspective. We are solving all the problems from a bottoms-up perspective by bringing things closer together. While that does bring tremendous energy efficiency benefits, it also creates massively increasing energy density. There is so much low-hanging fruit from a top-down system architecture approach that the industry is missing because we need to think outside the box and across our silos.”

Cantle went on to say: “A trivial improvement in tools that report energy consumption as a first-class metric will make it far easier for us to understand and rectify the mistakes we make as we build new energy-centric, domain-specific computers for each application. Alternatively, the silicon gods that rule our industry would be wise to take a step backward and think about the problem from a systems level perspective.”

I couldn’t agree more, and I find it frustrating that no EDA company seems to be listening. I am sure part of the problem is that the large customers are working on their own internal solutions, and they feel these will provide them with a competitive advantage. Only when it becomes clear that all of their competitors have similar solutions, and that they no longer gain an advantage from them, will they look to transfer those solutions to the EDA companies so they do not have to maintain them. The EDA companies will then fight to make the solution they have acquired the standard. It all takes a long time.

In partial defense of the EDA companies, they are facing so many new issues these days that they are spread very thin dealing with new nodes, 2.5D, 3D, shift left, multi-physics, AI algorithms – to name just a few. They already spend more on R&D than most technology companies as a percentage of revenue.

Perhaps Accellera could start to include discussion forums in events like DVCon. This would allow for an open discussion about the problems they need to have solved. Perhaps they could start to produce the EDA equivalent of the old ITRS roadmap. It sure would save a lot of time and energy (pun intended).

The post Design Tool Think Tank Required appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • 2.5D Integration: Big Chip Or Small PCB? (Brian Bailey)

2.5D Integration: Big Chip Or Small PCB?

29 February 2024 at 09:08

Defining whether a 2.5D device is a printed circuit board shrunk down to fit into a package, or is a chip that extends beyond the limits of a single die, may seem like hair-splitting semantics, but it can have significant consequences for the overall success of a design.

Planar chips always have been limited by the size of the reticle, which is about 858mm². Beyond that, yield issues make the silicon uneconomical. For years, that has limited the number of features that could be crammed onto a planar substrate. Any additional features would need to be designed into additional chips and connected with a printed circuit board (PCB).

The advent of 2.5D packaging technology has opened up a whole new axis for expansion, allowing multiple chiplets to be interconnected inside an advanced package. But the starting point for this packaged design can have a big impact on how the various components are assembled, who is involved, and which tools are deployed and when.

There are several reasons why 2.5D is gaining ground today. One is cost. “If you can build smaller chips, or chiplets, and those chiplets have been designed and optimized to be integrated into a package, it can make the whole thing smaller,” says Tony Mastroianni, advanced packaging solutions director at Siemens Digital Industries Software. “And because the yield is much higher, that has a dramatic impact on cost. Rather than having 50% or below yield for die-sized chips, you can get that up into the 90% range.”
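
That yield argument can be made concrete with the classic Poisson yield approximation, Y = exp(-A * D0). The sketch below assumes a defect density and compares one large die against chiplets a quarter of its size (the numbers are illustrative, not foundry data):

    # Illustrative Poisson yield model: Y = exp(-area * defect_density).
    # The defect density is an assumed value, not foundry data.
    import math

    D0 = 0.1            # defects per cm^2 (assumed)
    big_die_cm2 = 8.0   # ~800 mm^2 die

    def die_yield(area_cm2: float, d0: float = D0) -> float:
        return math.exp(-area_cm2 * d0)

    big = die_yield(big_die_cm2)
    chiplet = die_yield(big_die_cm2 / 4)
    print(f"monolithic 800 mm^2 die yield : {big:.0%}")        # ~45%
    print(f"each 200 mm^2 chiplet yield   : {chiplet:.0%}")    # ~82%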

Interconnecting chips using a PCB also limits performance. “Historically, we had chips packaged separately, put on the PCB, and connected with some routing,” says Ramin Farjadrad, CEO and co-founder of Eliyan. “The problems people started to face were twofold. One was that the bandwidth between these chips was limited by going through the PCB, and then a limited number of balls on the package limited the connectivity between these chips.”

The key difference with 2.5D compared to a PCB is that 2.5D uses chip dimensions. There are much finer-grain wires, and various components can be packed much closer together on an interposer or in a package than on a board. For those reasons, wires can be shorter, there can be more of them, and bandwidth is increased.

That impacts performance at multiple levels. “Since they are so close, you don’t have the long transport RC or LC delays, so it’s much faster,” says Siemens’ Mastroianni. “You don’t need big drivers on a chip to drive long traces over the board, so you have lower power. You get orders of magnitude better performance — and lower power. A common metric is to talk about pico joules per bit. The amount of energy it takes to move bits makes 2.5D compelling.”
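
As a worked example of that metric (the energy-per-bit figures are illustrative, not measurements of any specific interface), link power falls directly out of energy per bit times bit rate:

    # Illustrative pJ/bit arithmetic: link power = energy per bit * bit rate.
    # Energy-per-bit values are assumed for illustration.
    def link_power_w(pj_per_bit: float, bandwidth_tbps: float) -> float:
        return pj_per_bit * 1e-12 * bandwidth_tbps * 1e12

    for name, pj in [("on-package die-to-die", 0.5), ("off-package SerDes over PCB", 5.0)]:
        print(f"{name}: {link_power_w(pj, 1.0):.1f} W per Tb/s")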

Still, the mindset affects the initial design concept, and that has repercussions throughout the flow. “If you talk to a die designer, they’re probably going to say that it is just a big chip,” says John Park, product management group director in the Custom IC & PCB Group at Cadence. “But if you talk to a package designer, or a board designer, they’re going to say it’s basically a tiny PCB.”

Who is right? “The internal organizational structure within the company often decides how this is approached,” says Marc Swinnen, director of product marketing at Ansys. “Longer term, you want to make sure that your company is structured to match the physics and not try to match the physics to your company.”

What is clear is that nothing is certain. “The digital world was very regular in that every two years we got a new node that was half size,” says Cadence’s Park. “There would be some new requirements, but it was very evolutionary. Packaging is the Wild West. We might get 8 new packaging technologies this year, 3 next year, 12 the next year. Many of these are coming from the foundries, whereas it used to be just from the outsourced semiconductor assembly and test companies (OSATs) and the substrate providers. While the foundries are a new entrant, the OSATs are offering some really interesting packaging technologies at a lower cost.”

Part of the reason for this is that different groups of people have different requirement sets. “The government and the military see the primary benefits as heterogeneous integration capabilities,” says Ansys’ Swinnen. “They are not pushing the edge of processing technology. Instead, they are designing things like monolithic microwave integrated circuits (MMICs), where they need waveguides for very high-speed signals. They approach it from a packaging assembly point of view. Conversely, the high-performance compute (HPC) companies approach it from a pile of 5nm and 3nm chips with high performance high-bandwidth memory (HBM). They see it as a silicon assembly problem. The benefit they see is the flexibility of the architecture, where they can throw in cores and interfaces and create products for specific markets without having to redesign each chiplet. They see flexibility as the benefit. Military sees heterogeneous integration as the benefit.”

Materials
There are several materials used as the substrate in 2.5D packaging technology, each of which has different tradeoffs in terms of cost, density, and bandwidth, along with each having a selection of different physical issues that must be overcome. One of the primary points of differentiation is the bump pitch, as shown in figure 1.

Fig 1. Chiplet interconnection for various substrate configurations. Source: Eliyan

When talking about an interposer, it generally is considered to be silicon. “The interposer could be a large piece of silicon (Fig 1 top), or just silicon bridges between the chips (Fig 1 middle) to provide the connectivity,” says Eliyan’s Farjadrad. “Both of these solutions use micro-bumps, which have high density. Interposers and bridges provide a lot of high-density bumps and traces, and that gives you bandwidth. If you utilize 1,000 wires each running at 5Gb, you get 5Tb. If you have 10,000, you get 50Tb. But those signals cannot go more than two or three millimeters. Alternatively, if you avoid the silicon interposer and you stay with an organic package (Fig 1 bottom), such as flip chip package, the density of the traces is 5X to 10X less. However, the thickness of the wires can be 5X to 10X more. That’s a significant advantage, because the resistance of the wires will go down by the square of the thickness of the wires. The cross section of that wire goes up by the square of that wire, so the resistance comes down significantly. If it’s 5X less density, that means you can run signals almost 25X further.”
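
Farjadrad's arithmetic is easy to reproduce in a rough, first-order sketch (the wire counts and rates are the ones quoted above; the scaling factors are idealized):

    # Rough sketch of the scaling described in the quote above; first-order only.
    wires, rate_gbps = 1_000, 5
    print(f"{wires} wires x {rate_gbps} Gb/s = {wires*rate_gbps/1000:.0f} Tb/s aggregate")

    scale = 5                          # organic traces ~5x wider/thicker than interposer traces
    resistance_ratio = 1 / scale**2    # R per mm falls with cross-sectional area
    reach_ratio = scale**2             # first-order: lower R per mm allows ~25x longer reach
    print(f"R per mm: {resistance_ratio:.2f}x   reach: ~{reach_ratio}x")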

For some people, it is all about bandwidth per millimeter. “If you have a parallel bus, or a parallel interface that is high speed, and you want bandwidth per millimeter, then you would probably pick a silicon interposer,” says Kent Stahn, senior manager of hardware engineering in Synopsys’ Solutions Group. “An organic substrate is low-loss, low-cost, but it doesn’t have the density. In between, there are a bunch of solutions that deliver on some of that, but not for the same cost.”

There are other reasons to pick a substrate material, as well. “Silicon interposer comes from a foundry, so availability is a problem,” says Manuel Mota, senior staff product manager in Synopsys’ Solutions Group. “Some companies are facing challenges in sourcing advanced packages because capacity is taken. By going to other technologies that have a little less bandwidth density, but perhaps enough for your application, you can find them elsewhere. That’s becoming a critical aspect.”

All of these technologies are progressing rapidly, however. “The reticle limit is about 858mm square,” says Park. “People are talking about interposers that are perhaps four times that size, but we have laminates that go much bigger. Some of the laminate substrates coming from Japan are approaching that same level of interconnect density that we can get from silicon. I personally see more push towards organic substrates. Chip-on-Wafer-on-Substrate (CoWoS) from TSMC uses a silicon interposer and has been the technology of choice for about 12 years. More recently they introduced CoWoS-R, which uses film polyamide, closer to an organic type of substrate. Now we hear a lot about glass substrates.”

Over time, the total real estate inside the package may grow. “It doesn’t make sense for foundries to continue to build things the size of a 30-inch printed circuit board,” adds Park. “There are materials that are capable of addressing the bigger designs. Where we really need density is die-to-die. We want those chiplets right next to each other, a couple of millimeters of interconnect length. We want things very short. But the rest of it is just fanning out the I/O so that it connects to the PCB.”

This is why bridges are popular. “We do see a progression to bridges for the high-speed part of the interface,” say Synopsys’ Stahn. “The back side of it would be fanout, like RDL fanout. We see RDL packages that are going to be more like traditional packages going forward.”

Interposers offer additional capabilities. “Today, 99% of the interposers are passive,” says Park. “There’s no front end of line, there are no device layers. It’s purely back end of line processing. You are adding three, four, five metal layers to that silicon. That’s what we call a passive interposer. It’s just creating that die-to-die interconnect. But there are people taking that die and making it an active interposer, basically adding logic to that.”

That can happen for different purposes. “You already see some companies doing active interposers, where they add power management or some of the controls logic,” says Mota. “When you start putting active circuits on interposer, is it still a 2.5D integration, or does it become a 3D integration? We don’t see a big trend toward active interposers today.”

There are some new issues, though. “You have to consider coefficients of thermal expansion (CTE) mismatches,” says Stahn. “This happens whenever two materials with different CTEs are bonded together. Let’s start with the silicon interposer. You can get higher wattage systems, where the SoCs can be talking to their peers, and that can consume a lot of power. A silicon interposer still has to go in a package. The CTE mismatches are between the silicon to the package material. And with the bridge, you’re using it where you need it, but it’s still silicon die-to-die. You have to do the thermal mechanical analysis to make sure that the power that you’re delivering, and the CTE mismatches that you have, result in a viable system.”
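
The first-order physics behind the CTE concern is straightforward, even though real packages need full thermo-mechanical simulation. A minimal estimate of the mismatch strain (material constants here are rough textbook values) is just the CTE difference times the temperature swing:

    # First-order CTE mismatch estimate; material constants are rough textbook values.
    cte_si  = 2.6e-6   # 1/K, silicon
    cte_org = 17e-6    # 1/K, typical organic substrate (assumed)
    delta_t = 100      # K temperature swing, e.g. reflow or power cycling

    strain = (cte_org - cte_si) * delta_t
    print(f"mismatch strain ~ {strain*100:.2f}% over {delta_t} K")   # ~0.14%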

While signal lengths in theory can get longer, this poses some problems. “When you’re making those long connections inside a chip, you typically limit those routes to a couple of millimeters, and then you buffer it,” says Mastroianni. “The problem with a passive silicon interposer is there are no buffers. That can really become a serious issue. If you do need to make those connections, you need to plan those out very carefully. And you do need to make sure you’re running timing analysis. Typically, your package guys are not going to be doing that analysis. That’s more of a problem that’s been solved with static timing analysis by silicon engineers. We do need to introduce an STA flow and deal with all the extractions that include organic and silicon type traces, and it becomes a new problem. When you start getting into some of those very long traces, your simple RC timing delays, which are assumed in normal STA delay calculators, don’t account for some of the inductance and mutual inductance between those traces, so you can get serious accuracy issues for those long traces.”
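
The reason those long, unbuffered traces hurt follows from the usual distributed-RC estimate, where delay grows with the square of length. The sketch below uses assumed per-millimeter resistance and capacitance, and it ignores the inductance effects Mastroianni warns about, which is exactly where a simple RC calculator starts to miss:

    # Distributed RC wire delay ~ 0.38 * r * c * L^2 (Elmore-style approximation).
    # r and c per mm are assumed values for illustration.
    r_per_mm = 200       # ohms/mm (assumed)
    c_per_mm = 0.2e-12   # farads/mm (assumed)

    def wire_delay_ps(length_mm: float) -> float:
        return 0.38 * r_per_mm * c_per_mm * length_mm**2 * 1e12

    for length in (2, 5, 10):
        print(f"{length:>2} mm unbuffered trace: ~{wire_delay_ps(length):.0f} ps")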

Active interposers help. “With active interposers, you can overcome some of the long-distance problems by putting in buffers or signal repeaters,” says Swinnen. “Then it starts looking more like a chip again, and you can only do it on silicon. You have the EMIB technology from Intel, where they embedded chiplet into the interposer and that’s an active bridge. The chip talks to the EMIB chip, and they both talk to you through this little active bridge chip, which is not exactly an active interposer, but acts almost like an active interposer.”

But even passive components add value. “The first thing that’s being done is including trench capacitors in the interposer,” says Mastroianni. “That gives you the ability to do some good decoupling, where it counts, close to the die. If you put them out on the board, you lose a lot of the benefits for the high-speed interfaces. If you can get them in the interposer, sitting right under where you have the fast-switching speed signals, you can get some localized decoupling.”

In addition to different materials, there is the question of who designs the interposer. “The industry seems to think of it as a little PCB in the context of who’s doing the design,” says Matt Commens, senior manager for product management at Ansys. “The interposers are typically being designed by packaging engineers, even though they are silicon processes. This is especially true for the high-performance ones. It seems counterintuitive, but they have that signal integrity background, they’ve been designing transmission lines and minimizing mismatch at interconnects. A traditional IC designer works from a component point of view. So certainly, the industry is telling us that the people they’re assigning to do that design work are packaging type of personas.”

Power
There are some considerable differences in routing between PCBs and interposers. “Interposer routing is much easier, as the number of components is drastically reduced compared to the PCB,” says Andy Heinig, head of department for efficient electronics at Fraunhofer IIS/EAS. “On the other hand, the power grid on the interposer is much more complex due to the higher resistance of the metal layers and the fact that the power grid is cut out by signal wires. The routing for the die-to-die interface is more complex due to the routing density.”

Power delivery looks very different. “If you look at a PCB, they put these big metal pour areas embedded in the layers, and they void out areas where things need to go through,” says Park. “You put down a bunch of copper and then you void out the others. We can’t build an interposer that way. We have to deposit the interconnect, so the power and ground structures on a silicon interposer will look more like a digital chip. But the signal will look more like a PCB or laminate package.”

Routing does look more like a PCB than a chip. “You’ll see things like teardrops or fillets where it makes a connection to a pad or via to create better yield,” adds Park. “The routing styles today are more aligned to PCBs than they are to a digital IC, where you just have 90° orthogonal corners and clean routing channels. For interposers, whether it’s silicon or organic, the via is often bigger than the wire, which is a classic PCB problem. The routers, if we’re talking about digital, is again more like a small PCB than a die.”

TSVs can create problems, too. “If you’re going to treat them as square, you’re losing a lot of space at the corners,” says Swinnen. “You really want 45° around those objects. Silicon routers are traditionally Manhattan, although there has been a long tradition of RDL routing, which is the top layer where the bumps are connected. That has traditionally used octagonal bumps or round bumps, and then 45° routing. It’s not as flexible as the PCB routing, but they have redistribution layer routers, and also they have some routers that come from the full custom side which have full river routing.”

Related Reading
True 3D Is Much Tougher Than 2.5D
While terms often are used interchangeably, they are very different technologies with different challenges.
Thermal Integrity Challenges Grow In 2.5D
Work is underway to map heat flows in interposer-based designs, but there’s much more to be done.

The post 2.5D Integration: Big Chip Or Small PCB? appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Commercial Chiplet Ecosystem May Be A Decade Away (Ann Mutschler)

Commercial Chiplet Ecosystem May Be A Decade Away

29 February 2024 at 09:08

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make it reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplet should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to be able to pick and choose which IP, see what’s been taped out before, how successful it’s been so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to just become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood because that’s the way that this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs — and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust, traceability, and provenance tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, and how you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller, cheaper ones. So there is now a lot of thought about how to do something almost like unit tests on individual chiplets first, but then you want to do some form of system test as you go along. That’s definitely something we need to think about. On the business front, the question is who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then definitely it makes sense to think about using chiplets and breaking it up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together. The ones that have been successful so far — the big companies like Intel, AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.
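
Slater’s yield argument can be made concrete with a quick back-of-the-envelope sketch. The Poisson yield model below, along with the defect density and die areas, are illustrative assumptions rather than figures from the panel; the point is simply how much less silicon gets scrapped when a near-reticle-sized die is split into smaller pieces that can be screened as known-good dies before assembly.

```python
import math

def poisson_yield(area_cm2: float, defect_density: float) -> float:
    """Simple Poisson die-yield model: yield = exp(-A * D0)."""
    return math.exp(-area_cm2 * defect_density)

D0 = 0.1                # defects per cm^2 -- illustrative assumption
big_die = 6.0           # cm^2, near reticle-sized -- illustrative assumption
chiplet = big_die / 4   # the same function split across four smaller dies

y_big = poisson_yield(big_die, D0)      # ~55%
y_small = poisson_yield(chiplet, D0)    # ~86%

# A failing monolithic die scraps 6 cm^2 of silicon; a failing chiplet scraps
# only 1.5 cm^2, and known-good-die testing means only good chiplets are
# committed to the (expensive) package.
print(f"Monolithic die yield:            {y_big:.1%}")
print(f"Chiplet die yield:               {y_small:.1%}")
print(f"Wafer area scrapped, monolithic: {1 - y_big:.1%}")
print(f"Wafer area scrapped, chiplets:   {1 - y_small:.1%}")
```

Under these assumed numbers the scrapped-area fraction drops from roughly 45% to roughly 14%, which is the economic pressure that pushed reticle-limited designs toward chiplets first.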

Bhatnagar: From a business perspective, what is really important is the standardization. What is inside the chiplets is fine, but how a chiplet impacts the other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing and that adds extra costs. To really justify a business case for any chiplet, or any sort of IP with the chiplet, standardization is key for the electrical interconnect, packaging, and all other aspects of a system.

Fig. 1: A chiplet design. Source: Cadence.

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

The post Commercial Chiplet Ecosystem May Be A Decade Away appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Brain-Inspired, Silicon OptimizedBarry Pangrle

Brain-Inspired, Silicon Optimized

29. Únor 2024 v 09:06

The 2024 International Solid State Circuits Conference was held this week in San Francisco. Submissions were up 40% and contributed to the quality of the papers accepted and the presentations given at the conference.

The mood about the future of semiconductor technology was decidedly upbeat, with predictions of a $1 trillion industry by 2030 and many expecting the soaring demand for AI-enabling silicon to speed up that timeline.

Dr. Kevin Zhang, Senior Vice President, Business Development and Overseas Operations Office for TSMC, showed the following slide during his opening plenary talk.

Fig. 1: TSMC semiconductor industry revenue forecast to 2030.

The 2030 semiconductor market by platform was broken out as 40% HPC, 30% Mobile, 15% Automotive, 10% IoT and 5% “Others”.

Dr. Zhang also outlined several new generations of transistor technologies, showing that there are still more improvements to come.

Fig. 2: TSMC transistor architecture projected roadmap.

TSMC’s N2 will be going into production next year, transitioning TSMC from finFET to nanosheet transistors, and the figure also shows a further step of stacking NMOS and PMOS transistors to get increased density in silicon.

Lip Bu Tan, Chairman, Walden International, also backed up the $1T prediction.

Fig. 3: Walden semiconductor market drivers.

Mr. Tan also referenced an MIT paper from September 2023 titled, “AI Models are devouring energy. Tools to reduce consumption are here, if data centers will adopt.” It states that huge, popular models like ChatGPT signal a trend of large-scale AI, boosting some forecasts that predict data centers could draw up to 21% of the world’s electricity supply by 2030. That’s an astounding figure: more than one-fifth of the world’s electricity.

There also appears to be a virtuous cycle of using this new AI technology to create even better computing machines.

Fig. 4: Walden design productivity improvements.

The figure above shows a history of order-of-magnitude improvements in design productivity to help engineers make use of all the transistors that have been scaling with Moore’s Law. There are also advances in packaging, and companies like AMD, Intel, and Meta all presented papers on implementations using fine-pitch hybrid bonding to build systems with even higher densities. Mr. Tan presented data attributed to market.us predicting that AI will drive a CAGR of 42% in 3D-IC chiplet growth between 2023 and 2033.

Jonah Alben, Senior Vice President of GPU Engineering for NVIDIA, further backed up the claim of generative AI enabling better productivity and better designs. Figure 5 below shows how NVIDIA was able to use its PrefixRL AI system to produce better designs along a whole design curve, and Alben stated that this technology was used to design nearly 13,000 circuits in NVIDIA’s Hopper.

There was also a Tuesday night panel session on generative AI for design, and the fairly recent Si Catalyst panel discussion held last November was covered here. This is definitely an area that is growing and gaining momentum.

Fig. 5: NVIDIA example improvements from PrefixRL.

To wrap up, let’s look at some work that has been reporting best-in-class performance metrics in terms of efficiency: IBM’s NorthPole. Researchers at IBM published and presented paper 11.4: “IBM NorthPole: An Architecture for Neural Network Inference with a 12nm Chip.” Last September after HotChips, the article IBM’s Energy-Efficient NorthPole AI Unit included many of the industry competition comparisons, so those won’t be repeated here, but we will look at some of the other results that were reported.

IBM’s brain-inspired research team has been working in this area for over a decade. In fact, in October 2014 their earlier spike-based research was reported in the article Brain-Inspired Power. As with many so-called asynchronous approaches, the information and communication overhead of the spikes meant that the energy efficiency didn’t pan out, and the team re-thought how best to incorporate brain-model concepts into silicon, hence the brain-inspired, silicon-optimized tag line.

NorthPole makes use of what IBM refers to as near memory compute. As pointed out and shown here, the memory is tightly integrated with the compute blocks, which reduces how far data must travel and saves energy. As shown in figure 6, for ResNet-50 NorthPole is most efficient running at approximately 680mV and approximately 200MHz (in 12nm FinFET technology). This yields an energy metric of ~1100 frames/joule (equivalently fps/W).
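
As a quick unit check on the efficiency number quoted above, frames per joule and frames per second per watt are the same quantity:

$$\frac{\text{frames}}{\text{J}}=\frac{\text{frames}}{\text{W}\cdot\text{s}}=\frac{\text{frames/s}}{\text{W}}\;\;\Longrightarrow\;\;\sim\!1100\ \text{frames/J}\equiv\sim\!1100\ \text{fps/W}$$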

Fig. 6: NorthPole voltage/frequency scaling results for ResNet-50.

To optimize the communication for NorthPole, IBM created 4 NoCs:

  • Partial Sum NoC (PSNoC) communicates within a layer – for spatial computing
  • Activation NoC (ANoC) reorganizes activations between layers
  • Model NoC (MNoC) delivers weights during layer execution
  • Instruction NoC (INoC) delivers the program for each layer prior to layer start

The Instruction and Model NoCs share the same architecture: they are 2-D meshes with full-custom protocols optimized for zero stall cycles. The PSNoC communicates across short distances and could be said to be NoC-ish. The ANoC is again its own custom protocol implementation. Combined with software that compiles fully deterministic executables, performs no speculation, and optimizes the bit width of computations among 8-, 4-, and 2-bit calculations, this all leads to a very efficient implementation.

Fig. 7: NorthPole exploded view of PCIe assembly.

IBM had a demonstration of NorthPole running at ISSCC. The unit is well designed for server use and the team is looking forward to the possibility of implementing NorthPole in a more advanced technology node. My thanks to John Arthur from IBM for taking some time to discuss NorthPole.

The post Brain-Inspired, Silicon Optimized appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Accellera Preps New Standard For Clock-Domain CrossingBrian Bailey

Accellera Preps New Standard For Clock-Domain Crossing

29. Únor 2024 v 09:06

Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long it will take before users see any benefit.

At the register transfer level (RTL), when a data signal passes between two flip flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving at those adjacent flops. That makes timing sign-off difficult, but at least the clocks are still synchronous.

But if the clocks come from different sources, are at different frequencies, or a design boundary exists between the flip flops — which would happen with the integration of IP blocks — it’s impossible to guarantee that no clock edges will arrive when the data is unstable. That can cause the output to become unknown for a period of time. This phenomenon, known as metastability, cannot be eliminated, and the verification of those boundaries is known as clock-domain crossing (CDC) analysis.
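
The standard way to quantify that residual risk is the synchronizer mean-time-between-failures model; the symbols below are the conventional textbook ones (allowed resolution time t_r, flop metastability time constant τ, metastability window T_W, receive clock frequency, and data toggle rate), not parameters taken from the Accellera work:

$$\text{MTBF}=\frac{e^{\,t_r/\tau}}{T_W\cdot f_{clk}\cdot f_{data}}$$

Each added synchronizer stage buys roughly one more clock period of resolution time, which is why MTBF grows exponentially with synchronizer depth and why multi-flop synchronizers are the default remedy on CDC paths.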

Special care is required on those boundaries. “You have to compensate for metastability by ensuring that the CDC crossings follow a specific set of logic design principles,” says Prakash Narain, president and CEO of Real Intent. “The general process in use today follows a hierarchical approach and requires that the clock-domain crossing internal to an IP is protected and safe. At the interface of the IP, where the system connects with the IP, two different teams share the problem. An IP provider may recommend an integration methodology, which often is captured in an abstraction model. That abstraction model enables the integration boundary to be verified while the internals of it will not be checked for CDC. That has already been verified.”

In the past, those abstract models differentiated the CDC solutions from various vendors. That’s no longer the case. Every IP and tool vendor has different formats, making it costly for everyone. “I don’t know that there’s really anything new or differentiating coming down the pipe for hierarchical modeling,” says Kevin Campbell, technical product manager at Siemens Digital Industries Software. “The creation of the standard will basically deliver much faster results with no loss of quality. I don’t know how much more you can differentiate in that space other than just with performance increases.”

While this has been a problem for the whole industry for quite some time, Intel decided it was time for a solution. The company pushed Accellera to take up the issue, and helped facilitate the creation of the standard by chairing the committee. “I’m going to describe three methods of building a product,” says Iredamola “Dammy” Olopade, chair of the Accellera working group, and a principal engineer at Intel. “Method number one is where you build everything in a monolithic fashion. You own every line of code, you know the architecture, you use the tool of your choice. That is a thing of the past. The second method uses some IP. It leverages reuse and enables the quick turnaround of new SoCs. There used to be a time when all IPs came from the same source, and those were integrating into a product. You could agree upon the tools. We are quickly moving to a world where I need to source IPs wherever I can get them. They don’t use the same tools as I do. In that world, common standards are critical to integrating quickly.”

In some cases, there is a hierarchy of IP. “Clock-domain crossings are a central part of our business,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “A network-on-chip (NoC) can be considered as ‘CDC central’ because most blocks connected to the NoC have different clocks. Also, our SoC integration tools see all of the blocks to be integrated, and those touch various clock domains and therefore need to deal with the CDC code that is inserted.”

This whole thing can become very messy. “While every solution supports hierarchical modeling, every tool has its own model solution and its own model representation,” says Siemens’ Campbell. “Vendors, or users, are stuck with a CDC solution, because the models were created within a certain solution. There’s no real transportability between any of the hierarchical modeling solutions unless they want to go regenerate models for another solution.”

That creates a lot of extra work. “Today, when dealing with customer CDC issues, we have to consider the customer’s specific environment, and for CDC, a potential mix of in-house flows and commercial tools from various vendors,” says Arteris’ Schirrmeister. “The compatibility matrix becomes very complex, very fast. If adopted, the new Accellera CDC standard bears the potential to make it easier for IP vendors, like us, to ensure compatibility and reduce the effort required to validate IP across multiple customer toolsets. The intent, as specified in the requirements is that ‘every IP provider can run its tool of choice to verify and produce collateral and generate the standard format for SoCs that use a different tool.'”

Everyone benefits. “IP providers will not need to provide extra documentation of clock domains for the SoC integrator to use in their CDC analysis,” says Ahmed Nasr, digital design manager at Mixel. “The standard CDC attributes generated by the EDA tool will be self-contained.”

The use model is relatively simple. “An IP developer signs off on CDC and then exports the abstract model,” says Real Intent’s Narain. “It is likely they will write this out in both the Accellera format and the native format to provide backward compatibility. At the next level of hierarchy, you read in the abstract model instead of reading in the full view of the design. They have various views of the IP, including the CDC view of the IP, which today is on the basis of whatever tool they use for CDC sign-off.”

The potential is significant. “If done right and adopted, the industry may arrive at a common language to describe CDC aspects that can streamline the validation process across various tools and environments used by different users,” says Schirrmeister. “As a result, companies will be able to integrate and validate IP more efficiently than before, accelerating development cycles and reducing the complexity associated with SoC integration.”

The standard
Intel’s Olopade describes the approach that was taken during the creation of the standard. “You take the most complex situations you are likely to find, you box them, and you co-design them in order to reduce the risk of bugs,” he said. “The boundaries you create are supposed to be simple boundaries. We took that concept, and we brought it into our definition to say the following: ‘We will look at all kinds of crossings, we will figure out the simple common uses, and we will cover that first.’ That is expected to cover 95% to 98% of the community. We are not trying to handle 700 different exceptions. It is common. It is simple. It is what guarantees production quality, not just from a CDC standpoint, but just from a divide-and-conquer standpoint.”

That was the starting point. “Then we added elements to our design document that says, ‘This is how we will evaluate complexity, and this is how we’ll determine what we cover first,'” he says. “We broke things down into three steps. Step one is clock-domain crossing. Everyone suffers from this problem. Step two is reset-domain crossing (RDC). As low power is getting into more designs, there are a lot more reset domains, and there is risk between these reset domains. Some companies care, but many companies don’t because they are not in a power-aware environment. It became a secondary consideration. Beyond the basic CDC in phase one, and RDC in phase two, all other interesting, small usage complexities will be handled in phase three as extensions to the standard. We are not going to get bogged down supporting everything under the sun.”

Within the standards group there are two sub-groups — a mapping team and a format team. Common standards, such as AMBA, UCIe, and PCIe have been looked at to make sure that these are fully covered by the standard. That means that the concepts should be useful for future markets.

“The concepts contained in the standard are extensible to hardened chiplets,” says Mixel’s Nasr. “By providing an accurate standard CDC view for the chiplet, it will enable integration with other chiplets.”

Some of those issues have yet to be fully explored. “The standard’s current documentation primarily focuses on clock-domain crossing within an SoC itself,” says Schirrmeister. “Its direct applicability to the area of chiplets would depend on further developments. The interfaces between fully hardened IP blocks on chiplets would communicate through standard interfaces like UCIe, BoW, or XSR, so the synchronization issues between chiplets on substrates would appear to be elevated to the protocol levels.”

Reset-domain crossings have yet to appear in the standard. “The genesis of CDC is asynchronous clocks,” says Narain. “But the genesis for reset-domain crossing is asynchronous resets. While the destination is due to the clock, the source of the problem is somewhere else. And as a result, the nature of the problem, the methodology that people use to manage that problem, are very different. The kind of information that you need to retain, and the kind of information that you can throw away, is different for every problem. Hence, abstractions are actually very customized for the application.”

Does the standard cover enough ground? That is part of the purpose of the review period that was used to collect information. “I can see some room for future improvement — for example, making some attributes mandatory like logic, associated_clocks, clock_period for clock ports,” says Nasr. “Another proposed improvement is adding reconvergence information, to be able to detect reconverging outputs of parallel synchronizers.”

The impact of all of this, if realized, is enormous. “If you truly run a collaborative, inclusive, development cycle, two things will happen,” says Olopade. “One, you are going to be able to find multiple ways to solve each problem. You need to understand the pros and cons against the real problems you are trying to solve and agree on the best way we should do it together. For each of those, we record the options, the pros and cons, and the reason one was selected. In a public review, those that couldn’t be part of that discussion get to weigh in. We weigh it against what they are suggesting versus why did we choose it. In the cases where it is part of what we addressed, and we justified it, we just respond, and we do not make a change. If you’re truly inclusive, you do allow that feedback to cause you to change your mind. We received feedback on about three items that we had debated, where the feedback challenged the decisions and got us to rehash things.”

The big challenge
Still, the creation of a standard is just the first step. Unless a standard is fully adopted, its value becomes diminished. “It’s a commendable objective and a worthy endeavor,” says Schirrmeister. “It will make interoperability easier and eventually allow us, and the whole industry, to reduce the compatibility matrix we maintain to deal with vendor tools individually. It all will depend on adoption by the vendors, though.”

It is off to a good start. “As with any standard, good intentions sometimes get severed by reality,” says Campbell. “There has been significant collaboration and agreements on how the standard is being pushed forward. We did not see self-submarining, or some parties playing nice just to see what’s going on but not really supporting it. This does seem like good collaboration and good decision making across the board.”

Implementation is another hurdle. “Will it actually provide the benefit that it is supposed to provide?” asks Narain. “That will depend upon how completely and how quickly EDA tool vendors provide support for the standard. From our perception, the engineering challenge for implementing this is not that large. When this is standardized, we will provide support for it as soon as we can.”

Even then, adoption isn’t a slam dunk. “There are short- and long-term problems,” warns Campbell. “IP vendors already have to support multiple formats, but now you have to add Accellera on top of that. There’s going to be some pain both for the IP vendors and for EDA vendors. We are going to have to be backward-compatible and some programs go on for decades. There’s a chance that some of these models will be around for a very long time. That’s the short-term pain. But the biggest hurdle to overcome for a third-party IP vendor, and EDA vendor, is quality assurance. The whole point of a hierarchical development methodology is faster CDC closure with no loss in quality. The QA load here is going to be big, because no customer is going to want to take the risk if they’ve got a solution that is already working well.”

Some of those issues and fears are expected to be addressed at the upcoming DVCon conference. “We will be providing a tutorial on CDC,” says Olopade. “The first 30 minutes covers the basics of CDC for those who haven’t been doing this for the last 10 years. The next hour will talk about the Accellera solution. It will concentrate on those topics which were hotly debated, and we need to help people understand, or carry people along with what we recommend. Then it may become more acceptable and more adoptive.”

Related Reading
Design And Verification Methodologies Breaking Down
As chips become more complex, existing tools and methodologies are stretched to the breaking point.

The post Accellera Preps New Standard For Clock-Domain Crossing appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • What’s Next For Power Electronics? Beyond SiliconEmily Yan

What’s Next For Power Electronics? Beyond Silicon

Od: Emily Yan
29. Únor 2024 v 09:05

For more than half a century, silicon has been the bedrock of power electronics. Yet as silicon meets its physical limitations in higher-power, higher-temperature applications, the industry’s relentless pursuit of more efficient power systems has ushered in the era of wide bandgap (WBG) semiconductors. The global WBG semiconductors market reached $1.6 billion in 2022, with an estimated CAGR of 13% over the next eight years. The adoption of WBG semiconductors, notably silicon carbide (SiC) and gallium nitride (GaN), is now setting new benchmarks for performance in power systems across automotive, industrial, and energy sectors. What impact will WBG semiconductors have on power electronics (PE) trends in 2024, and how are they redefining the design and simulation workflows for the next decade?

The catalyst for change: Wide bandgap

The term ‘bandgap’ refers to the energy difference between a material’s valence band and its conduction band, a critical factor determining its electrical conductivity.

As shown in figure 1, with its wide bandgap, Gallium Nitride (GaN) exemplifies the three key advantages this property can offer.

Fig. 1: Wide bandgap semiconductor properties.

  • Faster switching speeds: One of the most significant benefits of GaN’s wide bandgap is its contribution to faster switching speeds. The electron mobility in GaN is around 2,000 cm²/Vs, enabling switching frequencies up to 10 times higher than silicon. A higher switching speed translates into reduced switching losses, making the overall designs more compact and efficient.

Fig. 2: Switching speeds of SiC and GaN.

  • Higher thermal conductivity: With a thermal conductivity of 2 W/cmK, GaN can dissipate heat efficiently and operate at temperatures up to 200°C. This resilience enables more effective thermal management at high temperatures and under extreme conditions.
  • Higher voltages: With an electric breakdown field of 3.3 MV/cm, GaN can withstand almost 10 times the breakdown field of silicon, allowing it to support much higher operating voltages.

GaN and other wide-bandgap semiconductors offer solutions for high-power, high-frequency, and high-temperature applications with improved energy efficiency and design flexibility.
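
A first-order sketch of why the faster edges matter in practice: the classic hard-switching loss estimate 0.5·V·I·(t_rise + t_fall)·f_sw below uses illustrative device and operating numbers, not values from this article, to show that a GaN FET can be switched roughly ten times faster than a silicon device for a comparable switching loss, which is where the smaller magnetics and more compact converters come from.

```python
def switching_loss_w(v_bus: float, i_load: float, t_rise: float,
                     t_fall: float, f_sw: float) -> float:
    """First-order hard-switching loss estimate: 0.5 * V * I * (tr + tf) * fsw."""
    return 0.5 * v_bus * i_load * (t_rise + t_fall) * f_sw

V_BUS, I_LOAD = 400.0, 10.0   # bus voltage (V) and load current (A) -- illustrative

# Slower silicon MOSFET at 100 kHz vs. a faster GaN FET pushed to 1 MHz.
p_si = switching_loss_w(V_BUS, I_LOAD, 40e-9, 40e-9, 100e3)  # ~16 W
p_gan = switching_loss_w(V_BUS, I_LOAD, 5e-9, 5e-9, 1e6)     # ~20 W at 10x the frequency

print(f"Si  @ 100 kHz: {p_si:.0f} W switching loss")
print(f"GaN @ 1 MHz:   {p_gan:.0f} W switching loss")
```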

Emerging challenges in power electronics simulation

The wider adoption of wide bandgap (WBG) semiconductors, including silicon carbide (SiC) and gallium nitride (GaN), sets new standards in the design of Switched Mode Power Supplies (SMPS) – power efficiency, compactness, and lighter weight.

However, the higher switching speeds of GaN and SiC demand more sophisticated design considerations. Power electronics engineers must manage electromagnetic interference (EMI) and optimize thermal performance to ensure reliability and functionality. Layout parasitics, for instance, can lead to voltage spikes in the presence of high di/dt values. Engineers face the following pressing questions:

  • How can we guarantee reliability for mission-critical applications across a wider range of operating temperatures?
  • What are the essential practices for understanding and predicting EMI and noise?
  • What tools can we employ to create robust thermal models for comprehensive system-level analysis?

To navigate these complexities, engineers need advanced simulation solutions that address layout parasitic effects effectively and come with robust thermal analysis.
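
To make the di/dt point above concrete, the voltage spike across a parasitic loop inductance follows v = L·di/dt; with illustrative numbers of 10 nH of layout inductance and 20 A commutated in 5 ns:

$$v_{\text{spike}} = L_{\text{par}}\frac{di}{dt} \approx 10\,\text{nH}\times\frac{20\,\text{A}}{5\,\text{ns}} = 40\,\text{V}$$

A 40 V overshoot on top of the bus voltage is exactly the kind of effect that parasitic extraction and board-level simulation are meant to catch before it erodes device margins.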

Fig. 3: Thermal analysis at board and schematic levels using Keysight’s PEPro.

Key impacts of WBG semiconductors: From EVs to renewable energy

Electric vehicles (EVs): As global EV sales are projected to increase by 21% in 2024, the power efficiency of automotive power electronics is paramount – every additional percentage point is a big win. GaN enables more compact and efficient designs of onboard chargers and traction inverters, extending driving ranges by up to 6%.

Data centers: The digital economy’s expansion brings a surge in data center energy consumption, with the U.S. expected to require an additional 39 gigawatts over the next five years—equivalent to powering around 32 million homes. Wide bandgap semiconductors may be key to addressing this challenge by enabling higher server densities and reducing energy consumption and carbon emissions. Specifically, the implementation of GaN transistors in data center infrastructure can lead to a reduction of 100 metric tons of CO2 emissions for every 10 racks annually. This efficiency gain is particularly relevant as the computational and power demands of artificial intelligence (AI) applications soar, potentially tripling the racks’ power density.

Renewable energy: Wide bandgap semiconductors allow for more reliable power output and cost-effective solutions in both residential and commercial renewable energy storage systems. For instance, GaN transistors could achieve four times less power loss than traditional silicon-based power solutions.

The road ahead in the era of WBG semiconductors

GaN and SiC represent a new wave of material innovation to elevate the efficiencies of power electronics and redefine how we power our world. As the applications for WBG semiconductors expand, Keysight empowers our customers with a unified simulation environment to design reliable and long-lasting electronic systems under various operating conditions.

The post What’s Next For Power Electronics? Beyond Silicon appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • NoC Development – Make Or Buy?Frank Schirrmeister

NoC Development – Make Or Buy?

In the selection and qualification process for semiconductor IP, design teams often consider the cost of in-house development. Network-on-Chip (NoC) IP is no different. In “When Does My SoC Design Need A NoC?” Michael Frank and I argued that most of today’s designs – even less complex ones – can benefit from NoCs. In the blog “Balancing Memory And Coherence: Navigating Modern Chip Architectures,” I discussed the complexity that coherency adds to on-chip interconnect. After I described some of the steps of NoC development based on what ChatGPT 3.5 recommended in “Shortening Network-On-Chip Development Schedules Using Physical Awareness,” it’s time to look in more detail at the development effort that design teams would have to invest to develop coherent NoCs from scratch.

ChatGPT, here we go again!

The prompt “Tell me how to develop an optimized network-on-chip for semiconductor design, considering the aspects of cache coherency” gives an excellent starting point in ChatGPT 4.0.

Understanding Protocols: First, one needs to understand cache coherency protocols. The recommendation is to study existing protocols before selecting one. Specifically, understand existing cache coherency protocols like MESI (Modified, Exclusive, Shared, Invalid), MOESI (Modified, Owned, Exclusive, Shared, Invalid), and directory-based protocols. Analyze their strengths and weaknesses in terms of scalability, latency, and bandwidth requirements. Then, choose a protocol that aligns with your performance goals and the scale of your NoC. Directory-based protocols are often preferred for larger-scale systems due to their scalability.
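
For readers who have not looked at these protocols recently, below is a toy sketch of the MESI states mentioned above, written as a simple transition table. It is a textbook simplification (for instance, a read that finds no other sharers would move Invalid to Exclusive, which is omitted here), not an implementation of any commercial protocol such as ACE or CHI.

```python
# Simplified MESI transition table for one cache line, seen from one cache.
# Events: local processor read/write (PrRd/PrWr) and snooped transactions
# from other caches (BusRd = another reader, BusRdX = another writer).
MESI = {
    "Invalid":   {"PrRd": "Shared",    "PrWr": "Modified",
                  "BusRd": "Invalid",  "BusRdX": "Invalid"},
    "Shared":    {"PrRd": "Shared",    "PrWr": "Modified",
                  "BusRd": "Shared",   "BusRdX": "Invalid"},
    "Exclusive": {"PrRd": "Exclusive", "PrWr": "Modified",
                  "BusRd": "Shared",   "BusRdX": "Invalid"},
    "Modified":  {"PrRd": "Modified",  "PrWr": "Modified",
                  "BusRd": "Shared",   "BusRdX": "Invalid"},  # both with a writeback
}

def next_state(state: str, event: str) -> str:
    return MESI[state][event]

# A line written locally, then snooped by another core's read:
state = "Invalid"
for event in ("PrWr", "BusRd"):
    state = next_state(state, event)
    print(f"{event} -> {state}")    # PrWr -> Modified, BusRd -> Shared
```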

ChatGPT’s recommendation for the first step is a good start. I previously discussed the complexity of specific protocols like AMBA AXI, APB, ACE, CHI, OCP, CXL, and TileLink in “Design Complexity In The Golden Age Of Semiconductors.” One must read several thousand pages of documentation to understand the options here. And – by the way – these are orthogonal to the MESI/MOESI commentary from ChatGPT above, as these are implementation choices. In a practical scenario, many of these aspects depend on the building blocks the design team wants to license, like processors from the Arm, RISC-V, Arc, Tensilica, CEVA, and other ecosystems, as well as the protocol support in design IP blocks (think PCIe, UCIe, LPDDR) and accelerators for AI/ML.

NoC Architecture Design: Second, ChatGPT recommends focusing on NoC architecture design. Decide on the NoC topology (e.g., mesh, torus, tree, or custom designs) based on the expected traffic pattern and the scalability requirements. Each topology has its specific advantages, as my colleague Andy Nightingale recently explained here. Furthermore, teams must design efficient routers to handle the chosen cache coherency protocol with minimal latency, implementing features like virtual channels to avoid deadlock and increase throughput. The final part of this step involves optimizing the network for bandwidth and latency by tuning the buffer sizes, employing efficient routing algorithms, and optimizing link widths and speeds.
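
As a small illustration of the kind of first-pass analysis behind the topology choice, the sketch below computes the average hop count under uniform random traffic for a 2D mesh versus a bidirectional ring by brute force. Real decisions hinge on actual traffic patterns, link widths, and router pipeline depths, so treat this only as a starting point.

```python
from itertools import product

def mean_hops_mesh(k: int) -> float:
    """Average Manhattan hop count in a k x k 2D mesh, uniform random traffic."""
    nodes = list(product(range(k), range(k)))
    dists = [abs(ax - bx) + abs(ay - by)
             for (ax, ay), (bx, by) in product(nodes, nodes)]
    return sum(dists) / len(dists)

def mean_hops_ring(n: int) -> float:
    """Average hop count in an n-node bidirectional ring, uniform random traffic."""
    dists = [min(abs(a - b), n - abs(a - b))
             for a, b in product(range(n), range(n))]
    return sum(dists) / len(dists)

for k in (4, 8):
    n = k * k
    print(f"{n:3d} nodes: mesh {mean_hops_mesh(k):.2f} hops, "
          f"ring {mean_hops_ring(n):.2f} hops")
# 16 nodes: mesh 2.50 vs. ring 4.00; 64 nodes: mesh 5.25 vs. ring 16.00
```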

Cache Coherency Mechanism Integration: Next up, ChatGPT recommends integrating the actual mechanisms of cache coherency. Integrating the cache coherency mechanism with the NoC involves efficient propagation of coherency messages (e.g., invalidate, update) across the network with minimal latency. Designing an efficient directory structure for directory-based protocols that can scale with the network and minimize the coherency traffic requires careful considerations of the placement of directories and the granularity of coherence (e.g., block-level vs. cache-line level).

By the way, for my query, it leaves out the option to handle coherency fully in software.

Simulation and Analysis: At this point, ChatGPT correctly recommends using simulation tools to model your NoC design and evaluate its performance under various workloads. Tools like Gem5, NS-3, or custom simulators can be helpful. I would add SystemC models to the arsenal of tools design teams working on this from scratch could use. Teams need to analyze key performance metrics such as latency, throughput, and energy consumption and pay special attention to the performance of the cache coherency mechanisms.

The last bit is indeed critical because, for coherent interconnects, the cost of a cache miss is drastically different from that of a cache hit.
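
One way to see how drastic that difference is: the standard average-memory-access-time formula with illustrative numbers (a 2 ns hit, a 2% miss rate, and a 150 ns miss penalty out to DRAM) shows a small miss rate more than doubling the average latency.

$$\text{AMAT} = t_{\text{hit}} + m\cdot t_{\text{miss}} = 2\,\text{ns} + 0.02\times150\,\text{ns} = 5\,\text{ns}$$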

Optimization and Scaling: This recommendation includes implementing adaptive routing algorithms and dynamic power management techniques to optimize performance and energy efficiency under varying workloads and ensuring the design can scale by adding more cores. This might involve modular design principles or hierarchical NoC structures.

Correct. But, in all practicality, at this point during the project, a lot of time has passed without writing a single line of RTL. Management will ask, “What’s up here?” So, some RTL coding has already happened at this point. Iterations happen fast. Engineers will blame marketing quickly for iterative feature change requests like adding/removing interfaces, changing user bits, Quality of Service (QoS) requirements, address maps, safety needs, buffering, probes, interrupts, modules, etc. All of these can cause significant changes to the RTL. At this point, teams often have not yet started to consider that the floorplan can cause more issues because of interface location, blockages, and fences.

Prototyping and Testing: Next, the recommendation is to use FPGA-based prototyping to validate your NoC design in hardware and to test the NoC in the context of the entire system, including the processor cores, memory hierarchy, and peripherals, to identify any issues with the cache coherency mechanism or other components.

True. Emulation and FPGA-based prototyping have become standard steps for most complex designs today. And the aspects of cache coherency in particular, in the context of the overall system and its software, require very long test sequences.

Iterative Design and Feedback: The last recommendation is to use feedback from the simulation, prototyping, and testing phases to refine the NoC design iteratively and benchmark the final design using standard benchmark suites relevant to your target application domain to ensure that it meets the desired performance and efficiency goals.

The cost of “make”

Short of hiring a team of architects with relevant NoC development experience, the first five steps of Understanding Protocols, NoC Architecture Design, Cache Coherency Mechanism Integration, Simulation & Analysis, and Optimization & Scaling will take significant learning time, and writing the implementation specs is far from trivial.

Then, teams will spend most of the effort on RTL development and verification. Just imagine writing RTL protocol adapters for AMBA CHI-E, CHI-B, ACE, ACE-LITE, and AXI – connecting tens of IP blocks coherently – to address coherent and IO coherent use models. Even if you can reuse VIP from EDA vendors to check the protocol correctness, the effort is significant just for unit verification, as you will run thousands of tests.

For the actual interconnect, whether you use a heterogeneous, ring, or mesh topology, the effort for development is significant. The logic that deals with directories to enable cache coherency can be complicated. And any change requests require, of course, re-coding!

Finally, when integrating everything in the system context, validating integration issues, including bring-up in emulation and the associated debug, consumes another significant chunk of effort.

Our customers tell us that when it is all said and done, they would easily spend over 50 person-years just on coherent NoC development for complex designs.

Network-on-chip automation for productivity and configurability.

Automation potential: What to expect from coherent NoC IP

There is a lot of automation potential in the seven steps above!

  • The various relevant protocols can be captured in a library of protocol converters, reducing the need to internalize and implement all the protocols the reused IP blocks speak of. Ideally, they would already be pre-validated with popular IP blocks from leading IP vendors – think providers of Arm and RISC-V ISAs and vendors for interface blocks like LPDDR, PCIe, UCIe, etc., or graphics and AI/ML accelerators.
  • Graphical user interfaces and scripting APIs increase productivity in developing NoC architectures and topologies.
  • Like protocol converters, reusable blocks for directory and cache coherency management can increase development productivity. Their verification is especially critical, so ideally, your vendor has pre-verified them using VIP from EDA vendors and pre-validated the system integration with the ecosystem (think processor providers).
  • The refinement loop is probably the most critical one to optimize. Refinement can be deadly in manual scenarios. Besides reusable building blocks, you should look for configuration tools to automatically create performance models for architectural analysis, export new RTL configurations, and directly connect to digital implementation flows.

The verdict: Make or buy?

The illustration above summarizes some of the automation potential for NoCs. Saving more than 50 person-years is an attractive option compared to developing NoC IP from scratch. Check out what Arteris does in this domain with its Ncore Cache Coherent Interconnect IP and FlexNoC 5 Interconnect IP for non-coherent designs.

The post NoC Development – Make Or Buy? appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Navigating IoT SecurityDana Neustadter

Navigating IoT Security

29. Únor 2024 v 09:03

By Dana Neustadter (Synopsys), Ruud Derwig (Synopsys), and Martin Rösner (G+D)

IoT expansion requires secure and efficient connectivity between machines. Integrated SIM technology and remote SIM provisioning can make this possible.

Subscriber Identity Module (SIM) cards have been around for a long time, with Giesecke+Devrient (G+D) developing and delivering the first commercial SIM cards in 1991. If you have a cell phone, you will be familiar with these small security anchors that protect phones, networks, and data from fraud and misuse. They also enable phones to securely authenticate and communicate within a mobile network infrastructure managed by carriers. With deployment starting over the last few years, iSIM, also called an integrated Universal Integrated Circuit Card (iUICC), is the newest SIM kid on the block. iSIMs are embedded directly into a system on chip (SoC) as a tamper-resistant secure element, bringing trust and enabling secure connectivity and control, while saving cost and space, simplifying the system development process, and overall offering a significant ease-of-use improvement in the way IoT device connectivity is activated and secured. In addition to the functional advantages, the iSIM also offers sustainability benefits such as reduced CO2 emissions.

Enabling iSIM to support remote SIM provisioning (RSP) calls for an integrated solution that brings together secure services and a secure SIM operating system (OS) with secure hardware. This is the type of challenge that has led to a collaboration between G+D, which continues to lead the way in SIM innovation, and Synopsys, with its expertise in trusted hardware. Putting their heads together, the two companies have come up with an innovative integrated, secure iSIM solution. In a nutshell, Synopsys tRoot Hardware Secure Modules (HSMs) provide silicon-proven, self-contained security IP solutions with root of trust. The HSMs are combined with G+D’s secure SIM OS to enable tamper-resistant elements which are usable within an SoC and serve as an isolated hardware component. G+D’s award-winning RSP services provide seamless management of the SIM profiles. Figure 1 depicts the secure iSIM solution.

“Offering the promise of seamless secure management of SIM profiles, iSIMs help accelerate the broad scaling of the IoT by providing high flexibility to choose the preferred cellular networks throughout the lifetime of devices,” said Andreas Morawietz, global head of Digital Connectivity Portfolio Strategy at G+D. “Our standards-compliant remote SIM provisioning service together with the secure SIM OS integrated with Synopsys’ tRoot Hardware Secure Modules provide an integrated iSIM secure solution at the start of the IoT value chain, delivering benefits to downstream IoT players.”

Fig. 1: Integrated, secure iSIM solution featuring Synopsys and G+D technologies. 

Evolution of SIM technology charts course for the IoT

The breadth of the IoT has become expansive thanks to the prevalence of cellular networks, sensors, cloud computing, AI, and other technologies that enable connectivity and intelligence. Consider, as one example, all the devices and systems that can bring a smart city to life. From traffic signals and streetlights to meters and energy grids, each of these systems must be able to collect and share data that leads to better decision-making and outcomes, as well as more efficient and effective processes. With the integration of AI capabilities, these devices would also be able to act autonomously. SIM technology acts as a trust anchor for secure identification, authentication, and communication. Over the years as new devices enter the IoT realm, users have expectations of increasingly seamless connectivity, simple remote management, and the ability to select their preferred carriers. This has seen removable physical SIMs giving way to embedded SIMs (eSIM) which are soldered onto devices.

As the newest entry in this evolution, iSIMs are anticipated to grow in popularity, answering the call for more optimized, flexible, and secure solutions to allow more things to be connected and controlled. Because it isn’t a disparate chipset, an iSIM provides cost, power, and area efficiency, ideal for small, battery-powered IoT devices, particularly those that operate in low-power wide area networks (LPWANs) through narrowband IoT (NB-IoT) or long-term evolution for machines (LTE-M) technologies. iSIMs also work well in larger industrial systems such as smart meters or even vehicles. In such systems, the SIM technology could be located in hard-to-reach places, making remote management an ideal approach.

While iSIM and remote provisioning are opening up a new ecosystem, these capabilities are only appealing if they’re backed by rock-solid security. Fortunately, there’s quite a lot of technology available to secure network connections and authenticate communicating partners. iSIMs must be developed to offer the same level of security as traditional SIM solutions. For network operators to trust iSIMs, security certification of iSIMs is essential. This is especially important since the network operators don’t control the SIM hardware and software, as it comes with the IoT device and can originate from any number of vendors.

Complete, integrated IoT security solution

The Synopsys and G+D collaboration has been successfully deployed in the field and acknowledged by Tier1 operators for several years. Our efforts bring together complementary technologies that form a complete iSIM security solution for integration into a baseband SoC.

Synopsys’ tRoot HSMs are ideal for SoCs supporting a variety of applications in addition to the IoT, including industrial control, networking, automotive, media, and mobile devices. A hardware root of trust allows chip manufacturers and their OEM customers to create a strong cryptographic device identity for a unique device instance and provides a secure environment for protecting sensitive data and operations. In addition to the secure SIM OS, G+D also provides remote provisioning and device management secure services. The resulting iSIM solution comes with a small footprint and low power consumption and allows for more efficient production, faster time to market and, without extra housing or plastic, greater sustainability. With encrypted loading of chip-unique data (the SIM BLOB, or binary large object), the iSIM can be installed on a chipset without certification of the production facility.

As smart, connected devices become more ubiquitous, ensuring trust that the devices and their data will remain safe from security threats will remain a top priority. Secure remote SIM provisioning helps to streamline device connections and controls in a safe manner. Through our collaboration, Synopsys and G+D are providing mobile network operators and semiconductor manufacturers with a complete security solution for all-in-one connectivity that can nurture the continued expansion of the IoT.

For more information, see Synopsys tRoot Hardware Secure Modules (HSMs).

Ruud Derwig is a principal software engineer for Security IP Solutions at Synopsys.

Martin Rösner is the director for the Digital Connectivity Portfolio Strategy at G+D.

The post Navigating IoT Security appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Weak Verification Plans Lead To Project DisarrayAnika Sunda

Weak Verification Plans Lead To Project Disarray

29. Únor 2024 v 09:02

The purpose of the verification plan, or vplan as we call it, is to capture all the verification goals needed to prove that the device works as specified. It’s a big responsibility! Getting it right means having a good blueprint for verification closure. However, getting it wrong could result in bug escapes, wasted resources, and possibly a device that fails altogether. With the focus on AI-driven verification, the efficiency and effectiveness of verification planning are expected to improve significantly.

There are several key elements needed to create a good vPlan. We will go over a few below.

Accurate verification features are needed for verification closure

The concept of divide and conquer suggests that every complex feature can be broken down into sub-features, which in turn can be further divided. Verisium Manager’s Planning Center facilitates this process by enabling users to create expandable/collapsible feature sections, a crucial capability for maintaining quality; without it, quality is at risk.

Close alignment to the functional specification

Close adherence to the functional specification is crucially linked to the first point. Any new features or changes to existing ones should prompt immediate updates to the vPlan, as failing to do so could affect verification quality. The Planning Center allows users to associate paragraphs in the specification with the vPlan and provides notifications of any corresponding alterations. Users can then respond by adjusting the vPlan to stay aligned with the specification.

Connecting relevant metrics to vPlan features

Once the vPlan is defined, it’s important to connect the relevant metrics to demonstrate verification assurance of each feature. It may involve using a combination of code coverage, functional coverage, or directed tests to provide that assurance. The Planning Center makes connecting these metrics to the vPlan very straightforward. Failing to link these metrics with the features could result in insufficiently verified features.
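
As a generic illustration of that feature-to-metric linkage (this is not Verisium Manager’s data model, just a hypothetical sketch), a vPlan can be thought of as a feature tree whose leaves carry linked coverage or test results that roll up toward the top:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Feature:
    """One node of a hypothetical vPlan: sub-features plus a linked metric."""
    name: str
    coverage: Optional[float] = None       # 0..1 from a linked coverage/test metric
    children: List["Feature"] = field(default_factory=list)

    def rollup(self) -> float:
        """Roll coverage up the tree as a plain average of children (one policy of many)."""
        if not self.children:
            return self.coverage or 0.0
        return sum(child.rollup() for child in self.children) / len(self.children)

# Hypothetical DMA feature broken into sub-features, each tied to a metric.
vplan = Feature("dma_engine", children=[
    Feature("descriptor_fetch", coverage=0.92),      # functional coverage group
    Feature("burst_transfers", coverage=0.75),       # code + functional coverage
    Feature("error_handling", children=[
        Feature("abort_on_timeout", coverage=0.40),
        Feature("parity_error", coverage=1.00),      # directed test passed
    ]),
])

print(f"dma_engine rollup: {vplan.rollup():.0%}")    # (0.92 + 0.75 + 0.70) / 3 = 79%
```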

Showing real-time results

To effectively monitor progress and respond promptly to areas requiring attention, the vPlan should dynamically reflect the results in real time. This allows for accurate measurement of progress and focused allocation of resources. Delayed results could lead to wasted project time in non-priority areas. Verisium Manager’s vPlan Analysis automates this process enabling users to view real-time vPlan status for relevant regressions.

Customers have shared that vPlan quality significantly influences project outcomes. It’s crucial to prioritize creating higher quality vPlans, rather than simply focusing on speed. However, maintaining consistent high quality can be challenging due to the human tendency to quickly lose interest, with initial strong efforts tapering off as the process continues.

A thorough verification plan is the key to success in ASIC verification. Verification reuse is critical to the productivity and efficiency of system-on-chip (SoC) development, and a good vPlan is the first step in this direction. If you’re a verification engineer, take the time to develop a thorough verification plan for your next project. It will be one of the best investments you can make in the success of your project.

The post Weak Verification Plans Lead To Project Disarray appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Blog Review: Feb. 21Jesse Allen

Blog Review: Feb. 21

21. Únor 2024 v 09:01

Siemens’ John McMillan digs into physical verification maturity for high-density advanced packaging (HDAP) designs and major differences in the LVS verification flow compared to the well-established process for SoCs.

Synopsys’ Varun Shah identifies why a cloud adoption framework is key to getting the most out of deploying EDA tools in the cloud, including by ensuring that different types of necessary compute are accessible for all stages of the design cycle.

Cadence’s Reela Samuel suggests that a chiplet-based approach will provide improved performance and reduced complexity for the automotive sector, enabling OEMs to construct a robust yet flexible electronic architecture.

Keysight’s Emily Yan finds that today’s chip design landscape is facing challenges reminiscent of those encountered by the Large Hadron Collider in managing data volume, version control, and global collaboration.

Ansys’ Raha Vafaei explains why the finite-difference time-domain (FDTD) method, an algorithmic approach to solving Maxwell’s equations, is key for modeling nanophotonic devices, processes, and materials.

Arm’s Ed Player explains the different components of the Common Microcontroller Software Interface Standard (CMSIS) to help identify which are useful for particular Arm-based microcontroller projects.

SEMI’s Mark da Silva, Nishita Rao and Karim Somani check out the state of digital twins in semiconductor manufacturing and challenges such as the need for standardization and communication between different digital twins.

Plus, check out the blogs featured in the latest Low Power-High Performance newsletter:

Rambus’ Lou Ternullo looks at why performance demands of generative AI and other advanced workloads will require new architectural solutions enabled by CXL.

Ansys’ Raha Vafaei shines a light on how the evolution of photonics engineering will encompass novel materials and cutting-edge techniques.

Siemens’ Keith Felton explains why embracing emerging approaches is essential for crafting IC packages that address the evolving demands of sustainability, technology, and consumer preferences.

Cadence’s Mark Seymour lays out how CFD simulation software can predict time-dependent aspects and various failure scenarios for data center managers.

Arm’s Adnan Al-Sinan and Gian Marco Iodice point out that LLMs already run well on small devices, and that will only improve as models become smaller and more sophisticated.

Keysight’s Roberto Piacentini Filho shows how a modular approach can improve yield, reduce cost, and improve PPA/C.

Quadric’s Steve Roddy finds that smart local memory in an AI/ML subsystem solves SoC bottlenecks.

Synopsys’ Ian Land, Kenneth Larsen, and Rob Aitken detail why the traditional approach using monolithic system-on-chips (SoCs) falls short when addressing the complex needs of modern systems.

The post Blog Review: Feb. 21 appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Latency, Interconnects, And PokerEd Sperling

Latency, Interconnects, And Poker

20. Únor 2024 v 09:09

Semiconductor Engineering sat down with Larry Pileggi, Coraluppi Head and Tanoto Professor of Electrical and Computer Engineering at Carnegie Mellon University, and the winner of this year’s Phil Kaufman Award for Pioneering Contributions. What follows are excerpts of that conversation.

SE: When did you first get started working in semiconductors — and particularly, EDA?

Pileggi: This was 1984 at Westinghouse research. We were making ASICs — analog and digital — and with digital you had logic simulators. But for analog, there was no way to simulate them at Westinghouse. They didn’t even have SPICE loaded onto the machine. So I got a copy of SPICE from Berkeley and loaded that tape, and I was the first to use it in the research center. I saw how limited it was and thought, ‘There must be more mature things than this.’ While I was working there, I was taking a class with Andrzej Strojwas at CMU (Carnegie Mellon University). He came up to me after a few weeks in that class and said, ‘I really think you should come back to school for a PhD.’ I had never considered it up until that point. But getting paid to go to school? That was cool, so I signed up.

SE: Circuit simulation in analog is largely brute force, right?

Pileggi: The tools that are out there are really good. There are many SPICEs out there, and they all have their niches that can do really great things. But it’s not something you can easily scale. That’s really been a challenge. There’s brute force in the innermost loop, but you can accelerate it with hardware.

SE: What was the ‘aha’ moment for you with regard to dealing with latency of interconnect as interconnect continued to scale?

Pileggi: There was some interest in looking at RC networks that appeared on chips as sort of a special class of problem. Paul Penfield and others at MIT did this Elmore approximation of RC lines using the first moment of the impulse response. It’s from a 1948 Elmore paper about estimating the delay of amplifiers. Mark Horowitz, a student of Penfield, tried to extend that to a few moments. What we did was more of a generalized approach, using many moments and building high-order approximations that you could apply to these RC lines. So you’re really using this to calculate the dominant time constants, or the dominant poles, in the network. And for RC circuits, what’s really interesting is that the bigger the network gets, the more dominant the poles get. So you could have a million nodes out there — and it’s a million capacitors and a million poles — but for an RC line, three of them will model it really well. That makes things really efficient, provided you can capture those three efficiently. I was naive, not knowing that French mathematicians like [Henri] Padé had already attempted Padé approximations long before. I dove in like, ‘Oh, this should work.’ And I ran into a lot of the realities for why it doesn’t work. But then I was able to apply some of the circuit know-how to squeeze it into a place where it worked very effectively.
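
To make the moment idea concrete, here is a minimal Python sketch (a toy example with assumed unit R and C values, not Pileggi's AWE implementation): for a uniform RC ladder, the Elmore delay, which is just the first moment of the impulse response, lands within about 25 percent of the slowest time constant of the full network, which is why a handful of dominant poles can stand in for thousands of nodes.

import numpy as np

n, R, C = 200, 1.0, 1.0

# Nodal conductance matrix of a uniform RC ladder driven through R at the first node.
G = np.zeros((n, n))
for i in range(n):
    G[i, i] -= 1.0 / R              # resistor back toward the driver (or previous node)
    if i + 1 < n:
        G[i, i] -= 1.0 / R          # resistor toward the next node
        G[i, i + 1] += 1.0 / R
        G[i + 1, i] += 1.0 / R

# Elmore delay at the far-end node: for this ladder it is R*C*n*(n+1)/2,
# i.e. each capacitor weighted by the resistance it shares with the far end.
elmore_delay = R * C * n * (n + 1) / 2

# Slowest natural mode of C*dv/dt = G*v: the pole closest to zero dominates.
poles = np.linalg.eigvalsh(G) / C   # all real and negative for an RC network
dominant_tau = -1.0 / poles.max()

print("Elmore delay:       ", elmore_delay)
print("Dominant time const:", round(dominant_tau, 1))

The appeal of moment matching is that it recovers a few dominant poles like this directly from the moments, without ever forming or diagonalizing the full matrix, which is what lets it scale to very large interconnect networks.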

SE: A lot of that early work was around radio signals. But as you move that into the computing world, what else can you do with that? And if you now don’t have to put everything onto a single chip, does that change things?

Pileggi: Let’s take the power distribution for an IC, for example. That’s primarily dominated on the chip by RC phenomena. The resistance far dominates the jωL impedance — the inductance. But as you move to a package, that’s different. If you put different chips together, whether you stack them or you put them on an interposer, inductance starts to rear its ugly head. Inductance is extraordinarily nasty to model and simulate. The problem is that when you look at capacitances, that’s a potential matrix where you take the nearest couplings and say, ‘Okay, I have enough of this capacitance to say this is going to dominate the behavior.’ You’re essentially throwing away what you don’t need. With inductance, there’s a one-over relationship as compared to capacitance. Now, if you want the dominant inductance effect, that’s not so easy to get. If you have mutual couplings from everything to everything else, and if you say I’m going to throw away the couplings to faraway things, that’s a seemingly reasonable thing to do from an accuracy standpoint, but it affects the stability of the approximation. Essentially it can violate conservation of flux, such that you get positive poles. So you can actually create unstable systems just by throwing away small inductance terms. Usually when you see someone calculating inductance, it’s really just an estimate — or they’ve done some things to crunch it into a stable model.
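
A small numerical sketch of that stability point (illustrative coupling numbers chosen to make the effect visible, not values from any real extraction): start with a dense, positive-definite inductance matrix for three tightly coupled conductors, zero out the weakest mutual term, and the truncated matrix becomes indefinite, which is exactly what pushes poles into the right half-plane.

import numpy as np

# Three tightly coupled parallel conductors; mutual inductance falls off slowly.
L_full = np.array([[1.0, 0.9, 0.8],
                   [0.9, 1.0, 0.9],
                   [0.8, 0.9, 1.0]])

# "Sparsify" by discarding the smallest coupling (between the two outer conductors).
L_trunc = L_full.copy()
L_trunc[0, 2] = L_trunc[2, 0] = 0.0

print(np.linalg.eigvalsh(L_full))    # all positive: physically consistent, passive model
print(np.linalg.eigvalsh(L_trunc))   # one negative: positive-definiteness is lost

# Poles of a simple coupled series-RL model, dI/dt = -L^{-1} R I, with R = identity.
R = np.eye(3)
print(np.linalg.eigvals(-np.linalg.solve(L_full, R)))    # all in the left half-plane
print(np.linalg.eigvals(-np.linalg.solve(L_trunc, R)))   # a positive (unstable) pole appears

This is why practical inductance sparsification tends to work on quantities that stay diagonally dominant, such as the inverse (susceptance-like) matrix, rather than simply dropping small mutual terms from L itself.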

SE: Is that simulation based on the 80/20 rule, or 90/10?

Pileggi: Even for the packages we had before we started doing the multi-chip stuff, power distribution was RC, but when you flip it into a package with many layers of metal, it’s LC. We’ve had the same problem for the past 20 years, but it has been managed by good engineers. They apply very conservative methods to make sure the chips will work.

SE: So now, when you pile that into advanced nodes and packages and eliminate all that margin, you’ve got serious challenges, right?

Pileggi: Yes, and that’s why it was a good time for me to switch to electric power grids.

SE: Power grids for our communities have their own set of problems, though, like localization and mixing direct and alternating current, and a bunch of inverters.

Pileggi: It’s a fascinating problem. When I first stepped into it, a student of mine started telling me about how they did simulation. I said, ‘Wow, that doesn’t make any sense.’ I naively thought it was just like a big circuit, but it’s much more than that. It’s a very cool problem to work on. We’ve developed a lot of really exciting technology for that problem. With inverters, there’s a whole control loop. There isn’t the inertia that you have with big rotating machines that are fed by coal. But you have all these components on the same grid. How the grid behaves dynamically is a daunting problem.
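
To get a rough feel for the inertia point, here is a textbook swing-equation toy model in Python (the per-unit numbers are assumed for illustration and are not from Pileggi's grid work): after a sudden generation loss, a low-inertia, inverter-heavy grid crosses the same under-frequency threshold several times sooner than a conventional grid, leaving much less time for controls to react.

def time_to_threshold(inertia_h, threshold=-0.05, damping=1.0, d_power=0.1, dt=1e-3):
    """Integrate the per-unit swing equation 2H * d(df)/dt = -dP - D*df and return
    how long the frequency deviation takes to reach a protection threshold."""
    df, t = 0.0, 0.0
    while df > threshold:
        df += dt * (-d_power - damping * df) / (2.0 * inertia_h)
        t += dt
    return t

print(time_to_threshold(inertia_h=5.0))   # conventional grid with rotating machines: ~6.9 s
print(time_to_threshold(inertia_h=1.0))   # inverter-heavy, low-inertia grid: ~1.4 s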

SE: Does that also vary by weather? You’ve got wide variations in ambient temperature and all sorts of noise to contend with.

Pileggi: Yes, absolutely. In fact, how the lines behave is very much a function of temperature. That affects how resistive the transmission lines are. The frequency is very low, but the lengths are very long, so you have similar problems, and even more so with renewables. There’s sun, then a cloud, then sun. Or the wind changes direction. How do you store energy for use later? That’s where they talk about heavy batteries in the ground and things like that. Doing this with an old grid, like the one we have, is challenging. I’d much rather be starting from scratch.
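
As a back-of-the-envelope illustration of the temperature effect (using the typical temperature coefficient for aluminum and an assumed per-kilometer reference resistance), the same line is roughly a third more resistive on a heavily loaded summer afternoon than in the middle of winter:

ALPHA_AL = 0.0039   # per degree C, approximate temperature coefficient of aluminum
R_REF = 0.08        # ohms per km at 20 degrees C, an assumed illustrative line value

def line_resistance(temp_c, r_ref=R_REF, t_ref=20.0, alpha=ALPHA_AL):
    """Linear temperature model: R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref * (1.0 + alpha * (temp_c - t_ref))

print(line_resistance(-10.0))   # cold winter line: ~0.071 ohms/km
print(line_resistance(75.0))    # heavily loaded summer line: ~0.097 ohms/km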

SE: When you got started in electronics, was it largely the domain of some very big companies with very big research budgets?

Pileggi: Yes, and this is where you saw that management really makes a difference. Some of those companies, like Westinghouse Research, had these incredible R&D facilities, but they didn’t utilize them effectively, like all the gallium arsenide research going on where I was working. It seemed that every time we developed something better, the management didn’t always know what to do with it. I worked with some of the smartest people I’ve ever met, and they had worked on projects like the first camera in space, but they were living in obscurity. Nobody knew anything about their work, but it was just amazing.

SE: One other math-related question. You apparently have a reputation for being a very strong poker player. How did these two worlds collide?

Pileggi: I was in Las Vegas for a DARPA meeting and I had an afternoon off and there was a Texas Hold’em poker tournament going on. I thought it would be kind of fun, so I played four or five hours, got knocked out, and it cost me 100 bucks. I was intrigued by it, though. I went back to Pittsburgh and found our local casino had started a poker room with tournaments. I started getting better, probably because I read like 30 books on the subject. The more you play, the more you realize there are lots of layers to this. I ultimately played in the World Series in Vegas, because it’s like a bucket-list thing, and that first time I made it to day two of the main event. That’s equivalent to finishing in the top 40% of the field. When I was back in Pittsburgh, there was a ‘Poker Night in America’ event at the casino. There were about 300 people and some pros. I played in that, and won first place. That was a Saturday around Thanksgiving in 2013. We played from noon until just after midnight, and then you start again on Sunday. We played until maybe 5 a.m.

SE: That must have taken a toll.

Pileggi: Yes, because I was chairing the search for new department heads. I had a Monday morning meeting scheduled that I couldn’t miss, so I e-mailed everyone to say I would be an hour late and asked if they could push back the meeting. I went home and ate something, slept for an hour, and went to campus to do the final vote. They asked what happened. I said I was in a poker tournament. They thought I was joking. But then they saw me on TV. All the local news stations covered it like, ‘Local professor skips school.’ I got a call from someone I hadn’t talked to in 34 years. My dean said his son thought engineering was stupid, but then he found out that this engineer won this poker tournament, and now he thinks engineering is really cool.

SE: How did that affect your engineering classes?

Pileggi: I introduced myself to a group of students here two years ago when I became the department head and asked if they had any questions. One young lady raised her hand and said, ‘Yeah, can you teach us how to play poker?’ So now I do a poker training session with students once a semester.

The post Latency, Interconnects, And Poker appeared first on Semiconductor Engineering.
