FreshRSS

Zobrazení pro čtení

Jsou dostupné nové články, klikněte pro obnovení stránky.

Chip Aging Becoming Key Factor In Data Center Economics

Chip aging is becoming a much bigger concern inside of data centers, where it can impact server uptime, utilization rates, and the amount of energy needed to drive signals and cool entire server racks.

Aging in chips is the result of both higher logic utilization and increasing transistor density. This is problematic for data centers, in general, but especially for AI chips where digital logic is expected to run at maximum speed. That generates more heat, which becomes harder to dissipate as the number specialized and general-purpose processing elements per square millimeter of silicon continues to rise. Heat typically gets trapped between the fins of finFETs and gate-all-around FETs, accelerating electromigration and reducing the time it takes for dielectrics to break down. It also can cause warpage, which can rupture the bonds and contacts between different components in an advanced package or on a PCB.

For data centers, that creates a number of challenges:

  • Thermal management: This requires a deep understanding of workloads and the resulting transient thermal gradients as processing is load-balanced on-chip, between chips or chiplets, and between servers;
  • More data: Data from sensors everywhere, along with larger training sets, all need to be processed faster than in the past to keep up with the flood of data, but all of that needs to happen in the same or smaller footprint without overheating any part of a device, and
  • In-circuit monitoring: Sensors can be added into chips to detect variations in heat and data speeds in different paths, but it’s much more difficult to keep track of tens of thousands of these monitors as they collect data from heterogeneous processing elements, each of which can age at different rates depending on process variation, defectivity, varying workloads, and ambient thermal conditions.

“Servers are much more capable today than they were 10 years ago, and the issue is that power hasn’t scaled like it used to,” said Steven Woo, Rambus fellow and distinguished inventor. “Now, if you want to do lots more work in your server, you have to burn more power to do it. Twenty years ago, a server might dissipate a couple hundred watts. But with the latest servers that NVIDIA just announced around Grace Blackwell, the whole rack is 120 kilowatts, and the individual servers are many kilowatts. Just delivering power into those racks is causing changes in the infrastructure in the industry. Now that you have to bring in and dissipate more power in a small space, you get all kinds of interesting things that could happen over time. The heat that’s being dissipated can have effects on the chip, and you have to worry sometimes about thermal cycling where, as the chip is doing a lot of work, maybe part of the chip stops and then it does more work. You get these rapid cycles of dissipating a lot of power, then not, then dissipating a lot of power, then not. That cycling causes local heating and cooling, leading to thermal stresses, and this impacts all chips, including memory.”

As a result, everyone from the data center manager to the chip architect now has to understand how a chip behaves in the field, and how increasingly customized chip and system architectures will function over time. Downtime is costly for a data center, but under-utilization and reduced performance also carries a high price tag. That, in turn, affects how much margin is considered essential, such as extra data paths if some of them are fully or partially closed off by electromigration, and how that margin will impact performance, power, and area/cost over a chip’s projected lifetime — especially in a heterogeneous design with specialized compute elements.

“When it comes to the hyper-scalers and high powered, highly customized, heterogeneous chips for various different workloads, these chips are on 24/7, so consistent uptime is critical,” said Dan Lee, product management director at Cadence. “Since all of these chips are done at the really advanced nodes, with the smaller device sizes, more developers are looking to do aging analysis, and derive the wear and tear so they can see if the chip is going to last a year or five years. At the same time, an important consideration is also thermal — especially when we’re talking about these heterogeneous integrations, and you don’t really get the thermal conductivity that you would in a straightforward, monolithic design. There’s a bit more thought or planning that needs to be a part of this because aging and heating are related. All things being equal, if you’re operating in a very hot environment, you’re going to expect a lower lifespan.”

Still, determining how much shorter that lifespan will be isn’t always a precise calculation. “Data center SoCs that execute mission-critical workloads need to provide scalable visibility, predict problems before they occur, provide deep-dive analyses into problems, and be optimized to increase longevity of investment,” said Padmakumar Karthik, senior technology manager at Arm. “Data center diagnostic patterns are often deployed to measure the health of an SoC post-manufacturing to prevent silent data corruption (SDC) issues. But on-chip sensors provide an additional layer of insights, detecting droops or aging or thermal events on-chip, all of which can cause SDC incidents. For this reason, scalable, customizable sensor frameworks that can monitor and adapt throughout the useful life of the device, enabling continuous design optimization and preventive maintenance, will be increasingly important.”

There are multiple ways to achieve this, but each data center can be very different. In some cases, chips are designed by systems companies for internal use. And in most cases, there is a mix of different hardware and software, not all of which is state-of-the-art. “Many data centers have legacy infrastructure that may not be inherently designed for optimal power efficiency,” noted Noam Brousard, vice president of systems at proteanTecs, in a recent blog. “Upgrading or retrofitting such infrastructure poses challenges in achieving comprehensive power optimization.”

Even within a single rack, stresses can vary greatly from one server to the next, and from one chip to the next even in the same server. “You can imagine when you have a very big chip, toward the edges of the chip it will expand more than in a small chip, and that can add stress,” said Rambus’ Woo. “You have to really be careful about how you cool things, and memory is no different. You have very specific things you worry about with memory, like the ability to retain data, depending on how hot the chip is.”

In addition, as chips age, parameters drift. Marc Swinnen, director of product marketing in Ansys’ semiconductor division, said the traditional approach has been to use a library that’s characterized as a brand new chip. “The library is characterized at 1 year, 5 years, 10 years, 15 years, and you can run all your analysis multiple times with these different aged libraries. That sounds good on paper, and that’s what a lot of people do, but the problem is that not all parts of the chip age at the same rate. This is why aging is often associated with activity and temperature. Some parts of the chip are more active and hotter than other parts of the chip, so the aging time runs differently for different parts. This means you want to apply some of the old library to some parts of the chip, and the younger library to other parts of the chip, because if signals run between them you have setup and hold issues. If everything slows down at the same time — or one slows down and the other one doesn’t — you’re going to get mismatches, and that’s the difficulty. At the bottom level, it’s easy. Every gate is assigned its right age. That’s simple. You do an analysis with every gate. But how do you assign the age to every gate? Where do you get that information from? You need a lot of realistic activity, and then predict that over the lifespan and with temperature. That’s the problem. How do you actually construct this aging map? Once you have it, the analysis is not that hard.”

Aging maps are application- and workload-specific. Every chip will age differently depending on the functions it performs.

But aging is just one of many factors that affect data center uptime. “When we look at data center, we look at the whole application first, then whittle it down to what that means for chips and packages,” said Kelly Morgan, senior principal application engineer at Ansys. “From the mechanical reliability lens of the data center operation, we go through thermal cycling, obviously. We’re in a controlled environment. But what does that influence? How does that influence the integrity of the chips as you go through thermal cycles? Typically, we’ll look at things like solder fatigue and other effects.”

Another factor to consider is shipping and handling, which can affect the aging of a chip, package, and board.

“Even before the device is put in place, there are opportunities for vibration,” Morgan said. “You might hit something, which is a bit of a shock. We have customers who are looking at things like drop, shock, and vibration, and they have goals they need to test to. Typically, the standard process is to do a lot of physical testing. Now as you can imagine, that can be pretty challenging. You have to be pretty far along in the design process before you really start to go and test, and if there’s an issue, then you’ve got to go back and retest. Early simulation helps here, especially for those larger-scale events, and that comes down to the chassis, the board, to all the components, including the ICs.”


Fig. 1: Components of complete electronic system analysis. Source: Ansys

Quality control remains a big challenge when it comes to mechanical stresses that can affect aging. Adam Cron, distinguished architect at Synopsys, pointed to a recent Intel white paper, which noted that at the current acceptable defectivity rates, one core fails every two days. To account for this, Cron noted that certain commercial tools support in-system delay testing in a BiST mode. By adding specific IP, any ATPG patterns could be added to that. (Intel’s paper said its solution only applies to stuck-at testing.)

“In very large, millions-of-cores data center-type environments, the implication is that you’d better be ready,” Cron said. “One of the things they were talking about in this paper was in-system scan. Intel was bringing a database of test patterns in, and then applying it in-system after isolating a core. And then, upon a failure, they’d quarantine and move on. But the data centers are apparently running out of that opportunistic time slot to do any of this. We’ve heard some interesting conversations about the fact that people do run a lot of things during certain times. However, other times are cheaper, so all the holes are just getting filled in terms of runtime. Monitors are certainly something to look at, but monitors are looking at systemic degradation. That’s known, if you will. And so as things degrade, Vmin will change, maybe frequency will change. And they’ll be on a pace. They can figure out when to do that. That’s easy enough to figure out. However, if there’s a marginality or some broken component in there, it is not up to the tool to find that. And frankly, the in-system scan wasn’t addressing all components on the die. It was only up to like 80% of stuck-at coverage, which isn’t that much, especially when you’re not looking at all of the pieces inside the die. The point is, there are still opportunities to do better.”

Cron noted that one big systems company suggested a dual-core lockstep mechanism, starting out the data center in dual-core lock-step mode for X number of months. “When it looks like you’ve squeezed the major part of the curve out, in terms of finding these defective components, then unlock them, double your capacity, run like that for a while, and periodically hook some back up again. That means everything is utilized, at least. Of course, some are working at half capacity here and there, but it’s not the whole die. And there are some implications there from a design standpoint, at least for the hardware, but also possibly the operating system, depending on who decides what physical core is used versus what virtual core is used.”

Approaches to measuring aging
Any discussion around aging circuits really boils down to extending the life of the machines in the data center, and not getting caught by surprise when failures occur.

“How do you do that? You have to measure the aging of those machines,” said Neil Hand, director of marketing, IC segment at Siemens EDA. “Right now, if you speak to the CIOs of these big companies with big data centers, they say, ‘We’ve got to get rid of the machines after three years because we can’t risk it going down.’ If you look at embedded analytics capabilities, you can start to embed aging monitors in those devices, you can start to monitor those in real time. It doesn’t look that different than what it does from an automotive perspective. It’s all the same technologies, effectively, but you’re monitoring them. And then you can say, ‘We’re now at 90% of our life for this server.’ We can then just replace that server.”

This feeds into corporate goals around sustainability, as well. “It comes down to building the best thing to begin with, then building it with design for manufacturing in mind so that you don’t get waste during manufacturing, achieve better yields, and finally extend the life of products and build them in environmentally-sustainable ways,” Hand said. “If you can extend the data center lifecycle from three years to five years, that’s big. And especially if you start going to these high-performance, application-specific type of clusters, you may not need to change them as often, because if the underlying capabilities aren’t changing, that might drive the cycling of it. In the case of a biological computer, if there’s no new change to the underlying protein folding mechanisms, you might say, ‘We don’t need a new compute platform. This is really good.”

The longer the product life can be extended, the better. Design for aging is a matter of, first, performing the aging analysis with the foundry models. “Run the simulations and observe the effects,” said Cadence’s Lee. “When you’re doing the simulation, you want to have the right mission profiles, so you come up with an accurate prediction of how your device is going to behave after a certain number of years in deployment. You may want to combine that with thermal analysis, for example, because how that aging is going to behave will depend on what temperature this design is going to be working at. You may think it’s 22 degrees Celsius, but maybe through some thermal analysis you realize it’s actually going to be operating at 35 or 40 degrees most of the time. That may change the outcome of your aging analysis.”

In terms of the associated thermal analysis, this can extend beyond a single device. “It’s also how that heat is moving,” Lee said. “Let’s say you have this integrated design, where you have some power devices alongside some logic, or some other functionality that is lower power. What you may want to understand is, if those bandgaps or power circuits are generating a lot of heat, that may be shifting over into other parts of your design. So when you run your aging analysis, you may assume that you’re running at 25 degrees, whereas the power devices are at 40 or 45 degrees. They’re on the same chip, they’re very close to each other, and you have to understand how much of that heat is moving over to your logic and what that’s going to bring the temperature up to. You want to know that so you can perform the aging analysis based on that higher temperature.”

Another consideration is combining aging analysis and interconnect parasitics, which is especially relevant for advanced nodes due to the parasitics in the interconnect. “They’re dominant when it comes to performance and functionality,” Lee added. “So when thinking about aging, you also have to think about it being an aged device that has to push the electrons through this interconnect. That’s a pretty heavy load. When you’re doing the aging analysis, you probably will have to be doing it with extracted parasitics. You just can’t do it on a pure schematic design. It doesn’t give you enough detail about what’s really happening physically. This may be included in the aging analysis tool. When most people talk about aging, they may not think about the parasitic aspect to it.”

Combating aging, thermal in memory
While standards don’t work in custom silicon, they do work for some standard components in those devices, such as memory. Over the past 10 to 15 years, memory standards have started to address the impact of heat.

“If you start to exceed certain temperature limits, you’ve got to refresh the device more frequently because the charge can leak off the cells more quickly,” said Rambus’ Woo. “So there are temperature-dependent refresh rates. There are other things that can be exacerbated, like the capacitors are getting smaller, they’re holding fewer electrons because there are so many more of them on a chip now, so we’ve seen memories adopt on-die error correction. This on-die error correction is something that is hidden from the outside world. In many cases, you don’t even know an error has occurred and been corrected on the chip. Those kinds of technologies become even more important now because the temperatures can be higher.”

There also is growing demand for more telemetry to provide monitoring information. “You just want to know if anything is overheating,” said Woo. “Does something seem like it’s malfunctioning? The data center manager will get regular updates about the status of the major components of the system. A lot of boards now in servers have baseboard management controllers (BMCs), which are little chips that sit on each board and are responsible for, among other things, reporting back the health of that board when a server might have five or six boards. We’re frequently seeing more of these BMC chips.”

Design for aging
While the goal is to be able to guarantee a certain lifetime for the chips in a data center, the challenges for achieving that are expanding. “There’s a growing list of things that can be harmful to devices over their lifetime,” Woo said. “It’s a balance between not adding too much cost, even though you have to increase the reliability and maybe add new features, and all of these things are in play with each other.”

Whether it is liquid cooling or higher levels of RAS ECC in the system, there is no single best answer for every application. In general, the industry is moving toward higher reliability and increasing resilience, but there are many ways to get there and challenges with each of them.

“Just as 15 years ago we didn’t necessarily always think we had to talk about power, now we have to talk about it all the time,” Woo said. “The same thing is going to be true for resilience and reliability. It’s going to be required to become part of the way people think about architectures, and part of that is how the memory system improves its reliability. You can’t really do anything unless you can compute on some data, and you have to make sure that data is reliable. It will touch how memory is stored in a DRAM. It will touch how memory is communicated across links. And it even will touch how processors manipulate data once they get a hold of it in their caches, and in the compute pipelines. Also, one of the key things people will worry about is how much of that susceptibility is brought about by age-related issues, like heating cycles, etc.”

Finally, there are even issues around the quality of the power that comes into a system. “The servers get noise on the power rails, and it’s a balance between how much money you’re willing to pay for the power delivery versus the quality of power,” said Woo. “You have to be tolerant of those kinds of things, too. Power management becomes more challenging, as well as the amount of power that these systems are using today. NVIDIA systems bring 48-volt power into the racks, and there is talk about even higher voltage levels. Those changes in infrastructure can all impact heat, and can age components differently.”

The post Chip Aging Becoming Key Factor In Data Center Economics appeared first on Semiconductor Engineering.

Software-Defined Vehicle Momentum Grows

Experts at the Table: The automotive ecosystem is undergoing a transformation toward software-defined vehicles, spurring new architectures with more software. Semiconductor Engineering sat down to discuss the impact of these changes with Suraj Gajendra, vice president of products and solutions in Arm‘s automotive line of business; Chuck Alpert, R&D automotive fellow at Cadence; Steve Spadoni, zone controller and power distribution application manager at Infineon; Rebeca Delgado, chief technology officer and principal AI engineer at Intel Automotive; Cyril Clocher, senior director in the automotive product line for high-performance computing at Renesas; David Fritz, vice president, hybrid and virtual systems at Siemens EDA; and Marc Serughetti, senior director, systems design group at Synopsys. What follows are excerpts of that discussion.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

SE: The automotive ecosystem is undergoing a technology evolution the likes of which has not been seen, including the move to software-defined vehicles. To set a baseline for this discussion, what is your definition of an SDV?

Gajendra: A software-defined vehicle is a concept, a trend, an idea, where the whole ecosystem can drive new capabilities and new user experiences into the car, even after it rolls out of the showroom or dealership. It’s a pretty loaded concept. There’s a lot of infrastructure that needs to come together, such as software development in the cloud, seamless deployment of that software development onto the car, the whole deployment of over-the-air updates, and the connectivity. In short, the concept of a software-defined vehicle is expecting a world where we can drive new experiences, new capabilities, and new features into the car throughout its lifetime.

Alpert: In thinking about what SDV means, one example is the battery — especially in an EV. I’m not talking about the technology of the battery that’s evolved, but rather the idea that in the past when you wanted to charge your car in your garage and you were worried about starting a fire, you’d think, ‘No, don’t do that because your whole house could burn down.’ The idea is that in the past, maybe we might put a temperature sensor on the battery, but now we actually have software that can monitor it. It might even have AI to predict if the battery is reaching some state that might cause a fire in the future. You also might have something that connects to the power grid and learns when is a good time to charge, because it’s a low-usage period so it’s cheaper. This is just one part of the car, but you can imagine a whole bunch of software that you want to put on top of it in order to connect to the universe. You need a software-defined vehicle platform in order for this, or in all the other parts of your car, to communicate with the world and provide the best user experience.

Spadoni: Infineon’s definition of a software-defined vehicle is a redefining of architecture — specifically, electrical and electronic architecture, feature allocation, and the entire topology of the vehicle, from power generation and storage to power distribution and high compute. It really means new electrical architectures, and it has consequences for the business model of every OEM and Tier 1 involved. It’s a major change to previous methodologies in the last 30 years.

Delgado: Software-defined vehicle is not just over-the-air updates. It’s truly a new methodology and a new philosophy for how to architect every ingredient of the vehicle to continue to deliver value over time, in which the value is very tightly attached to the software that delivers the user experience. Ultimately, this architecture must enable the different practices on how to deliver this new value over time. What’s very interesting is that these practices of moving to software-defined architecture has been done by many other industries already. Intel has a ton of heritage, and actually helped those industries transform. That transformation is truly what we’re observing here. It’s an incredible opportunity, and possibly a crisis if not done right.

Clocher: To apply an analogy here, the car is the new smartphone. But for us, it’s more than that. I’ve heard about the platform, yes, and it’s the major architecture evolution that we’ll see in the next decade. For us at Renesas, it will be a journey that will take time to enhance the user experience, to generate new revenue streams for the industry as it moves from decentralized to centralized classic compute with zonal architecture. We can apply all those buzzwords to a software-defined vehicle. Those platform will need big computers and heavy complex hardware solutions and this will generate evolutions, upgrades to the car during its entire lifetime, but underneath we know — at least at Renesas, and certainly at some other players and silicon vendors — that this will need a huge amount of hardware resources to manage what we have in mind to deploy this platform.

Fritz: I see software-defined vehicles a bit differently than what’s been mentioned so far. For many years, you’d have the hardware team doing their design, and the software team doing their design, and it all needs to come together. There’s an English natural language discussion about what needs to happen, and as we all know, that never really goes terribly well. In automotive that becomes an integration storm, and it is a nightmare. With the new compute requirements that have been mentioned already, that just compounds the issue. So the way I see this is that we tend, as people who have an engineering background, to dive into how we’re going to do things. We hear ‘software-defined vehicle,’ we immediately think about how to do that. There’s not a lot of thought about why it needs to be done, and what needs to happen. We jump into the ‘how’ too early, and a lot of the discussion here is exemplary of that kind of approach. When I’m looking at software-defined vehicles, I’m looking at why it’s important that the software needs to run effectively on a piece of hardware. And for that hardware, why is it important for it to actually operate properly on the software? Then you can decide how to put together a new methodology that’s going to bring those things together. In the past, it’s been called hardware/software co-design. There have been attempts many times, and as has been mentioned, other industries have made this transition. What’s unique about automotive is that it’s not just one transition that needs to happen. It’s hundreds or thousands of transitions. The ecosystem needs to be turned upside down, which we’re seeing happen right now, and you need to bring all that together. It really is a methodology where you need the tooling, you need the processes, you need the thinking, you need the organizations to change so that they can make this transition in a realistic way. SDV is a huge transition. It is a way for the automotive industry to morph into something that has longevity and can meet customer expectations, which it really hasn’t met for some time now.

Serughetti: At the end of the day, if we look starting at the top from our perspective, SDV is a means to bring and enhance the car experience for the customer. That’s the end result that the OEMs look at, but they look at it from the perspective of how that improves the OEM efficiencies, and how that creates new business opportunities. The way we look at it, and what’s important, is the impact it has on the industry, the impact on the processes, on the methodologies, on the people, on the ecosystem, on the technology. It’s really a transformation of the automotive market that is going to fundamentally change how the industry moves forward and bring the OEM into a world in which they are really looking at how they become efficient in delivering cars, how they bring new features, but at the same time, how they evolve their business as well.

SE: As you’ve all described, SDV requires many inter-dependencies, and the entire ecosystem has to have an understanding of the ‘why,’ which should then lead back to laying out the plan for how to get there. Where does the ecosystem stand today in terms of realizing SDV?

Fritz: OEMs have decided in the last few years that they’ve got to take control of their own destiny. They cannot simply take what the suppliers provide. They need a methodology — like this whole SDV concept, and any tooling necessary to provide that — to push down into their suppliers, such that, ‘Here’s what I need. If you can’t do this for me, I will go find someone that will.’ This is not the old ecosystem that bubbled up from the IP to the Tier 2s, to the Tier 1s, and then to the OEMs, which gave them limited choices to go from. So when I say, “Turn the ecosystem upside down,” that’s what is happening. But every OEM has their own ecosystem, and they’re not all in the same place. Even region-to-region, they can be very different.

Delgado: This is a critical discussion, and effectively where the industry has to eventually settle. The magnitude of the transformation of the ecosystem includes roles in the technology evolution. The silicon content is expected to quadruple over the next few years in the vehicle for defining the in-cabin experience of the end user. At the end of the day, the complexity of the transition of roles is of such magnitude that the proprietary, fragmented, and broken approaches that David articulated are really not going to enable the industry to transform at the speed it requires to deliver and meet the experiences. But more than anything, they are not going to address the actual technology changes necessary to implement and allow for this value delivery mechanism. At the end of the day, this is where Intel really believes collaboration is key, and anybody who wants to participate in this ecosystem must provide scalability — also known as top-to-bottom support of the different product lines that our OEMs and Tier 1s are having to support, versus a broken-up approach on these ever-evolving higher performance and higher performance compute needs. It has to be future-proof, because you’re going to launch the vehicle eventually. So certain hardware has to be future-proofed to a certain affordability envelope, and there has to be a strategy around that. And then the ecosystem and that collaboration must be able to deliver that aggregation. It has to be done with certain anchoring technology that will allow us to deliver that performance. Collaboration is key in the sense that these technologies cannot be single-handedly owned, developed, let alone owned, defined, developed, and integrated by OEMs in silos with a proprietary end-to-end architecture definition. There obviously will be differentiations on the actual implementation, but the technologies at large have to have a sense of reuse, particularly from other verticals that have already done software-defined transformations and then tuned in the right ways toward the automotive requirements.

Spadoni: There are probably a wide variety of implementations. At Infineon, we partner with OEMs and Tier 1s and we see different approaches. For example, General Motors has more of a modular approach that emulates what happened in in the mobile phone space. It seems that Ford has a more pragmatic approach, along with Stellantis, but all of them are facing very similar challenges in that affordability has become a big problem. There are multiple generations of implementations that are going to occur, and you’ll see a striving toward how to pay for this extra hardware. It leads to tradeoffs in implementations of other systems that have to have savings in order for them to afford these vehicles. No one ever goes into a dealership and says, ‘Give me a software-defined vehicle.’ Everyone’s looking for value, and you can see it now with volumes going down. There’s a saturation of people buying at the high level. The OEMs want to get more sales, which means they’ll have to go to the lower-cost-value vehicles, and that’s going to affect the electrical and electronic architectures and the software-defined vehicle.

Clocher: What we’re seeing I would summarize as the impact on the ecosystem. We’re moving to an OEM-centric ecosystem. One size does not fit all, meaning OEMs will have their different tastes, their different definitions of levels of integration they want to have in their software-defined vehicle — especially given more complex tasks that we all have to do, rather than the challenge we have to solve, because we’re not talking about a common umbrella of software-defined vehicle. But it really does mean different implementations and different meanings for OEM A from OEM B. I would fully agree with David and Steve that we are far from having a common understanding of, at least, the market itself. And that’s fine, because this will bring differentiation, and ultimately that’s why a customer will go to Dealership A versus Dealership B. This is what the industry wants to see — continue to differentiate, continue to add value to the ultimate product, which is the car.

Serughetti: The important point in all this is, of course, you’re breaking the model that exists today. That’s one of the big challenges. We used to have Tier 1s that were building boxes, and delivering software. This was a complete black box. When it would go to integration, there were all sorts of problems. And now you’re going to break this? The challenge for the OEM is how they do this. They want to control software, but are they equipped to do this today? We see the problems today that some of the legacy OEMs have in setting up their software organizations, the challenges of CARIAD and all such organizations that are trying to do this. It’s not easy to change those companies. Of course, the new entrants don’t have this problem because they are coming from a brand new design versus the ones that deal with legacy. So for the OEM, it’s about how to take control of the software. What does that mean in terms of the processes, in terms of agile development, digital twins, and all of these technologies everybody’s talking about? The other side is, ‘It’s all nice, this software,’ but this software runs on all the companies that are delivering hardware, and that becomes essential to it. You can have the best software, but if your hardware is not there to support performance, power, and all of those aspects, you’re not going to be successful. So the ecosystem is evolving how hardware, software, and all of this comes together. The OEM wants to be the central point. That’s what we’re talking about in terms of the process methodology aspects that are making this transition evolve.

Gajendra: Where are we in this journey? How far have we come? And where are we going? Going back to the point that David mentioned earlier about supply chain evolving and the supply chain turned upside down, five years ago, if we sat here in this sort of a panel and discussed software-defined vehicles, the conversation would have been entirely different. It would have been stuck with the traditional supply chain that we’ve seen for the last 35 or 40 years in the automotive industry. There are fundamentally two aspects here. The supply chain is evolving, and the infrastructure that we, as a community — this team, for example, and many others in the community — are trying to enable is going to be key to making our EDA partners happy. The use of virtual platforms today in the cloud to try and shift left and develop and validate some of these technologies and software wasn’t even there five years ago, so we’ve come a long way. We’ve made a lot of progress together as an industry. Yes, we have a long way to go until we actually have a truly software-defined vehicle. We can go and ask for a software-defined vehicle in the dealership. But the changes we are seeing in terms of all sorts of technology providers trying to make sure that the technology that we eventually will have in the hardware is provided in some sort of virtual form, be it fast models or whatever it is in the cloud, for the vast majority of software ecosystem in automotive this is a big change. I was at Embedded World, and the amount of virtual platforms and the demos that people were actually showing — silicon partners like we have here, Intel, Renesas, Infineon, EDA companies — pointed to a strong movement of, ‘Let’s build the infrastructure that we can build, and then provide that infrastructure to the OEMs to take it from there.’ There is a lot of work going on. Together we will make the infrastructure across the board, be it virtual platform or others, richer and more capable.

Alpert: For sure, OEMs have to control their own destiny. In the past, they would do it by differentiating maybe because they had better engine performance, or some other feature. But going forward, the differentiation is going to be their software. Whoever can make software that will provide additional value, and brand it, that’s going to be the differentiator and that’s the trend. In terms of how you get there, a shared ecosystem is important. SOAFEE is a potential way that, together with virtual platforms, you can provide a shared ecosystem for development, but still allow everyone to differentiate and plug-and-play. That’s one reason we’re working closely with Arm on trying to have a reference design specifically for this purpose. But again, we’re not saying, ‘This is the design you use. This is how you do it.’ That’s not it. The point is, let’s start somewhere, and then people can start swapping out pieces and doing different things. As long as OEMs can plug-and-play, then they can still differentiate. But they don’t have to invent everything themselves, which would be too costly.

Related Reading
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post Software-Defined Vehicle Momentum Grows appeared first on Semiconductor Engineering.

Software-Defined Vehicle Momentum Grows

Experts at the Table: The automotive ecosystem is undergoing a transformation toward software-defined vehicles, spurring new architectures with more software. Semiconductor Engineering sat down to discuss the impact of these changes with Suraj Gajendra, vice president of products and solutions in Arm‘s automotive line of business; Chuck Alpert, R&D automotive fellow at Cadence; Steve Spadoni, zone controller and power distribution application manager at Infineon; Rebeca Delgado, chief technology officer and principal AI engineer at Intel Automotive; Cyril Clocher, senior director in the automotive product line for high-performance computing at Renesas; David Fritz, vice president, hybrid and virtual systems at Siemens EDA; and Marc Serughetti, senior director, systems design group at Synopsys. What follows are excerpts of that discussion.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

L-R: Arm’s Gajendra, Cadence’s Alpert, Infineon’s Spadoni, Intel’s Delgado, Renesas’ Clocher, Siemens’ Fritz, Synopsys’ Serughetti.

SE: The automotive ecosystem is undergoing a technology evolution the likes of which has not been seen, including the move to software-defined vehicles. To set a baseline for this discussion, what is your definition of an SDV?

Gajendra: A software-defined vehicle is a concept, a trend, an idea, where the whole ecosystem can drive new capabilities and new user experiences into the car, even after it rolls out of the showroom or dealership. It’s a pretty loaded concept. There’s a lot of infrastructure that needs to come together, such as software development in the cloud, seamless deployment of that software development onto the car, the whole deployment of over-the-air updates, and the connectivity. In short, the concept of a software-defined vehicle is expecting a world where we can drive new experiences, new capabilities, and new features into the car throughout its lifetime.

Alpert: In thinking about what SDV means, one example is the battery — especially in an EV. I’m not talking about the technology of the battery that’s evolved, but rather the idea that in the past when you wanted to charge your car in your garage and you were worried about starting a fire, you’d think, ‘No, don’t do that because your whole house could burn down.’ The idea is that in the past, maybe we might put a temperature sensor on the battery, but now we actually have software that can monitor it. It might even have AI to predict if the battery is reaching some state that might cause a fire in the future. You also might have something that connects to the power grid and learns when is a good time to charge, because it’s a low-usage period so it’s cheaper. This is just one part of the car, but you can imagine a whole bunch of software that you want to put on top of it in order to connect to the universe. You need a software-defined vehicle platform in order for this, or in all the other parts of your car, to communicate with the world and provide the best user experience.

Spadoni: Infineon’s definition of a software-defined vehicle is a redefining of architecture — specifically, electrical and electronic architecture, feature allocation, and the entire topology of the vehicle, from power generation and storage to power distribution and high compute. It really means new electrical architectures, and it has consequences for the business model of every OEM and Tier 1 involved. It’s a major change to previous methodologies in the last 30 years.

Delgado: Software-defined vehicle is not just over-the-air updates. It’s truly a new methodology and a new philosophy for how to architect every ingredient of the vehicle to continue to deliver value over time, in which the value is very tightly attached to the software that delivers the user experience. Ultimately, this architecture must enable the different practices on how to deliver this new value over time. What’s very interesting is that these practices of moving to software-defined architecture has been done by many other industries already. Intel has a ton of heritage, and actually helped those industries transform. That transformation is truly what we’re observing here. It’s an incredible opportunity, and possibly a crisis if not done right.

Clocher: To apply an analogy here, the car is the new smartphone. But for us, it’s more than that. I’ve heard about the platform, yes, and it’s the major architecture evolution that we’ll see in the next decade. For us at Renesas, it will be a journey that will take time to enhance the user experience, to generate new revenue streams for the industry as it moves from decentralized to centralized classic compute with zonal architecture. We can apply all those buzzwords to a software-defined vehicle. Those platform will need big computers and heavy complex hardware solutions and this will generate evolutions, upgrades to the car during its entire lifetime, but underneath we know — at least at Renesas, and certainly at some other players and silicon vendors — that this will need a huge amount of hardware resources to manage what we have in mind to deploy this platform.

Fritz: I see software-defined vehicles a bit differently than what’s been mentioned so far. For many years, you’d have the hardware team doing their design, and the software team doing their design, and it all needs to come together. There’s an English natural language discussion about what needs to happen, and as we all know, that never really goes terribly well. In automotive that becomes an integration storm, and it is a nightmare. With the new compute requirements that have been mentioned already, that just compounds the issue. So the way I see this is that we tend, as people who have an engineering background, to dive into how we’re going to do things. We hear ‘software-defined vehicle,’ we immediately think about how to do that. There’s not a lot of thought about why it needs to be done, and what needs to happen. We jump into the ‘how’ too early, and a lot of the discussion here is exemplary of that kind of approach. When I’m looking at software-defined vehicles, I’m looking at why it’s important that the software needs to run effectively on a piece of hardware. And for that hardware, why is it important for it to actually operate properly on the software? Then you can decide how to put together a new methodology that’s going to bring those things together. In the past, it’s been called hardware/software co-design. There have been attempts many times, and as has been mentioned, other industries have made this transition. What’s unique about automotive is that it’s not just one transition that needs to happen. It’s hundreds or thousands of transitions. The ecosystem needs to be turned upside down, which we’re seeing happen right now, and you need to bring all that together. It really is a methodology where you need the tooling, you need the processes, you need the thinking, you need the organizations to change so that they can make this transition in a realistic way. SDV is a huge transition. It is a way for the automotive industry to morph into something that has longevity and can meet customer expectations, which it really hasn’t met for some time now.

Serughetti: At the end of the day, if we look starting at the top from our perspective, SDV is a means to bring and enhance the car experience for the customer. That’s the end result that the OEMs look at, but they look at it from the perspective of how that improves the OEM efficiencies, and how that creates new business opportunities. The way we look at it, and what’s important, is the impact it has on the industry, the impact on the processes, on the methodologies, on the people, on the ecosystem, on the technology. It’s really a transformation of the automotive market that is going to fundamentally change how the industry moves forward and bring the OEM into a world in which they are really looking at how they become efficient in delivering cars, how they bring new features, but at the same time, how they evolve their business as well.

SE: As you’ve all described, SDV requires many inter-dependencies, and the entire ecosystem has to have an understanding of the ‘why,’ which should then lead back to laying out the plan for how to get there. Where does the ecosystem stand today in terms of realizing SDV?

Fritz: OEMs have decided in the last few years that they’ve got to take control of their own destiny. They cannot simply take what the suppliers provide. They need a methodology — like this whole SDV concept, and any tooling necessary to provide that — to push down into their suppliers, such that, ‘Here’s what I need. If you can’t do this for me, I will go find someone that will.’ This is not the old ecosystem that bubbled up from the IP to the Tier 2s, to the Tier 1s, and then to the OEMs, which gave them limited choices to go from. So when I say, “Turn the ecosystem upside down,” that’s what is happening. But every OEM has their own ecosystem, and they’re not all in the same place. Even region-to-region, they can be very different.

Delgado: This is a critical discussion, and effectively where the industry has to eventually settle. The magnitude of the transformation of the ecosystem includes roles in the technology evolution. The silicon content is expected to quadruple over the next few years in the vehicle for defining the in-cabin experience of the end user. At the end of the day, the complexity of the transition of roles is of such magnitude that the proprietary, fragmented, and broken approaches that David articulated are really not going to enable the industry to transform at the speed it requires to deliver and meet the experiences. But more than anything, they are not going to address the actual technology changes necessary to implement and allow for this value delivery mechanism. At the end of the day, this is where Intel really believes collaboration is key, and anybody who wants to participate in this ecosystem must provide scalability — also known as top-to-bottom support of the different product lines that our OEMs and Tier 1s are having to support, versus a broken-up approach on these ever-evolving higher performance and higher performance compute needs. It has to be future-proof, because you’re going to launch the vehicle eventually. So certain hardware has to be future-proofed to a certain affordability envelope, and there has to be a strategy around that. And then the ecosystem and that collaboration must be able to deliver that aggregation. It has to be done with certain anchoring technology that will allow us to deliver that performance. Collaboration is key in the sense that these technologies cannot be single-handedly owned, developed, let alone owned, defined, developed, and integrated by OEMs in silos with a proprietary end-to-end architecture definition. There obviously will be differentiations on the actual implementation, but the technologies at large have to have a sense of reuse, particularly from other verticals that have already done software-defined transformations and then tuned in the right ways toward the automotive requirements.

Spadoni: There are probably a wide variety of implementations. At Infineon, we partner with OEMs and Tier 1s and we see different approaches. For example, General Motors has more of a modular approach that emulates what happened in in the mobile phone space. It seems that Ford has a more pragmatic approach, along with Stellantis, but all of them are facing very similar challenges in that affordability has become a big problem. There are multiple generations of implementations that are going to occur, and you’ll see a striving toward how to pay for this extra hardware. It leads to tradeoffs in implementations of other systems that have to have savings in order for them to afford these vehicles. No one ever goes into a dealership and says, ‘Give me a software-defined vehicle.’ Everyone’s looking for value, and you can see it now with volumes going down. There’s a saturation of people buying at the high level. The OEMs want to get more sales, which means they’ll have to go to the lower-cost-value vehicles, and that’s going to affect the electrical and electronic architectures and the software-defined vehicle.

Clocher: What we’re seeing I would summarize as the impact on the ecosystem. We’re moving to an OEM-centric ecosystem. One size does not fit all, meaning OEMs will have their different tastes, their different definitions of levels of integration they want to have in their software-defined vehicle — especially given more complex tasks that we all have to do, rather than the challenge we have to solve, because we’re not talking about a common umbrella of software-defined vehicle. But it really does mean different implementations and different meanings for OEM A from OEM B. I would fully agree with David and Steve that we are far from having a common understanding of, at least, the market itself. And that’s fine, because this will bring differentiation, and ultimately that’s why a customer will go to Dealership A versus Dealership B. This is what the industry wants to see — continue to differentiate, continue to add value to the ultimate product, which is the car.

Serughetti: The important point in all this is, of course, you’re breaking the model that exists today. That’s one of the big challenges. We used to have Tier 1s that were building boxes, and delivering software. This was a complete black box. When it would go to integration, there were all sorts of problems. And now you’re going to break this? The challenge for the OEM is how they do this. They want to control software, but are they equipped to do this today? We see the problems today that some of the legacy OEMs have in setting up their software organizations, the challenges of CARIAD and all such organizations that are trying to do this. It’s not easy to change those companies. Of course, the new entrants don’t have this problem because they are coming from a brand new design versus the ones that deal with legacy. So for the OEM, it’s about how to take control of the software. What does that mean in terms of the processes, in terms of agile development, digital twins, and all of these technologies everybody’s talking about? The other side is, ‘It’s all nice, this software,’ but this software runs on all the companies that are delivering hardware, and that becomes essential to it. You can have the best software, but if your hardware is not there to support performance, power, and all of those aspects, you’re not going to be successful. So the ecosystem is evolving how hardware, software, and all of this comes together. The OEM wants to be the central point. That’s what we’re talking about in terms of the process methodology aspects that are making this transition evolve.

Gajendra: Where are we in this journey? How far have we come? And where are we going? Going back to the point that David mentioned earlier about supply chain evolving and the supply chain turned upside down, five years ago, if we sat here in this sort of a panel and discussed software-defined vehicles, the conversation would have been entirely different. It would have been stuck with the traditional supply chain that we’ve seen for the last 35 or 40 years in the automotive industry. There are fundamentally two aspects here. The supply chain is evolving, and the infrastructure that we, as a community — this team, for example, and many others in the community — are trying to enable is going to be key to making our EDA partners happy. The use of virtual platforms today in the cloud to try and shift left and develop and validate some of these technologies and software wasn’t even there five years ago, so we’ve come a long way. We’ve made a lot of progress together as an industry. Yes, we have a long way to go until we actually have a truly software-defined vehicle. We can go and ask for a software-defined vehicle in the dealership. But the changes we are seeing in terms of all sorts of technology providers trying to make sure that the technology that we eventually will have in the hardware is provided in some sort of virtual form, be it fast models or whatever it is in the cloud, for the vast majority of software ecosystem in automotive this is a big change. I was at Embedded World, and the amount of virtual platforms and the demos that people were actually showing — silicon partners like we have here, Intel, Renesas, Infineon, EDA companies — pointed to a strong movement of, ‘Let’s build the infrastructure that we can build, and then provide that infrastructure to the OEMs to take it from there.’ There is a lot of work going on. Together we will make the infrastructure across the board, be it virtual platform or others, richer and more capable.

Alpert: For sure, OEMs have to control their own destiny. In the past, they would do it by differentiating maybe because they had better engine performance, or some other feature. But going forward, the differentiation is going to be their software. Whoever can make software that will provide additional value, and brand it, that’s going to be the differentiator and that’s the trend. In terms of how you get there, a shared ecosystem is important. SOAFEE is a potential way that, together with virtual platforms, you can provide a shared ecosystem for development, but still allow everyone to differentiate and plug-and-play. That’s one reason we’re working closely with Arm on trying to have a reference design specifically for this purpose. But again, we’re not saying, ‘This is the design you use. This is how you do it.’ That’s not it. The point is, let’s start somewhere, and then people can start swapping out pieces and doing different things. As long as OEMs can plug-and-play, then they can still differentiate. But they don’t have to invent everything themselves, which would be too costly.

Related Reading
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post Software-Defined Vehicle Momentum Grows appeared first on Semiconductor Engineering.

V2X Path To Deployment Still Murky

Experts at the Table: Semiconductor Engineering sat down to discuss Vehicle-To-Everything (V2X) technology and the path to deployment, with Shawn Carpenter, program director for 5G and space at Ansys; Lang Lin, principal product manager at Ansys; Daniel Dalpiaz, senior manager product marketing, Americas, green industrial power division at Infineon; David Fritz, vice president of virtual and hybrid systems at Siemens EDA; and Ron DiGiuseppe, senior marketing manager, automotive IP segment at Synopsys. What follows are excerpts from that conversation.

L-R: Ansys' Carpenter; Ansys' Lin; Infineon’s Dalpiaz; Siemens EDA’s Fritz; Synopsys‘ DiGiuseppe.

L-R: Ansys’ Carpenter; Ansys’ Lin; Infineon’s Dalpiaz; Siemens EDA’s Fritz; Synopsys‘ DiGiuseppe.

SE: What is the potential of vehicle-to-everything technology, and what role will the semiconductor ecosystem play in making this a reality?

DiGiuseppe: V2X is a technology that’s not just years, but decades, in the making. It initially started as a dedicated short-range communications (DSRC) type of technology, and has globally transitioned into a cellular technology, although many of those V2X applications are not just cellular. There are other spectrum allocations V2X can run on, including WiFi or other general-use technology. So it’s not limited to cellular. Also, it’s not just a technology. It’s an application, an outcome, and there are a lot of valuable uses, many of which are safety-related, but there are others, such as efficiency of traffic management notifications. V2X has a wide number of uses. The deployment will be done in stages, and there’s a lot of activity even though it’s taken a long time.

Lin: When I see the keyword V2X, it reminds me of everything about how the car can communicate with anything in the world. It’s a very exciting moment that we’re here today to be able to make some kind of technology to enable great communication between vehicles and people, in network infrastructures and car to car communications. Today, there is already something implemented. For instance, in car network systems we can connect our phone to the car already, but we’re still in the first mile. We’ve started on the journey, but we have a long way to go as far as how to connect car-to-car, how to connect the car to the entire infrastructure of networks, and to the internet. There are a lot of unknowns on the road while we start driving on this journey, and safety and security are definitely the biggest concerns. What if my network is being jeopardized?

Dalpiaz: V2X is part of a much bigger smart grid ecosystem. This will certainly play a very important role, especially as the grid becomes smart and decentralized. This is what will enable the future energy ecosystem, having renewable energies, energy storage systems all connected. And as we see more EVs being used as mobile battery storage. this is something that will certainly enable, and is part of, a smart grid ecosystem that everybody’s talking about.

Fritz: The days of independent semiconductor and software development are over. It is the need for OEMs to control their own destiny, driven by growing consumer and competitive demand, that has all but eliminated the ability to sell a one-size-fits-all product. We’ve known for a very long time that software needs to drive semiconductors, and semiconductors need to drive software. This symbiotic relationship, and the tools and methodologies needed to support this paradigm shift, are essential to producing a highly successful, complex, and competitive solution that meets consumer demands.

SE: What are the discrete pieces of V2X that need to be connected?

Dalpiaz: From the semiconductor point of view, especially with the usage of wide bandgap materials, a few companies are seeing that it’s possible to increase efficiency and power density. Being able to not only provide such solutions, but have everything connected in one box, is part of the smart ecosystem. Then, having the electric vehicles, energy storage, solar — everything combined into one box. Twenty years ago, before the iPhone, we used to have a fax machine, a camera for photographs, a computer. The future of this ecosystem is going to have one box sitting in your home, and have all this stuff connected together. So from the semiconductor point of view, especially with silicon carbide, it is something that is possible today, and it can achieve a very high level of efficiency — about 99%, very close to 100%. And of course, we need to make the system smaller to fit in a vehicle.

DiGiuseppe: One of the key stakeholders is the cellular companies. When we look at cellular V2X, one of the main challenges is interoperability. You have different devices in different model-year cars, so for the vehicle-to-vehicle communications, those different devices need to be interoperable. Then, the car will be talking to the infrastructure, so the roadside units need to be interoperable with the cars and devices in the cars. Then, of course, you have vehicle-to-pedestrians, vehicle-to-e-mobility like vehicle-to-bicycles, vehicle-to-motorcycles interoperability between all the devices over the medium. Whether it’s cellular or Wi-Fi or other technologies, it all needs to be interoperable. That will allow deployments in one locality to work in another locality, because even if they’re interoperable in one deployment in one region, we’ve got to make sure they’re also interoperable in other regions. So it’s a large scale interoperability goal.

Lin: Ron, you’re talking about interoperability, and Daniel talked about the ecosystem. From my side, I would also mention some standards are necessary. For EDA, to help build such an ecosystem and chips, we need some rules to give to engineers as to what’s to be followed. There are two important standards in my mind. One is the vehicle safety standard ISO 26262, which regulates a couple of safety standards for on-road vehicle chip design. Another is the cyber security standard, ISO 21434. If I make a tool, I probably will follow those standards, and then think about how the tool could help users decide a pass/failure criteria regarding their design, making sure to meet the security and safety target from the standard.

DiGiuseppe: In addition to standards, last October the U.S. Department of Transportation released its national V2X deployment plan. That plan, which is still in draft feedback stage, lays out — at least in the U.S. — the whole timeline for deployments. That kind of oversight plan overlays onto the standards that Lang was just talking about. That deployment plan outlines the different contributions from all the different stakeholders, from the automakers/OEMs to the software developers for the applications. So overlaid on top of standards is a deployment plan, and a government deployment plan outlines that. Plus, there are a lot of government stakeholders, like the FCC allocating spectrum, and the Department of Transportation deploying all these deployments, and that’s in addition to the technology providers.

Fritz: It would take days to adequately answer those questions, but at the core, the root design components are connectivity, power, performance, and acceleration. Connectivity with the proper protocols allows computational tasks to be distributed. This is particularly important in automotive, where the physical distance between sensing, actuating and computing nodes is critical for predictable performance. In the case of V2X, connectivity enables the normalization of external data, whether it involves smart city infrastructure or another vehicle. It’s important to note that the form of the shared data grows exponentially with the capacity to describe the environment, and therefore the compute requirements to process and understand it. For example, a data form that can describe signage in the U.S. is relatively small, but one that is universal with variations recognizable is much larger and more ambiguous. This drives design parameters that directly impact manufacturing, development, and service cost functions. Further, the normalization of the data has an impact on the overall design and design component interactions. In the case of power, it goes without saying that high compute requirements, and the associated necessary cooling, can have a significant impact on EV range and manufacturing costs. Performance can take many forms, but as software loads increase with hypervisors, specialized operating systems, and protocol stacks, not to mention very complex application software, all must meet stringent mission critical requirements. Finally, acceleration is of growing importance because it allows workloads to be handed off to specialized hardware that is better equipped to handle that load. An example is running AI inferencing on a CPU is typically far slower and more power-hungry than on an NPU, but a GPU could be idle and available to do the same task. On the other hand, a small CNN can be handled quite easily on a CPU with a few simple instructions. It is at the intersection of these major design components where an OEM will find its differentiation. Therefore, having a system capable of exploring this complex hardware and software space quickly, and with a small team, is critical for an OEM to demand of its suppliers what is required for the success of its platforms. Again, controlling your own destiny is essential to survival.

SE: With all of this interoperability, what happens when there are parts of the ‘everything’ — whether it’s the car or the infrastructure or pedestrians — that are not updated with the latest technologies or different aspects of what needs to be there for conductivity?

DiGiuseppe: In addition to that challenge, this includes backward compatibility for automotive. For someone buying a car in 2025, you would expect any V2X technology to work in 2040. But in the meantime, all those standards that we’re talking about are continuing to evolve, so they need to be backward compatible.

Carpenter: This highlights the need for a digital twin capability for modeling this infrastructure to be able to understand that when we get two years down the road, some devices may not be reprogrammable. We may not be able to flash a particular device. We need to be able to look at that, and be able to simulate that in advance to understand what will happen. What will this do? We’re seeing this show a little bit, even giving a nod to what Ron was talking about earlier with interoperability. We have customers who want to be able to validate real hardware stuff that they’re developing on the lab bench, but they want to do it with the fidelity of a real system operating on a car, in a virtual city, with the live interaction of the channel with a gNodeB 5G base station mounted up on a building someplace, and they want to know how this will work in the context of the situation that it’s supposed to serve. And if something goes wrong in that scene, can we introduce something into this device and run our real silicon development platform against it to understand what happens here. If we go into a deep shadow, a deep fade area, and I’m not getting updates, yet I’m hurtling down the road at a certain speed, how long can I do this before I receive corrective information? What if someone’s software deck out there doesn’t get reprogrammed or doesn’t get the latest version of the standard safety protocols or something like that? We’re going to need this ability to carry models of stuff that was built two or three years ago in today’s infrastructure, model that, and understand in advance what’s going to happen with it so that we have an approach to do this. This is what the Department of Defense is doing today with their digital thread enablement, to have a way to capture that with legacy models of what they built years ago, but apply it in modern missions and understand, ‘Does it work? Does it fit? Does it not fit? What do we need to do to the existing system to make sure that we’re safe here?’ That is an approach we clearly see the automakers beginning to look at as a way to future-proof some of these systems and make sure that they’ve got a way to test them as they go forward.

Fritz: It’s become very clear from several popularized incidents that simply stopping and waiting for tech support to find you and get you going again is not going to be a successful strategy. In the end, the vehicle must make decisions at least as thoughtful as an average human would make. This is entirely possible, but not if too much emphasis is placed in the design phase on the dependencies between communicating (or non-communicating) actors. For this reason, we will always require sophisticated decision-making in-vehicle to be widely accepted.

SE: How does the design team stay up to date with everything?

DiGiuseppe: On the vehicle side, they’re going to be relying on over-the-air (OTA) software updates, which is relatively new in the automotive industry. But clearly, once we identify a software update, we’re going to need to roll out that software update, and OTA is obviously going to be used hand-in-hand with the updates to V2X as it moves forward.

SE: From a developer standpoint, they have to design to these all these regulations. What are the issues here?

Lin: As a software developer, if you think about a vehicle 10 years ago, you mainly just replaced hardware. You replaced your brakes, you replaced your engine, adding some fluid. These are all old styles. Right now, if you have the V2X network, you’d expect probably daily updates because software is evolving daily, and your whole communication system infrastructure is under the whole internet evolution, so you’re going to have to keep pace with it. That’s a lot of work for developers.

Carpenter: There could be implications on edge processing. The telecommunications providers are going to need to put a lot more compute closer to the radio head, and clearly they’re already exploring the possibilities of getting not just central processing cores, CPU cores, but there will be GPU cores and Tensor Processing Units, and we don’t know what all yet for AI, that will be a part of this safety infrastructure and information/infotainment delivery. There’s a lot more compute that’s going to have to happen with a much shorter latency. Augmented reality with heads-up displays — imagine the possibilities coming in safety systems with heads-up displays in cars. Then imagine the amount of processing that it’s going to take. So the telecom providers will need to be a major part of that, together with most of the local government regulatory groups that are going to foster that safety system. Each municipality probably has to decide what do they adopt, what level of standard will they use, and deliver. Who invests in that? The future is really exciting, but there are a few things yet to be sorted out in terms of the investment needed to really deliver that promise.

Dalpiaz: I’m more in the infrastructure side, and one of the questions we always have is, ‘With all this focus on renewables and decentralization of the grid, can the grid handle such expectations or such projects?’ Having more people connecting and feeding energy back into the grid, and managing all of this, that’s always the question that you have to go through and consider.

Fritz: The fact is that keeping up to date is not practical. However, that doesn’t mean that a methodology cannot be employed to accept changes into the development system, and therefore be folded into the development process. CI/CD systems with digital twin golden models already are being developed, with nightly regressions run against complex (and possibly changing) requirements. In this way, requirement changes are automatically addressed as they occur, and solutions can be rolled into an Agile methodology through nightly regressions. This is an important benefit of a modern development methodology that has been used in other industries for years, but it’s just now finding purchase in progressive automotive companies.

Related Reading
Growing Challenges For Increasingly Connected Vehicles
OEMs have high expectations for connected vehicles and global growth opportunities, but it’s not that simple.
Software-Defined Vehicles Ready To Roll
New approach could have big effects on cost, safety, security, and time to market.

The post V2X Path To Deployment Still Murky appeared first on Semiconductor Engineering.

Commercial Chiplet Ecosystem May Be A Decade Away

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make it reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplet should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to be able to pick and choose which IP, see what’s been taped out before, how successful it’s been so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to just become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood because that’s the way that this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs — and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust and traceability and the province tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, how to you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller cheaper ones. So, there are now a lot of thoughts about how to go about doing almost like unit tests on individual chiplets first, but then you want to do some form of system test as you go along. That’s definitely something we need to think about. On the business front,  who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then definitely it makes sense to think about using chiplets and breaking it up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together. The ones that have been successful so far — the big companies like Intel, AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.

Bhatnagar: From a business perspective, what is really important is the standardization. Inside of the chiplets is fine, but how it impacts other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing and that adds extra costs. To really justify a business case for any chiplet or, or any sort of IP with the chiplet, the standardization is key for the electrical interconnect, packaging, and all other aspects of a system.

Fig. 1:  A chiplet design. Source: Cadence. 

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

The post Commercial Chiplet Ecosystem May Be A Decade Away appeared first on Semiconductor Engineering.

❌