
DAC Panel Could Spark Fireworks

By: Brian Bailey
May 30, 2024 at 09:08

Panels can often become love fests. While a title may sound controversial, it turns out that everyone quickly finds that all the panelists agree on the major points. This is sometimes the result of how the panel was put together – the proposal came from one company, and they wanted to get their customers or clients onto the panel. They are unlikely to ask a major competitor to be part of the event.

These panels can become livelier if they have a moderator who opens up a panel to audience questions and they decide to throw the spanner in the works. This tends to happen a lot more in the technical panels, because each researcher, who may have taken a different approach to a problem, wants to introduce the audience to their alternative solution. But the pavilion panels tend to be a little more sedate – in part because nobody wants to burn bridges within such a tight industry.

It is quite common for me to moderate a panel each DAC, and this year is no exception. I will be moderating a technical panel whose title is directly confrontational: “Why Is EDA Playing Catchup to Disruptive Technologies Like AI? What Can We Do to Change This?”

The abstract for the panel talks about EDA having a closed mindset, consistently missing disruptive changes by choosing incremental approaches. I know that when I first read it – when I was invited to be the chair for it – I was immediately up in arms.

Twenty years ago, while working at an EDA company, I attempted to drive such disruptive changes in the verification industry. Several times a year, I would go out and talk to our customers and exchange ideas with them about the problems they were facing. We would present ideas about both incremental and disruptive developments we had underway. The message was always the same. “Can we have the incremental changes yesterday? And we don’t have time to think about the longer-term ideas.” It reminded me of the cartoon where a stone-age person is pulling a cart with square wheels and doesn’t have time to listen to the person offering him round ones.

Even so, we did go ahead and develop some of them, and a few of them did achieve an element of success. But going from early adopters to more mainstream interest often took 10 years. Even today, many of those are still niche tools, and probably money sinks for the companies that developed them. Examples are high-level synthesis and virtual prototypes, the only two pieces of the whole ESL movement that survived. Still, their developers believe that, long term, the industry will need them. Many other pieces completely fell by the wayside, such as hardware/software co-design. That, however, may start to resurface thanks to RISC-V.

Many of the tools associated with ESL were direct collaborations between EDA companies and researchers. I established a research collaboration program with the University of Washington that looked at multi-abstraction simulation and protocol checking, and had elements of system synthesis. The only thing that came out of that was hardware/software co-verification. Protocol checking, in the form of VIP, also has become popular, although not directly because of this program. Co-verification had a useful life of about five years before SystemC made the solution obsolete.

Many disruptive innovations actually have come from industry, then were commercialized by EDA companies. SystemC is one example of that. Constrained random verification is another. Portable Stimulus, while still nascent, also was developed within industry. These solutions have an advantage in that they were developed to solve a significant enough problem within the industry that they have broader appeal. There is little that has actually come from academia in recent decades.

The panel title also talks specifically about AI and accuses EDA of being behind already. It is not clear that they are. Thirty years ago, you could go to DAC and see all the new tools and flows that EDA companies were working on. Many of them might be ready within a year or two. But today, EDA companies will make no announcements until at least a few of their customers, that they chose as development partners, have had silicon success.

A typical chip cycle is 18 months. Given that we are beginning to hear about some of these tools today, they may have been in use for a good part of that 18 months. Plus, development of those tools must have started about a year before that. Let’s remember that ChatGPT only came to the fore 18 months ago, and it should be quite obvious why few generative AI products have yet been announced. The fact that there are so many EDA AI announcements would make me think that EDA companies were very quick off the starting blocks.

The panelists are Prith Banerjee – Ansys, who has written a book about disruption; Jan Rabaey – professor in the Graduate School of Electrical Engineering and Computer Sciences at the University of California, Berkeley, who also serves as the CTO of the Systems Technology Co-Optimization division at imec; Samir Mittal, corporate VP for Silicon Systems AI at Micron Technology; James Scapa, founder and CEO of Altair; and Charles Alpert, fellow at Cadence Design Systems.

If you are going to be at DAC and have access to the technical program, this 90-minute panel may be worth your time. Wednesday June 26th at 10:30am. Come ready with your questions because I will certainly be opening this panel up to the audience very quickly. While sparks may fly, please try and keep your cool and be respectful.

The post DAC Panel Could Spark Fireworks appeared first on Semiconductor Engineering.


Vision Is Why LLMs Matter On The Edge

By: Ben Gomes
May 30, 2024 at 09:05

Large Language Models (LLMs) have taken the world by storm since the 2017 Transformers paper, but pushing them to the edge has proved problematic. Just this year, Google had to revise its plans to roll out Gemini Nano on all new Pixel models — the down-spec’d hardware options proved unable to host the model as part of a positive user experience. But the implementation of language-focused models at the edge is perhaps the wrong metric to look at. If you are forced to host a language-focused model for your phone or car in the cloud, that may be acceptable as an intermediate step in development. Vision applications of AI, on the other hand, are not so flexible: many of them rely on low latency and high dependability. If a vehicle relies on AI to identify that it should not hit the obstacle in front of it, a blip in contacting the server can be fatal. Accordingly, the most important LLMs to fit on the edge are vision models — the models whose purpose is most undermined by the reliance on remote resources.

“Large Language Models” can be an imprecise term, so it is worth defining. The original 2017 Transformer LLM that many see as kickstarting the AI rush was 215 million parameters. BERT was giant for its time (2018) at 335 million parameters. Both of these models might be relabeled as “Small Language Models” by some today to distinguish from models like GPT4 and Gemini Ultra with as much as 1.7 trillion parameters, but for the purposes here, all fall under the LLM category. All of these are language models though, so why does it matter for vision? The trick here is that language is an abstract system of deriving meaning from a structured ordering of arbitrary objects. There is no “correct” association of meaning and form in language which we could base these models on. Accordingly, these arbitrary units are substitutable — nothing forces architecture developed for language to only be applied to language, and all the language objects are converted to multidimensional vectors anyway. LLM architecture is thus highly generalizable, and typically retains the core strength from having been developed for language: a strong ability to carry through semantic information. Thus, when we talk about LLMs at the edge, it can be a language model cross-trained on image data, or it might be a vision-only model which is built on the foundation of technology designed for language. At the software and hardware levels, for bringing models to the edge, this distinction makes little difference.

Vision LLMs on the edge flexibly apply across many different use cases, but the key applications where they show the greatest advantages are: embodied agents (an especially striking example of the benefits of cross-training embodied agents on language data can be seen with Dynalang’s advantages over DreamerV3 in interpreting the world due to superior semantic parsing), inpainting (as seen with the latent diffusion models), decision-making in self-driving vehicles (as with LINGO-2), context-aware security (such as ViViT), information extraction (Gemini’s ability to find and report data from video), and user assistance (physician aids, driver assist, etc.). Specifically notable and exciting here is the ability of vision LLMs to leverage language as a lossy storage and abstraction of visual data for decision-making algorithms to then interact with — especially as seen in LINGO-2 and Dynalang. Many of these vision-oriented LLMs depend on edge deployment to realize their value, and they benefit from the work that has already been done to optimize language-oriented LLMs. Despite this, vision LLMs still struggle with edge deployment, just as the language-oriented models do. The improvements for edge deployments come in three classes: model architecture, system resource utilization, and hardware optimization. We will briefly review the first two and look more closely at the third, since it often gets the least attention.

Model architecture optimizations are those made at the model level: “distilling” models to create leaner imitators, restructuring where models spend their resource budget (such as the redistribution of transformer modules in Stable Diffusion XL), and pursuing alternate architectures (state-space models, H3 modules, etc.) to escape the quadratically scaling costs of transformers.
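To make the first of these concrete, below is a minimal sketch of the standard knowledge-distillation loss used to train a leaner “imitator” from a larger teacher. It is illustrative only; the temperature and weighting values are arbitrary placeholders and are not tied to any specific model named above.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with soft-target matching (Hinton-style KD)."""
    # Standard cross-entropy against the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    # KL divergence between temperature-softened student and teacher distributions
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes stay comparable across temperatures
    return alpha * hard + (1.0 - alpha) * soft
```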

System resource optimizations are all the things that can be done in software to an already complete model. Quantization (to INT8, INT4, or even INT2) is a common focus here for both latency and memory burden, but of course compromises accuracy. Speculative decoding can improve utilization and latency. And of course, tiling, such as seen with FlashAttention, has become near-ubiquitous for improving utilization and latency.
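As a small illustration of the quantization step, the sketch below applies post-training dynamic INT8 quantization to a toy stand-in model in PyTorch. A real edge flow would add calibration or quantization-aware training and a target-specific runtime; the layer sizes here are hypothetical.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model backbone (hypothetical layer sizes)
model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 256),
).eval()

# Quantize the Linear weights to INT8; activations are quantized
# dynamically at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 512))
print(out.shape)  # torch.Size([1, 256])
```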

Finally, there are hardware optimizations. The first option here is a general-purpose GPU, TPU, NPU or similar, but those tend to be best suited for settings where broad capability is needed without demanding streamlined optimization, such as on a home computer. Custom hardware, such as purpose-built NPUs, generally has the advantage when the application is especially sensitive to latency or resource consumption, and this covers many of the applications for vision LLMs.

Exploring this trade-off further: Stable Diffusion’s architecture and resource demands have been discussed here before, but it is worth circling back to it as an example of why hardware solutions are so important in this space. Using Stable Diffusion 1.5 for simplicity, let us focus specifically on the U-Net component of the model. In this diagram, you can see the rough construction of the model: it downsamples repeatedly on the left until it hits the bottom of the U, and then upsamples up the right side, bringing back in residual connections from the left at each stage.

This U-Net implementation has 865 million parameters and entails 750 billion operations. The parameters are a fair proxy for the memory burden, and the operations are a direct representation of the compute demands. The distribution of these burdens on resources is not even, however. If we plot the parameters and operations for each layer, a clear picture emerges:

These graphs show a model that is destined for gross inefficiencies at every step. Most of the memory burden peaks in the center, whereas the compute is heavily taxed at the two tails but underutilized in the center. These inefficiencies come with costs. The memory peak can overwhelm on-chip storage, thus incurring I/O operations, or else requiring a large excess of memory that sits unused for most of the graph. Similarly, storing residuals for later incurs I/O latency and higher power draw. The underutilization of the compute power at the center of the graph means the processor wastes power, because it cannot drop to the efficient tail of its power curve while performing sparser operations. While software interventions can also help here, this is exactly the kind of problem that custom hardware solutions are meant to address. Custom silicon tailored to the model can offload some of that memory burden into additional compute cycles at the center of the graph, recomputing the residual connections instead of kicking them out to memory and thereby avoiding extra I/O operations. In doing so, the total required memory drops, and the processor can remain at full utilization. Rightsizing the resource allotment and finding ways to redistribute the burdens are key to how these models can best be deployed at the edge.
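A back-of-the-envelope sketch of that store-versus-recompute trade-off is shown below. The tensor size, extra MAC count, and per-byte/per-MAC energy figures are hypothetical placeholders, not measured Stable Diffusion 1.5 or silicon data; the point is only that regenerating a residual can cost less energy than writing it out and reading it back.

```python
# Hypothetical energy costs (placeholders, not measured values)
PJ_PER_DRAM_BYTE = 20.0   # off-chip access energy per byte
PJ_PER_MAC = 0.5          # on-chip multiply-accumulate energy

def store_residual_pj(num_bytes):
    # Write the residual out to memory, then read it back later
    return 2 * num_bytes * PJ_PER_DRAM_BYTE

def recompute_residual_pj(num_macs):
    # Regenerate the residual on the fly instead of storing it
    return num_macs * PJ_PER_MAC

residual_bytes = 64 * 64 * 320 * 2   # one fp16 feature map (hypothetical shape)
extra_macs = 50e6                    # extra compute to regenerate it (hypothetical)

print(f"store + reload: {store_residual_pj(residual_bytes) / 1e6:.1f} uJ")
print(f"recompute:      {recompute_residual_pj(extra_macs) / 1e6:.1f} uJ")
```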

Despite their name, LLMs are important to the vision domain for their flexibility in handling different inputs and their strength at interpreting meaning in images. Whether used for embodied agents, context-aware security, or user assistance, their use at the edge requires a dependable low latency which precludes cloud-based solutions, in contrast to other AI applications on edge devices. Bringing them successfully to the edge asks for optimizations at every level, and we have seen already some of the possibilities at the hardware level. Conveniently, the common architecture with language-oriented LLMs means that many of the solutions needed to bring these most essential models to the edge in turn may also generalize back to the language-oriented models which donated the architecture in the first place.

The post Vision Is Why LLMs Matter On The Edge appeared first on Semiconductor Engineering.


Turbocharging Cost-Conscious SoCs With Cache

By: John Min
May 30, 2024 at 09:04

Some design teams creating system-on-chip (SoC) devices are fortunate to work with the latest and greatest technology nodes coupled with a largely unconstrained budget for acquiring intellectual property (IP) blocks from trusted third-party vendors. However, many engineers are not so privileged. For every “spare no expense” project, there are a thousand “do the best you can with a limited budget” counterparts.

One way to squeeze the most performance out of lower-cost, earlier generation, mid-range processor and accelerator cores is to employ the judicious application of caches.

Cutting costs

A simplified example of a typical cost-conscious SoC scenario is illustrated in figure 1. Although the SoC may be composed of many IPs, only three are shown here for clarity.

Fig. 1: Portion of a cost-conscious, non-cache-coherent SoC. (Source: Arteris)

The predominant technology for connecting the IPs inside an SoC is network-on-chip (NoC) interconnect IP. This may be thought of as an IP that spans the entire device. The example shown in figure 1 may be assumed to reflect a non-cache-coherent scenario. In this case, any coherency requirements will be handled in software.

Let’s assume the SoC’s clock is running at 1GHz. Suppose a central processing unit (CPU) based on a reduced instruction set computer (RISC) architecture consumes a single clock cycle for a typical instruction. However, access to external DRAM memory can take anywhere between 100 and 200 processor clock cycles (we’ll average this out to 150 cycles for the purposes of this article). This means that if the CPU lacked a Level 1 (L1) cache and was connected directly to the DRAM via the NoC and DDR memory controller, each instruction would consume 150 processor clock cycles, resulting in a CPU utilization of only 1/150 = 0.67%.

This is why CPUs, along with some accelerators and other IPs, employ cache memories to increase processor utilization and application performance. The underlying premise upon which the cache concept is based is the principle of locality. The idea is that only a small amount of the main memory is being employed at any given time and that locations in that space are being accessed multiple times. Mainly due to loops, nested loops and subroutines, instructions and their associated data experience temporal, spatial and sequential locality. This means that once a block of instructions and data have been copied from the main memory into an IP’s cache, the IP will typically access them repeatedly.
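As a minimal illustration of that principle, the loop below repeatedly sweeps a small working set: the repeated passes give temporal locality and the in-order traversal gives spatial and sequential locality, so after the first pass the data is served from cache rather than main memory. This is a conceptual sketch, not tied to any specific IP.

```python
data = list(range(1024))   # small working set that fits comfortably in an L1 cache

total = 0
for _ in range(1000):      # repeated passes over the same block: temporal locality
    for x in data:         # sequential, in-order accesses: spatial locality
        total += x
```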

Today’s high-end CPU IPs usually have a minimum of a Level 1 (L1) and Level 2 (L2) cache, and they often have a Level 3 (L3) cache. Also, some accelerator IPs, like graphics processing units (GPUs) often have their own internal caches. However, these latest-generation high-end IPs often have a 5X to 10X price compared to their previous-generation mid-range counterparts. As a result, as illustrated in figure 1, the CPU in a cost-conscious SoC may come equipped with only an L1 cache.

Let’s consider the CPU and its L1 cache in a little more depth. When the CPU requests something in its cache, the result is called a cache hit. Since the L1 cache typically runs at the same speed as the processor core, a cache hit will be processed in a single processor clock cycle. By comparison, if the requested data is not in the cache, the result, called a cache miss, will require access to the main memory, which will consume 150 processor clock cycles.

Now consider running 1,000,000 instructions. If the cache were large enough to contain the whole program, then this would consume only 1,000,000 clock cycles, resulting in a CPU efficiency of 1,000,000 instructions/1,000,000 clock cycles = 100%.

Unfortunately, the L1 cache in a mid-range CPU will typically be only 16KB to 64KB in size. If we assume a 95% cache hit rate, then 950,000 of our 1,000,000 instructions will take one processor clock cycle. The remaining 50,000 instructions will each consume 150 clock cycles. Thus, the CPU efficiency in this case can be calculated as 1,000,000/((950,000 * 1) + (50,000 * 150)) = ~12%.

Turbocharging performance

A cost-effective way of turbocharging the performance of a cost-conscious SoC is to add cache IPs. For example, CodaCache from Arteris is a configurable, standalone non-coherent cache IP. Each CodaCache instance can be up to 8MB in size, and multiple copies can be instantiated in the same SoC, as demonstrated in figure 2.

Fig. 2: Portion of a turbocharged, non-cache-coherent SoC. (Source: Arteris)

It is not the intention of this article to suggest that every IP should be equipped with a CodaCache. Figure 2 is intended only to provide examples of potential CodaCache deployments.

If a CodaCache instance is associated with an IP, it’s known as a dedicated cache (DC). Alternatively, if a CodaCache instance is associated with a DDR memory controller, it’s referred to as a last-level cache (LLC). A DC will accelerate the performance of the IP with which it is associated, while an LLC will enhance the performance of the entire SoC.

As an example of the type of performance boost we might expect, consider the CPU shown in figure 2. Let’s assume the CodaCache DC instance associated with this IP is running at half the processor speed and that any accesses to this cache consume 20 processor clock cycles. If we also assume a 95% cache hit rate for this DC, then—for 1,000,000 instructions—our overall CPU+L1+DC efficiency can be calculated as 1,000,000/((950,000 * 1) + (47,500 * 20) + (2,500 * 150)) = ~44%. That’s a performance boost of ~273%!
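The utilization figures above can be reproduced with a few lines of arithmetic. The sketch below uses the article’s illustrative numbers (1-cycle L1 hits, 20-cycle dedicated-cache hits, a 150-cycle DRAM penalty, and 95% hit rates); it is a simplified model, not a cycle-accurate simulation.

```python
def cpu_efficiency(instructions, cache_levels, dram_penalty):
    """cache_levels: list of (hit_rate, hit_cycles) from L1 outward.
    Accesses that miss every level pay dram_penalty cycles."""
    remaining = instructions
    cycles = 0
    for hit_rate, hit_cycles in cache_levels:
        hits = remaining * hit_rate
        cycles += hits * hit_cycles
        remaining -= hits
    cycles += remaining * dram_penalty
    return instructions / cycles

N = 1_000_000
print(cpu_efficiency(N, [], 150))                       # no cache:  ~0.0067 (0.67%)
print(cpu_efficiency(N, [(0.95, 1)], 150))              # L1 only:   ~0.12   (12%)
print(cpu_efficiency(N, [(0.95, 1), (0.95, 20)], 150))  # L1 + DC:   ~0.44   (44%)
```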

Conclusion

In the past, embedded programmers relished the challenge of squeezing the highest performance possible out of small processors with low clock speeds and limited memory resources. In fact, it was common for computer magazines to issue challenges to their readers along the lines of, “Who can perform task X on processor Y in the minimum number of clock cycles using the smallest amount of memory?”

Today, many SoC developers enjoy the challenge of squeezing the highest performance possible out of their designs, especially if they are constrained to use lower-performing mid-range IPs. Deploying CodaCache IPs as dedicated and last-level caches provides an affordable way for engineers to turbocharge their cost-conscious SoCs. To learn more about CodaCache from Arteris, visit arteris.com.

The post Turbocharging Cost-Conscious SoCs With Cache appeared first on Semiconductor Engineering.


AI-Powered Data Analytics To Revolutionize The Semiconductor Industry

By: Reela Samuel
May 30, 2024 at 09:03

In the age where data reigns supreme, the semiconductor industry stands on the cusp of revolutionary change, redefining complexity and productivity through a lens crafted by artificial intelligence (AI). The intersection of AI and the semiconductor industry is not merely an emerging trend—it is the fulcrum upon which the next generation of technological innovation balances. Semiconductor companies are facing a critical juncture where the burgeoning complexity of chip designs is outpacing the growth of skilled human resources. This is where the infusion of AI-powered data analytics catalyzes a seismic shift in the industry’s approach to efficiency and productivity.

AI in semiconductor design: A revolution beckons

With technological leaps like 5G, AI, and autonomous vehicles driving chip demand, the status quo for semiconductor design is no longer sustainable. Traditional design methodologies fall short in addressing the challenges presented by these new technologies, and the need for a new approach is non-negotiable. AI, with its capacity to process massive datasets and learn from patterns, offers a revolutionary solution. Armed with vast amounts of electronic design automation (EDA) data, machine learning algorithms can pave an efficient path through design complexities.

Navigating the complexity of EDA data with AI

The core of semiconductor design lies in the complexity of EDA data, which is often disparate, unstructured, and immensely intricate, existing in various formats ranging from simple text to sophisticated binary machine-readable data. AI presents a beacon of hope in taming this beast by enabling the industry to store, process, and analyze data with unprecedented efficiency.

AI-enabled data analytics offer a path through the labyrinth of EDA complexity, providing a scalable and sophisticated data storage and processing solution. By harnessing AI’s capabilities, the semiconductor industry can dissect, organize, and distill data into actionable insights, elevating the efficacy of chip design processes.

Leveraging AI in design excellence

Informed decisions are the cornerstone of successful chip design, and the fusion of AI-driven analytics with semiconductor engineering marks a watershed moment in the industry. AI’s ability to comprehend and process unstructured data at scale enables a deeper understanding of design challenges, yielding solutions that optimize SoCs’ power, performance, and area (PPA).

AI models, fed by fragmented data points from EDA compilation, can predict bottlenecks, performance constraints, or power inefficiencies before they impede the design process. This foresight empowers engineers with informed design decisions, fostering an efficient and anticipatory design culture.

Reimagining engineering team efficiency

One of the most significant roadblocks in the semiconductor industry has been aligning designer resources with the exponential growth of chip demand. As designs become complex, they evolve into multifaceted systems on chips (SoCs) housing myriad hierarchical blocks that accumulate vast amounts of data throughout the iterative development cycle. When harnessed effectively, this data possesses untapped potential to elevate the efficiency of engineering teams.

Consolidating data review into a systematic, knowledge-driven process paves the way for accelerated design closure and seamless knowledge transfer between projects. This refined approach can significantly enhance the productivity of engineering teams, a crucial factor if the semiconductor industry is to meet the burgeoning chip demand without exponentially expanding design teams.

Ensuring a systemic AI integration

For the full potential of AI to be realized, a systemic integration across the semiconductor ecosystem is paramount. This integration spans the collection and storage of data and the development of AI models attuned to the industry’s specific needs. Robust AI infrastructure, equipped to handle the diverse data formats, is the cornerstone of this integration. AI models must complement it and be fine-tuned to the peculiarities of semiconductor design, ensuring that the insights they produce are accurate and actionable.

Cultivating AI competencies within engineering teams

As AI plays a central role in the semiconductor industry, it highlights the need for AI competencies within engineering teams. Upskilling the workforce to leverage AI tools and platforms is a critical step toward a harmonized AI ecosystem. This journey toward proficiency entails familiarization with AI concepts and a collaborative approach that blends domain expertise with AI acumen. Engineering teams adept at harnessing AI can unlock its full potential and become pioneers of innovation in the semiconductor landscape.

Intelligent system design

At Cadence, the conception of technological ecosystems is encapsulated within a framework of three concentric circles—a model neatly epitomized by the sophistication of an electric vehicle. The first circle represents the data used by the car; the second circle represents the physical car, including the mechanical, electrical, hardware, and software components. The third circle represents the silicon that powers the entire system.

The Cadence.AI Platform operates at the vanguard of pervasive intelligence, harnessing data and AI-driven analytics to propel system and silicon design to unprecedented levels of excellence. By deploying Cadence.AI, we converge our computational software innovations, from Cadence’s Verisium AI-Driven Verification Platform to the Cadence Cerebrus Intelligent Chip Explorer’s AI-driven implementation.

The AI-driven future of semiconductor innovation

The implications are far-reaching as the semiconductor industry charts its course into an AI-driven era. AI promises to redefine design efficiency, expedite time to market, and pioneer new frontiers in chip innovation. The path forward demands a concerted effort to integrate AI seamlessly into the semiconductor fabric, cultivating an ecosystem primed for the challenges and opportunities ahead.

Semiconductor firms that champion AI adoption will set the standard for the industry’s evolution, carving a niche for themselves as pioneers of a new chip design and production paradigm. The future of semiconductor innovation is undoubtedly AI, and the time to embrace this transformative force is now.

Cadence is already at the forefront of this AI-led revolution. Our Cadence.AI Platform is a testament to AI’s power in redefining systems and silicon design. By enabling the concurrent creation of multiple designs, optimizing team productivity, and pioneering leaner design approaches, Cadence.AI illustrates the true potential of AI in semiconductor innovation.

The harmonized suite of our AI tools equips our customers with the ability to employ AI-driven optimization and debugging, facilitating the concurrent creation of multiple designs while optimizing the productivity of engineering teams. It empowers a leaner workforce to achieve more, elevating its capability to generate a spectrum of designs in parallel with efficiency and precision. The result is a new frontier in design excellence, where AI acts as a co-pilot to the engineering team, steering the way to unparalleled chip performance. Learn more about the power of AI to forge intelligent designs.

The post AI-Powered Data Analytics To Revolutionize The Semiconductor Industry appeared first on Semiconductor Engineering.


Design Tool Think Tank Required

By: Brian Bailey
February 29, 2024 at 09:10

When I was in the EDA industry as a technologist, there were three main parts to my role. The first was to tell customers about new technologies being developed and tool extensions that would be appearing in the next release. These were features they might find beneficial both in the projects they were undertaking today, and even more so, would apply to future projects. Second, I would try and find out what new issues they were finding, or where the tools were not delivering the capabilities they required. This would feed into tool development planning. And finally, I would take those features selected by the marketing team for implementation and try to work out how best to implement them if it wasn’t obvious to the development teams.

By far the most difficult task of the three was getting new requirements from customers. Most engineers have their heads down, concentrating on getting their latest chip out. When you ask them about new features, the only things they offer are their current pain points. These usually involve incremental features, bugs with disliked workarounds, or insufficient performance.

Thirty years ago, when I first started doing that role, there were dedicated methodology groups within the larger companies whose job it was to develop flows and methodologies for future projects. This would appear to be the ideal people to ask, but in many cases they were so disconnected from the development team that what they asked for would never actually be used by the development team. These groups were idealists who wanted to instill revolutionary changes, whereas the development teams wanted evolutionary tools. The furthest many of those developments went was pilot projects that never became mainstream.

It seems as if the industry needs a better path to get requirements into the EDA companies. This used to be defined by the ITRS, which would look forward and project the new capabilities that would be required and the timeframes for them. That no longer exists. Today, standards are being driven by semiconductor companies. This is a change from the past, where we used to see the EDA companies driving the developments done within groups like Accellera. When I look at their recent undertakings, most of them are driven by end users.

Getting a standards group started today happens fairly late in the process. It implies an immediate need, but does not really allow time for solutions to be developed ahead of time. It appears that a think tank is required where the industry can discuss issues and problems for which new tool development is required. That can then be built into the EDA roadmaps so that the technology becomes available when it is needed.

One such area is power analysis. I have been writing stories about how important power and energy is becoming and may indeed soon become the limiter for many of the most complex designs. Some of the questions I always ask are:

  • What tools are being developed for doing power analysis of software?
  • How can you calculate the energy consumed for a given function?
  • How can users optimize a design for power or energy?

I rarely get straight answers to any of these questions. Instead, I’m often given vague ideas about how a user could do this in a manual fashion given the tools currently available.

I was beginning to think I was barking up the wrong tree and perhaps these were not legitimate concerns. My sanity was restored by a comment on one of my recent power related stories. Allan Cantle, OCP HPC Sub-Project Leader at Open Compute Project Foundation, wrote: “While it’s great to see articles like this highlight the need for us all to focus on energy centric computing, the sad news is that our tools don’t report energy in any obvious way to show the stupid architectural mistakes we often make from an energy consumption perspective. We are solving all the problems from a bottoms-up perspective by bringing things closer together. While that does bring tremendous energy efficiency benefits, it also creates massively increasing energy density. There is so much low-hanging fruit from a top-down system architecture approach that the industry is missing because we need to think outside the box and across our silos.”

Cantle went on to say: “A trivial improvement in tools that report energy consumption as a first-class metric will make it far easier for us to understand and rectify the mistakes we make as we build new energy-centric, domain-specific computers for each application. Alternatively, the silicon gods that rule our industry would be wise to take a step backward and think about the problem from a systems level perspective.”

I couldn’t agree more, and I find it frustrating that no EDA company seems to be listening. I am sure part of the problem is that the large customers are working on their own internal solutions, and they feel it will provide them with a competitive advantage. Until it becomes clear that all of their competitors have similar solutions, and that they no longer get an advantage from it, then they will look to transfer those solutions to the EDA companies so they do not have to maintain them. The EDA companies will then start to fight to make the solution they have acquired the standard. It all takes a long time.

In partial defense of the EDA companies, they are facing so many new issues these days that they are spread very thin dealing with new nodes, 2.5D, 3D, shift left, multi-physics, AI algorithms – to name just a few. They already spend more on R&D than most technology companies as a percentage of revenue.

Perhaps Accellera could start to include discussion forums in events like DVCon. This would allow for an open discussion about the problems they need to have solved. Perhaps they could start to produce the EDA equivalent of the old ITRS roadmap. It sure would save a lot of time and energy (pun intended).

The post Design Tool Think Tank Required appeared first on Semiconductor Engineering.


Brain-Inspired, Silicon Optimized

By: Barry Pangrle
February 29, 2024 at 09:06

The 2024 International Solid State Circuits Conference was held this week in San Francisco. Submissions were up 40% and contributed to the quality of the papers accepted and the presentations given at the conference.

The mood about the future of semiconductor technology was decidedly upbeat, with predictions of a $1 trillion industry by 2030 and many expecting the soaring demand for AI-enabling silicon to speed up that timeline.

Dr. Kevin Zhang, Senior Vice President, Business Development and Overseas Operations Office for TSMC, showed the following slide during his opening plenary talk.

Fig. 1: TSMC semiconductor industry revenue forecast to 2030.

The 2030 semiconductor market by platform was broken out as 40% HPC, 30% Mobile, 15% Automotive, 10% IoT and 5% “Others”.

Dr. Zhang also outlined several new generations of transistor technologies, showing that there’s still more improvements to come.

Fig. 2: TSMC transistor architecture projected roadmap.

TSMC’s N2 will be going into production next year, transitioning TSMC from finFET to nanosheet transistors, and the figure shows a further step of stacking NMOS and PMOS transistors to get increased density in silicon.

Lip Bu Tan, Chairman, Walden International, also backed up the $1T prediction.

Fig. 3: Walden semiconductor market drivers.

Mr. Tan also referenced an MIT paper from September 2023 titled, “AI Models are devouring energy. Tools to reduce consumption are here, if data centers will adopt.” It states that huge, popular models like ChatGPT signal a trend of large-scale AI, boosting some forecasts that predict data centers could draw up to 21% of the world’s electricity supply by 2030. That is an astounding share: more than one-fifth of the world’s electricity.

There also appears to be a virtuous cycle of using this new AI technology to create even better computing machines.

Fig. 4: Walden design productivity improvements.

The figure above shows a history of order-of-magnitude improvements in design productivity to help engineers make use of all the transistors that have been scaling with Moore’s Law. There are also advances in packaging, and companies like AMD, Intel and Meta all presented papers on implementations using fine-pitch hybrid bonding to build systems with even higher densities. Mr. Tan presented data attributed to market.us predicting that AI will drive a CAGR of 42% in 3D-IC chiplet growth between 2023 and 2033.

Jonah Alben, Senior Vice President of GPU Engineering for NVIDIA, further backed up the claim of generative AI enabling better productivity and better designs. Figure 5 below shows how NVIDIA was able to use their PrefixRL AI system to produce better designs along a whole design curve and stated that this technology was used to design nearly 13,000 circuits in NVIDIA’s Hopper.

There was also a Tuesday night panel session on generative AI for design, and the fairly recent Si Catalyst panel discussion held last November was covered here. This is definitely an area that is growing and gaining momentum.

Fig. 5: NVIDIA example improvements from PrefixRL.

To wrap up, let’s look at some work that has been reporting best-in-class efficiency metrics: IBM’s NorthPole. Researchers at IBM published and presented paper 11.4: “IBM NorthPole: An Architecture for Neural Network Inference with a 12nm Chip.” Last September, after HotChips, the article IBM’s Energy-Efficient NorthPole AI Unit included many of the industry competition comparisons, so those won’t be repeated here, but we will look at some of the other results that were reported.

The brain-inspired research team has been working for over a decade at IBM. In fact, in October 2014 their earlier spike-based research was reported in the article Brain-Inspired Power. Like many so-called asynchronous approaches, the information and communication overhead for the spikes meant that the energy efficiency didn’t pan out and the team re-thought how to best incorporate brain model concepts into silicon, hence the brain-inspired, silicon optimized tag line.

NorthPole makes use of what IBM refers to as near memory compute. As pointed out and shown here, the memory is tightly integrated with the compute blocks, which reduces how far data must travel and saves energy. As shown in figure 6, for ResNet-50 NorthPole is most efficient running at approximately 680mV and approximately 200MHz (in 12nm FinFET technology). This yields an energy metric of ~1100 frames/joule (equivalently fps/W).

Fig. 6: NorthPole voltage/frequency scaling results for ResNet-50.

To optimize the communication for NorthPole, IBM created 4 NoCs:

  • Partial Sum NoC (PSNoC) communicates within a layer – for spatial computing
  • Activation NoC (ANoC) reorganizes activations between layers
  • Model NoC (MNoC) delivers weights during layer execution
  • Instruction NoC (INoC) delivers the program for each layer prior to layer start

The Instruction and Model NoCs share the same architecture. The protocols are full-custom, optimized for 0 stall cycles, and implemented as 2-D meshes. The PSNoC communicates across short distances and could be said to be NoC-ish. The ANoC is again its own custom protocol implementation. Along with software that compiles fully deterministic executables, performs no speculation, and optimizes the bit width of computations between 8-, 4- and 2-bit calculations, this all leads to a very efficient implementation.

Fig. 7: NorthPole exploded view of PCIe assembly.

IBM had a demonstration of NorthPole running at ISSCC. The unit is well designed for server use and the team is looking forward to the possibility of implementing NorthPole in a more advanced technology node. My thanks to John Arthur from IBM for taking some time to discuss NorthPole.

The post Brain-Inspired, Silicon Optimized appeared first on Semiconductor Engineering.


What’s Next For Power Electronics? Beyond Silicon

By: Emily Yan
February 29, 2024 at 09:05

For more than half a century, silicon has been the bedrock of power electronics. Yet as silicon meets its physical limitations in higher-power, higher-temperature applications, the industry’s relentless pursuit of more efficient power systems has ushered in the wide bandgap (WBG) semiconductor era. The global WBG semiconductor market reached $1.6 billion in 2022, with an estimated CAGR of 13% for the next 8-year period. The adoption of WBG semiconductors, notably silicon carbide (SiC) and gallium nitride (GaN), is now setting new benchmarks for performance in power systems across the automotive, industrial, and energy sectors. What impact will WBG semiconductors have on power electronics (PE) trends in 2024, and how are they redefining the design and simulation workflows for the next decade?

The catalyst for change: Wide bandgap

The term ‘bandgap’ refers to the energy difference between the top of a material’s valence band and the bottom of its conduction band, a critical factor determining its electrical conductivity.

As shown in figure 1, with its wide bandgap, Gallium Nitride (GaN) exemplifies the three key advantages this property can offer.

Fig. 1: Wide bandgap semiconductor properties.

  • Faster switching speeds: One of the most significant benefits of GaN’s wide bandgap is its contribution to faster switching speeds. The electron mobility in GaN is around 2,000 cm²/Vs, enabling switching frequencies up to 10 times higher than silicon. A higher switching speed translates into reduced switching losses, making the overall designs more compact and efficient.

Fig. 2: Switching speeds of SiC and GaN.

  • Higher thermal resilience: With a thermal conductivity of 2 W/cmK, GaN can dissipate heat efficiently and operate at temperatures up to 200°C. This resilience enables more effective thermal management at high temperatures and extreme conditions.
  • Higher voltages: With an electric breakdown field of 3.3 MV/cm, GaN can withstand almost 10 times silicon’s voltage.

GaN and other wide-bandgap semiconductors offer solutions for high-power, high-frequency, and high-temperature applications with improved energy efficiency and design flexibility.
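To see why faster edges matter, the sketch below applies the usual first-order hard-switching loss approximation, P_sw ≈ 0.5·V·I·(t_rise + t_fall)·f_sw. The device parameters are illustrative placeholders rather than datasheet values; the point is that an order-of-magnitude faster transition lets a GaN design run at roughly ten times the switching frequency for similar switching loss, which is what enables smaller magnetics and more compact converters.

```python
def switching_loss_w(v_bus, i_load, t_rise, t_fall, f_sw):
    """First-order hard-switching loss estimate in watts."""
    return 0.5 * v_bus * i_load * (t_rise + t_fall) * f_sw

V_BUS, I_LOAD = 400.0, 10.0   # illustrative 400 V / 10 A operating point

si_loss  = switching_loss_w(V_BUS, I_LOAD, 50e-9, 50e-9, 100e3)  # Si MOSFET at 100 kHz
gan_loss = switching_loss_w(V_BUS, I_LOAD, 5e-9, 5e-9, 1e6)      # GaN HEMT at 1 MHz

print(f"Si  @ 100 kHz: {si_loss:.1f} W")   # ~20 W
print(f"GaN @ 1 MHz:   {gan_loss:.1f} W")  # ~20 W, at 10x the switching frequency
```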

Emerging challenges in power electronics simulation

The wider adoption of wide bandgap (WBG) semiconductors, including silicon carbide (SiC) and gallium nitride (GaN), sets new standards in the design of Switched Mode Power Supplies (SMPS) – power efficiency, compactness, and lighter weight.

However, the higher switching speeds of GaN and SiC demand more sophisticated design considerations. Power electronics engineers must manage electromagnetic interference (EMI) and optimize thermal performance to ensure reliability and functionality. Layout parasitics, for instance, can lead to voltage spikes in the presence of high di/dt values. Power electronics engineers face the following pressing questions:

  • How can we guarantee reliability for mission-critical applications across a wider range of operating temperatures?
  • What are the essential practices for understanding and predicting EMI and noise?
  • What tools can we employ to create robust thermal models for comprehensive system-level analysis?

To navigate these complexities, engineers need advanced simulation solutions to address layout parasitic effects effectively and come with robust thermal analysis.

Fig. 3: Thermal analysis at board and schematic levels using Keysight’s PEPro.

Key impacts of WBG semiconductors: From EVs to renewable energy

Electric vehicles (EVs): As global EV sales are projected to increase by 21% in 2024, the power efficiency of automotive power electronics is paramount – every additional percentage is a big win. GaN enables more compact and efficient designs of onboard chargers and traction inverters, extending driving ranges by up to 6%.

Data centers: The digital economy’s expansion brings a surge in data center energy consumption, with the U.S. expected to require an additional 39 gigawatts over the next five years—equivalent to powering around 32 million homes. Wide bandgap semiconductors may be key to addressing this challenge by enabling higher server densities and reducing energy consumption and carbon emissions. Specifically, the implementation of GaN transistors in data center infrastructure can lead to a reduction of 100 metric tons of CO2 emissions for every 10 racks annually. This efficiency gain is particularly relevant as the computational and power demands of artificial intelligence (AI) applications soar, potentially tripling the racks’ power density.

Renewable energy: Wide bandgap semiconductors allow for more reliable power output and cost-effective solutions in both residential and commercial renewable energy storage systems. For instance, GaN transistors could achieve four times less power loss than traditional silicon-based power solutions.

The road ahead in the era of WBG semiconductors

GaN and SiC represent a new wave of material innovation to elevate the efficiencies of power electronics and redefine how we power our world. As the applications for WBG semiconductors expand, Keysight empowers our customers with a unified simulation environment to design reliable and long-lasting electronic systems under various operating conditions.

The post What’s Next For Power Electronics? Beyond Silicon appeared first on Semiconductor Engineering.


NoC Development – Make Or Buy?

By: Frank Schirrmeister

In the selection and qualification process for semiconductor IP, design teams often consider the cost of in-house development. Network-on-Chip (NoC) IP is no different. In “When Does My SoC Design Need A NoC?” Michael Frank and I argued that most of today’s designs – even less complex ones – can benefit from NoCs. In the blog “Balancing Memory And Coherence: Navigating Modern Chip Architectures,” I discussed the complexity that coherency adds to on-chip interconnect. After I described some of the steps of NoC development based on what ChatGPT 3.5 recommended in “Shortening Network-On-Chip Development Schedules Using Physical Awareness,” it’s time to look in more detail at the development efforts that design teams would have to invest to develop coherent NoCs from scratch.

ChatGPT, here we go again!

The prompt “Tell me how to develop an optimized network-on-chip for semiconductor design, considering the aspects of cache coherency” gives an excellent starting point in ChatGPT 4.0.

Understanding Protocols: First, one needs to understand cache coherency protocols. The recommendation is to study existing protocols before selecting one. Specifically, understand existing cache coherency protocols like MESI (Modified, Exclusive, Shared, Invalid), MOESI (Modified, Owned, Exclusive, Shared, Invalid), and directory-based protocols. Analyze their strengths and weaknesses in terms of scalability, latency, and bandwidth requirements. Then, choose a protocol that aligns with your performance goals and the scale of your NoC. Directory-based protocols are often preferred for larger-scale systems due to their scalability.

ChatGPT’s recommendation for the first step is a good start. I previously discussed the complexity of specific protocols like AMBA AXI, APB, ACE, CHI, OCP, CXL, and TileLink in “Design Complexity In The Golden Age Of Semiconductors.” One must read several thousand pages of documentation to understand the options here. And – by the way – these are orthogonal to the MESI/MOESI commentary from ChatGPT above, as these are implementation choices. In a practical scenario, many of these aspects depend on the building blocks the design team wants to license, like processors from the Arm, RISC-V, Arc, Tensilica, CEVA, and other ecosystems, as well as the protocol support in design IP blocks (think PCIe, UCIe, LPDDR) and accelerators for AI/ML.
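For readers who have not worked with these protocols, the sketch below captures the flavor of MESI as a simple state-transition table for a single cache line. It is deliberately simplified (no transient states, no exclusive-on-miss optimization, no distinction between snoop and directory messaging), so treat it as a mnemonic rather than a specification.

```python
# (state, event) -> next state for one cache line, heavily simplified
MESI_TRANSITIONS = {
    ("I", "local_read"):   "S",  # fill the line; conservatively assume other sharers exist
    ("I", "local_write"):  "M",  # read-for-ownership, then modify
    ("S", "local_write"):  "M",  # upgrade: other sharers get invalidated
    ("S", "remote_write"): "I",  # another core wrote: invalidate our copy
    ("E", "local_write"):  "M",  # silent upgrade, no coherency traffic needed
    ("E", "remote_read"):  "S",  # another core now shares the line
    ("M", "remote_read"):  "S",  # supply dirty data, downgrade to shared
    ("M", "remote_write"): "I",  # supply data, then invalidate
}

def next_state(state, event):
    # Events not listed leave the state unchanged (e.g., a local read while in M, E, or S)
    return MESI_TRANSITIONS.get((state, event), state)

print(next_state("I", "local_read"))   # S
print(next_state("S", "local_write"))  # M
```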

NoC Architecture Design: Second, ChatGPT recommends focusing on NoC architecture design. Decide on the NoC topology (e.g., mesh, torus, tree, or custom designs) based on the expected traffic pattern and the scalability requirements. Each topology has its specific advantages, as my colleague Andy Nightingale recently explained here. Furthermore, teams must design efficient routers to handle the chosen cache coherency protocol with minimal latency, implementing features like virtual channels to avoid deadlock and increase throughput. The final part of this step involves optimizing the network for bandwidth and latency by tuning the buffer sizes, employing efficient routing algorithms, and optimizing link widths and speeds.
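As one concrete example of a routing algorithm for the mesh topologies mentioned above, here is a minimal sketch of dimension-order (XY) routing, which is popular partly because it is deadlock-free on a 2-D mesh without needing extra virtual channels to break routing cycles. The router coordinates and the hop-by-hop path are purely illustrative.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers visited, resolving X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                    # travel along the X dimension first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                    # then travel along Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 3)))
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3)]
```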

Cache Coherency Mechanism Integration: Next up, ChatGPT recommends integrating the actual mechanisms of cache coherency. Integrating the cache coherency mechanism with the NoC involves efficient propagation of coherency messages (e.g., invalidate, update) across the network with minimal latency. Designing an efficient directory structure for directory-based protocols that can scale with the network and minimize the coherency traffic requires careful considerations of the placement of directories and the granularity of coherence (e.g., block-level vs. cache-line level).

By the way, for my query, it leaves out the option to handle coherency fully in software.

Simulation and Analysis: At this point, ChatGPT correctly recommends using simulation tools to model your NoC design and evaluate its performance under various workloads. Tools like Gem5, NS-3, or custom simulators can be helpful. I would add SystemC models to the arsenal of tools design teams working on this from scratch could use. Teams need to analyze key performance metrics such as latency, throughput, and energy consumption and pay special attention to the performance of the cache coherency mechanisms.

The last bit is indeed critical because, for coherent interconnects, the cost of a cache miss is drastically different from that of a cache hit.

Optimization and Scaling: This recommendation includes implementing adaptive routing algorithms and dynamic power management techniques to optimize performance and energy efficiency under varying workloads and ensuring the design can scale by adding more cores. This might involve modular design principles or hierarchical NoC structures.

Correct. But, in all practicality, at this point in the project a lot of time has passed without writing a single line of RTL. Management will ask, “What’s up here?” So, some RTL coding has already happened at this point. Iterations happen fast. Engineers will quickly blame marketing for iterative feature change requests like adding/removing interfaces, changing user bits, Quality of Service (QoS) requirements, address maps, safety needs, buffering, probes, interrupts, modules, etc. All of these can cause significant changes to the RTL. And at this point, teams often have not yet considered that the floorplan can cause more issues because of interface location, blockages, and fences.

Prototyping and Testing: Next, the recommendation is to use FPGA-based prototyping to validate your NoC design in hardware and to test the NoC in the context of the entire system, including the processor cores, memory hierarchy, and peripherals, to identify any issues with the cache coherency mechanism or other components.

True. Emulation and FPGA-based prototyping have become standard steps for most complex designs today. And especially the aspects of cache coherency in the context of the overall system and its software require very long test sequences.

Iterative Design and Feedback: The last recommendation is to use feedback from the simulation, prototyping, and testing phases to refine the NoC design iteratively and benchmark the final design using standard benchmark suites relevant to your target application domain to ensure that it meets the desired performance and efficiency goals.

The cost of “make”

Short of hiring a team of architects with relevant NoC development experience, the first five steps of Understanding Protocols, NoC Architecture Design, Cache Coherency Mechanism Integration, Simulation & Analysis, and Optimization & Scaling will take significant learning time, and writing the implementation specs is far from trivial.

Then, teams will spend most of the effort on RTL development and verification. Just imagine writing RTL protocol adapters for AMBA CHI-E, CHI-B, ACE, ACE-LITE, and AXI – connecting tens of IP blocks coherently – to address coherent and IO coherent use models. Even if you can reuse VIP from EDA vendors to check the protocol correctness, the effort is significant just for unit verification, as you will run thousands of tests.

For the actual interconnect, whether you use a heterogeneous, ring, or mesh topology, the development effort is significant. The logic that deals with directories to enable cache coherency can be complicated. And any change requests require, of course, re-coding!

Finally, when integrating everything in the system context, validating integration issues, including bring-up in emulation and the associated debug, consumes another significant chunk of effort.

Our customers tell us that, when all is said and done, they easily spend over 50 person-years just on coherent NoC development for complex designs.

Fig. 1: Network-on-chip automation for productivity and configurability.

Automation potential: What to expect from coherent NoC IP

There is a lot of automation potential in the seven steps above!

  • The various relevant protocols can be captured in a library of protocol converters, reducing the need to internalize and implement all the protocols the reused IP blocks speak. Ideally, these converters would already be pre-validated with popular IP blocks from leading IP vendors – think providers of Arm and RISC-V ISAs and vendors of interface blocks like LPDDR, PCIe, UCIe, etc., or graphics and AI/ML accelerators.
  • Graphical user interfaces and scripting APIs increase productivity in developing NoC architectures and topologies.
  • Like protocol converters, reusable blocks for directory and cache coherency management can increase development productivity. Their verification is especially critical, so ideally, your vendor has pre-verified them using VIP from EDA vendors and pre-validated the system integration with the ecosystem (think processor providers).
  • The refinement loop is probably the most critical one to optimize; done manually, each refinement iteration can burn significant schedule time. Besides reusable building blocks, you should look for configuration tools that automatically create performance models for architectural analysis, export new RTL configurations, and connect directly to digital implementation flows (a configuration sketch follows this list).
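As promised above, here is a hypothetical sketch of what such a configuration front end might capture for a coherent NoC. The schema, field names, and sanity checks are invented for illustration and do not reflect any vendor's actual tool format.

```python
# Hypothetical example of what a configuration front end for a coherent NoC
# generator might capture. Field names are invented for illustration only.
noc_config = {
    "topology": {"type": "mesh", "width": 4, "height": 4},
    "coherency": {"protocol": "CHI-E", "directory_granularity": "cache_line"},
    "agents": [
        {"name": "cpu_cluster0", "protocol": "CHI-E",    "coherent": True},
        {"name": "gpu0",         "protocol": "ACE",      "coherent": True},
        {"name": "dma0",         "protocol": "ACE-LITE", "coherent": False},  # IO-coherent
        {"name": "lpddr_ctrl0",  "protocol": "AXI",      "coherent": False},
    ],
    "qos": {"cpu_cluster0": "high", "gpu0": "medium", "dma0": "low"},
}

def validate(config: dict) -> list[str]:
    """Cheap sanity checks a generator front end might run before RTL export."""
    issues = []
    known_protocols = {"CHI-E", "CHI-B", "ACE", "ACE-LITE", "AXI"}
    for agent in config["agents"]:
        if agent["protocol"] not in known_protocols:
            issues.append(f"{agent['name']}: no protocol converter for {agent['protocol']}")
        if agent["coherent"] and agent["protocol"] == "AXI":
            issues.append(f"{agent['name']}: AXI agents cannot be fully coherent")
    return issues

print(validate(noc_config) or "configuration looks consistent")
```

The point of such a front end is that a change request becomes an edit to the configuration followed by regeneration, rather than hand re-coding of RTL.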

The verdict: Make or buy?

The illustration above summarizes some of the automation potential for NoCs. Saving more than 50 person-years makes buying an attractive alternative to developing NoC IP from scratch. Check out what Arteris does in this domain with its Ncore Cache Coherent Interconnect IP and FlexNoC 5 Interconnect IP for non-coherent designs.

The post NoC Development – Make Or Buy? appeared first on Semiconductor Engineering.


Navigating IoT Security

29 February 2024, 09:03

By Dana Neustadter (Synopsys), Ruud Derwig (Synopsys), and Martin Rösner (G+D)

IoT expansion requires secure and efficient connectivity between machines. Integrated SIM technology and remote SIM provisioning can make this possible.

Subscriber Identity Module (SIM) cards have been around for a long time, with Giesecke+Devrient (G+D) developing and delivering the first commercial SIM cards in 1991. If you have a cell phone, you will be familiar with these small security anchors that protect phones, networks, and data from fraud and misuse. They also enable phones to securely authenticate and communicate within a mobile network infrastructure managed by carriers. With deployment starting over the last few years, iSIM, also called an integrated Universal Integrated Circuit Card (iUICC), is the newest SIM kid on the block. iSIMs are embedded directly into a system on chip (SoC) as a tamper-resistant secure element, bringing trust and enabling secure connectivity and control. At the same time, they save cost and space, simplify the system development process, and offer a significant ease-of-use improvement in the way IoT device connectivity is activated and secured. In addition to these functional advantages, the iSIM also offers sustainability benefits such as reduced CO2 emissions.

Enabling iSIM to support remote SIM provisioning (RSP) calls for an integrated solution that brings together secure services and a secure SIM operating system (OS) with secure hardware. This is the type of challenge that has led to a collaboration between G+D, which continues to lead the way in SIM innovation, and Synopsys, with its expertise in trusted hardware. Putting their heads together, the two companies have come up with an innovative integrated, secure iSIM solution. In a nutshell, Synopsys tRoot Hardware Secure Modules (HSMs) provide silicon-proven, self-contained security IP solutions with root of trust. The HSMs are combined with G+D’s secure SIM OS to enable tamper-resistant elements which are usable within an SoC and serve as an isolated hardware component. G+D’s award-winning RSP services provide seamless management of the SIM profiles. Figure 1 depicts the secure iSIM solution.

“Offering the promise of seamless secure management of SIM profiles, iSIMs help accelerate the broad scaling of the IoT by providing high flexibility to choose the preferred cellular networks throughout the lifetime of devices,” said Andreas Morawietz, global head of Digital Connectivity Portfolio Strategy at G+D. “Our standards-compliant remote SIM provisioning service together with the secure SIM OS integrated with Synopsys’ tRoot Hardware Secure Modules provide an integrated iSIM secure solution at the start of the IoT value chain, delivering benefits to downstream IoT players.”

Fig. 1: Integrated, secure iSIM solution featuring Synopsys and G+D technologies. 

Evolution of SIM technology charts course for the IoT

The breadth of the IoT has become expansive thanks to the prevalence of cellular networks, sensors, cloud computing, AI, and other technologies that enable connectivity and intelligence. Consider, as one example, all the devices and systems that can bring a smart city to life. From traffic signals and streetlights to meters and energy grids, each of these systems must be able to collect and share data that leads to better decision-making and outcomes, as well as more efficient and effective processes. With the integration of AI capabilities, these devices would also be able to act autonomously. SIM technology acts as a trust anchor for secure identification, authentication, and communication. Over the years, as new devices have entered the IoT realm, users have come to expect increasingly seamless connectivity, simple remote management, and the ability to select their preferred carriers. As a result, removable physical SIMs have given way to embedded SIMs (eSIMs), which are soldered onto devices.

As the newest entry in this evolution, iSIMs are anticipated to grow in popularity, answering the call for more optimized, flexible, and secure solutions to allow more things to be connected and controlled. Because it isn't a separate chip, an iSIM provides cost, power, and area efficiency, making it ideal for small, battery-powered IoT devices, particularly those that operate in low-power wide area networks (LPWANs) through narrowband IoT (NB-IoT) or long-term evolution for machines (LTE-M) technologies. iSIMs also work well in larger industrial systems such as smart meters or even vehicles. In such systems, the SIM technology could be located in hard-to-reach places, making remote management an ideal approach.

While iSIM and remote provisioning are opening up a new ecosystem, these capabilities are only appealing if they’re backed by rock-solid security. Fortunately, there’s quite a lot of technology available to secure network connections and authenticate communicating partners. iSIMs must be developed to offer the same level of security as traditional SIM solutions. For network operators to trust iSIMs, security certification of iSIMs is essential. This is especially important since the network operators don’t control the SIM hardware and software, as it comes with the IoT device and can originate from any number of vendors.

Complete, integrated IoT security solution

The result of the Synopsys and G+D collaboration has been successfully deployed in the field and acknowledged by Tier 1 operators for several years. Our efforts bring together complementary technologies that form a complete iSIM security solution for integration into a baseband SoC.

Synopsys’ tRoot HSMs are ideal for SoCs supporting a variety of applications in addition to the IoT, including industrial control, networking, automotive, media, and mobile devices. A hardware root of trust allows chip manufacturers and their OEM customers to create a strong cryptographic device identity for a unique device instance and provides a secure environment for protecting sensitive data and operations. In addition to the secure SIM OS, G+D also provides secure remote provisioning and device management services. The resulting iSIM solution comes with a small footprint and low power consumption and allows for more efficient production, faster time to market and, without extra housing or plastic, greater sustainability. With encrypted loading of chip-unique data (the SIM BLOB, or binary large object), the iSIM can be installed on a chipset without certification of the production facility.
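As a purely conceptual sketch of that last idea, the snippet below seals personalization data with authenticated encryption under a key that only the target chip is assumed to hold, so the sealed BLOB can be handled in an untrusted production line. The key handling, data layout, and labels are invented for illustration; real iSIM provisioning follows GSMA specifications and the vendors' own formats.

```python
# Conceptual sketch only: protecting chip-unique personalization data with
# authenticated encryption so it can be loaded in an untrusted factory.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def seal_blob(chip_unique_key: bytes, personalization_data: bytes) -> bytes:
    """Encrypt and authenticate the BLOB under a key only the target chip holds."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(chip_unique_key).encrypt(nonce, personalization_data, b"iSIM-BLOB")
    return nonce + ciphertext

def load_blob(chip_unique_key: bytes, sealed: bytes) -> bytes:
    """On-chip side: decryption fails unless the key and the data are authentic."""
    nonce, ciphertext = sealed[:12], sealed[12:]
    return AESGCM(chip_unique_key).decrypt(nonce, ciphertext, b"iSIM-BLOB")

key = AESGCM.generate_key(bit_length=256)   # stands in for a chip-unique root key
blob = seal_blob(key, b"subscription profile + credentials")
assert load_blob(key, blob) == b"subscription profile + credentials"
print("sealed BLOB size:", len(blob), "bytes")
```

Because the factory only ever sees the sealed BLOB, the confidentiality of the personalization data does not depend on certifying the production facility itself.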

As smart, connected devices become more ubiquitous, ensuring that the devices and their data remain safe from security threats will stay a top priority. Secure remote SIM provisioning helps to streamline device connections and controls in a safe manner. Through our collaboration, Synopsys and G+D are providing mobile network operators and semiconductor manufacturers with a complete security solution for all-in-one connectivity that can nurture the continued expansion of the IoT.

For more information, see Synopsys tRoot Hardware Secure Modules (HSMs).

Ruud Derwig is a principal software engineer for Security IP Solutions at Synopsys.

Martin Rösner is the director for the Digital Connectivity Portfolio Strategy at G+D.

The post Navigating IoT Security appeared first on Semiconductor Engineering.


Weak Verification Plans Lead To Project Disarray

29 February 2024, 09:02

The purpose of the verification plan, or vPlan as we call it, is to capture all the verification goals needed to prove that the device works as specified. It’s a big responsibility! Getting it right means having a good blueprint for verification closure. However, getting it wrong could result in bug escapes, wasted resources, and possibly a device that fails altogether. With the focus on AI-driven verification, the efficiency and effectiveness of verification planning are expected to improve significantly.

There are several key elements needed to create a good vPlan. We will go over a few below.

Accurate verification features are needed for verification closure

The concept of divide and conquer suggests that every complex feature can be broken down into sub-features, which in turn can be further divided. Verisium Manager’s Planning Center facilitates this process by enabling users to create expandable/collapsible feature sections, a capability that is crucial for maintaining quality.
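As a generic illustration of that divide-and-conquer structure (not Verisium Manager's actual data model), a vPlan can be thought of as a feature tree in which closure rolls up from the leaves:

```python
# Minimal sketch of a hierarchical vPlan: every feature can be split into
# sub-features, and closure rolls up from the leaves. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Feature:
    name: str
    covered: bool = False                              # leaf-level verification status
    subfeatures: list["Feature"] = field(default_factory=list)

    def leaves(self) -> list["Feature"]:
        if not self.subfeatures:
            return [self]
        return [leaf for sub in self.subfeatures for leaf in sub.leaves()]

    def closure(self) -> float:
        """Fraction of leaf features verified under this node."""
        leaves = self.leaves()
        return sum(f.covered for f in leaves) / len(leaves)

vplan = Feature("cache_controller", subfeatures=[
    Feature("read_path", subfeatures=[
        Feature("hit", covered=True),
        Feature("miss", covered=False),
    ]),
    Feature("write_path", covered=True),
])
print(f"closure: {vplan.closure():.0%}")   # 2 of 3 leaves verified -> 67%
```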

Close alignment to the functional specification

Close adherence to the functional specification goes hand in hand with the first point. Any new features or changes to existing ones should prompt immediate updates to the vPlan; failing to do so can affect verification quality. The Planning Center allows users to associate paragraphs in the specification with the vPlan and provides notifications of any corresponding alterations, so users can adjust the vPlan to stay aligned with the specification.
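One hypothetical way to picture this spec-to-vPlan association is to fingerprint each referenced specification paragraph and flag features whose paragraph has changed. The hash scheme below is my own stand-in for illustration and is not how the Planning Center is implemented.

```python
# Illustrative spec-to-vPlan traceability: each feature is tied to a spec
# paragraph, and a change in the paragraph's text flags the feature for review.
import hashlib

def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

# Feature -> hash of the spec paragraph it was written against (baseline).
feature_baseline = {
    "read_path.miss": fingerprint("A read miss allocates a line and fetches from memory."),
}

def stale_features(baseline: dict, current_spec: dict) -> list[str]:
    """Return features whose associated spec paragraph has changed."""
    return [feat for feat, old in baseline.items()
            if fingerprint(current_spec[feat]) != old]

current_spec = {
    "read_path.miss": "A read miss allocates a line, fetches from memory, and updates LRU.",
}
print("needs vPlan update:", stale_features(feature_baseline, current_spec))
```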

Connecting relevant metrics to vPlan features

Once the vPlan is defined, it’s important to connect the relevant metrics to demonstrate verification assurance for each feature. This may involve using a combination of code coverage, functional coverage, or directed tests to provide that assurance. The Planning Center makes connecting these metrics to the vPlan very straightforward. Failing to link these metrics with the features could result in insufficiently verified features.
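To illustrate the idea of tying metrics back to features (with made-up metric names and thresholds), a feature could be considered closed only when every metric mapped to it has reached its target:

```python
# Small sketch of linking verification metrics to vPlan features: a feature
# counts as closed only when every metric mapped to it reaches its target.
feature_metrics = {
    "read_path.hit":  [("functional_coverage", 0.98), ("code_coverage", 0.95)],
    "read_path.miss": [("functional_coverage", 0.72), ("directed_tests", 1.00)],
}
CLOSURE_THRESHOLD = 0.95   # illustrative target, not a recommended value

def feature_status(metrics: list[tuple[str, float]]) -> str:
    return "closed" if all(score >= CLOSURE_THRESHOLD for _, score in metrics) else "open"

for feature, metrics in feature_metrics.items():
    print(f"{feature:15s} {feature_status(metrics)}")
```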

Showing real-time results

To effectively monitor progress and respond promptly to areas requiring attention, the vPlan should dynamically reflect the results in real time. This allows for accurate measurement of progress and focused allocation of resources. Delayed results could lead to wasted project time in non-priority areas. Verisium Manager’s vPlan Analysis automates this process, enabling users to view real-time vPlan status for relevant regressions.

Customers have shared that vPlan quality significantly influences project outcomes. It’s crucial to prioritize creating higher quality vPlans, rather than simply focusing on speed. However, maintaining consistent high quality can be challenging due to the human tendency to quickly lose interest, with initial strong efforts tapering off as the process continues.

A thorough verification plan is the key to success in ASIC verification. Verification reuse is critical to the productivity and efficiency of system-on-chip (SoC), and a good vPlan is the first step in this direction. If you’re a verification engineer, take the time to develop a thorough verification plan for your next project. It will be one of the best investments you can make in the success of your project.

The post Weak Verification Plans Lead To Project Disarray appeared first on Semiconductor Engineering.
