FreshRSS

Normální zobrazení

Jsou dostupné nové články, klikněte pro obnovení stránky.
PředevčíremHlavní kanál
  • ✇Semiconductor Engineering
  • Chip Industry Week in ReviewThe SE Staff
    Okinawa Institute of Science and Technology proposed a new EUV litho technology using only four reflective mirrors and a new method of illumination optics that it claims will use 1/10 the power and cost half as much as existing EUV technology from ASML. Applied Materials may not receive expected U.S. funding to build a $4 billion research facility in Sunnyvale, CA, due to internal government disagreements over how to fund chip R&D, according to Bloomberg. SEMI published a position paper this
     

Chip Industry Week in Review

2. Srpen 2024 v 09:01

Okinawa Institute of Science and Technology proposed a new EUV litho technology using only four reflective mirrors and a new method of illumination optics that it claims will use 1/10 the power and cost half as much as existing EUV technology from ASML.

Applied Materials may not receive expected U.S. funding to build a $4 billion research facility in Sunnyvale, CA, due to internal government disagreements over how to fund chip R&D, according to Bloomberg.

SEMI published a position paper this week cautioning the European Union against imposing additional export controls to allow companies, encouraging them to  be “as free as possible in their investment decisions to avoid losing their agility and relevance across global markets.” SEMI’s recommendations on outbound investments are in response to the European Economic Security Strategy and emphasize the need for a transparent and predictable regulatory framework.

The U.S. may restrict China’s access to HBM chips and the equipment needed to make them, reports Bloomberg. Today those chips are manufactured by two Korean-based companies, Samsung and SK hynix, but U.S.-based Micron expects to begin shipping 12-high stacks of HBM3E in 2025, and is currently working on HBM4.

Synopsys executive chair and founder Dr. Aart de Geus was named the winner of the Semiconductor Industry Association’s Robert N. Noyce Award. De Geus was selected due to his contributions to EDA technology over a career spanning more than four decades.

The top three foundries plan to implement high-NA EUV lithography as early as 2025 for the 18 angstrom generation, but the replacement of single exposure high-NA (0.55) over double patterning with standard EUV (NA = 0.33) depends on whether it provides better results at a reasonable cost per wafer.

Quick links to more news:

Global
In-Depth
Market Reports and Earnings
Education and Training
Security
Product News
Research
Events and Further Reading


Global

Belgium-based Imec released part 2 of its chiplets series, addressing testing strategies and standardization efforts, as well as guidelines and research “towards efficient ESD protection strategies for advanced 3D systems-on-chip.”

Also in Belgium, BelGan, maker of GaN chips, filed for bankruptcy according to the Brussels Times.

TSMC‘s Dresden, Germany, plant will break ground this month.

The UK will dole out more than £100 million (~US $128 million) in funding to develop five new quantum research hubs in Glasgow, Edinburgh, Birmingham, Oxford, and London.

MassPhoton is opening Hong Kong‘s first ultra-high vacuum GaN epitaxial wafer pilot line and will establish a GaN research center.

Infineon completed the sale of its manufacturing sites in the Philippines and South Korea to ASE.

Israel-based RAAAM Memory Technologies received a €5.25 million grant from the European Innovation Council (EIC) to support the development and commercialization of its innovative memory solutions. This funding will enable RAAAM to advance its research in high-performance and energy-efficient memory technologies, accelerating their integration into various applications and markets.


In-Depth

Semiconductor Engineering published its Automotive, Security and Pervasive Computing newsletter this week, featuring these top stories and video:

And:


Market Reports and Earnings

The semiconductor equipment industry is on a positive trajectory in 2024, with moderate revenue growth observed in Q2 after a subdued Q1, according to a new report from Yole Group. Wafer Fab Equipment revenue is projected to grow by 1.3% year-on-year, despite a 12% drop in Q1. Test equipment lead times are normalizing, improving order conditions. Key areas driving growth include memory and logic capital expenditures and high-bandwidth memory demand.

Worldwide silicon wafer shipments increased by 7% in Q2 2024, according to SEMI‘s latest report. This growth is attributed to robust demand from multiple semiconductor sectors, driven by advancements in AI, 5G, and automotive technologies.

The RF GaN market is projected to grow to US $2 billion by 2029, a 10% CAGR, according to Yole Group.

Counterpoint released their Q2 smartphone top 10 report.

Renesas completed their acquisition of EDA firm Altium, best known for its EDA platform and freeware CircuitMaker package.

It’s earnings season and here are recently released financials in the chip industry:

AMD  Advantest   Amkor   Ansys  Arteris   Arm   ASE   ASM   ASML
Cadence  IBM   Intel   Lam Research   Lattice   Nordson   NXP   Onsemi 
Qualcomm   Rambus  Samsung    SK Hynix   STMicro   Teradyne    TI  
Tower  TSMC    UMC  Western Digital

Industry stock price impacts are here.


Education and Training

Rochester Institute of Technology is leading a new pilot program to prepare community college students in areas such as cleanroom operations, new materials, simulation, and testing processes, with the intent of eventual transfer into RIT’s microelectronic engineering program.

Purdue University inked a deal with three research institutions — University of Piraeus, Technical University of Crete, and King’s College London —to develop joint research programs for semiconductors, AI and other critical technology fields.

The European Chips Skills Academy formed the Educational Leaders Board to help bridge the talent gap in Europe’s microelectronics sector.  The Board includes representatives from universities, vocational training providers, educators and research institutions who collaborate on strategic initiatives to strengthen university networks and build academic expertise through ECSA training programs.


Security

The Cybersecurity and Infrastructure Security Agency (CISA) is encouraging Apple users to review and apply this week’s recent security updates.

Microsoft Azure experienced a nearly 10 hour DDoS attack this week, leading to global service disruption for many customers.  “While the initial trigger event was a Distributed Denial-of-Service (DDoS) attack, which activated our DDoS protection mechanisms, initial investigations suggest that an error in the implementation of our defenses amplified the impact of the attack rather than mitigating it,” stated Microsoft in a release.

NIST published:

  • “Recommendations For Increasing U.S. Participation and Leadership in Standards Development,” a report outlining cybersecurity recommendations and mitigation strategies.
  • Final guidance documents and software to help improve the “safety, security and trustworthiness of AI systems.”
  • Cloud Computing Forensic Reference Architecture guide.

Delta Air Lines plans to seek damages after losing $500 million in lost revenue due to security company CrowdStrike‘s software update debacle.  And shareholders are also angry.

Recent security research:

  • Physically Secure Logic Locking With Nanomagnet Logic (UT Dallas)
  • WBP: Training-time Backdoor Attacks through HW-based Weight Bit Poisoning (UCF)
  • S-Tune: SOT-MTJ Manufacturing Parameters Tuning for Secure Next Generation of Computing ( U. of Arizona, UCF)
  • Diffie Hellman Picture Show: Key Exchange Stories from Commercial VoWiFi Deployments (CISPA, SBA Research, U. of Vienna)

Product News

Lam Research introduced a new version of its cryogenic etch technology designed to enhance the manufacturing of 3D NAND for AI applications. This technology allows for the precise etching of high aspect ratio features, crucial for creating 1,000-layer 3D NAND.


Fig.1: 3D NAND etch. Source: Lam Research

Alphawave Semi launched its Universal Chiplet Interconnect Express Die-to-Die IP. The subsystem offers 8 Tbps/mm bandwidth density and supports operation at 24 Gbps for D2D connectivity.

Infineon introduced a new MCU series for industrial and consumer motor controls, as well as power conversion system applications. The company also unveiled its new GoolGaN Drive product family of integrated single switches and half-bridges with integrated drivers.

Rambus released its DDR5 Client Clock Driver for next-gen, high-performance desktops and notebooks. The chips include Gen1 to Gen4 RCDs, power management ICs, Serial Presence Detect Hubs, and temperature sensors for leading-edge servers.

SK hynix introduced its new GDDR7 graphics DRAM. The product has an operating speed of 32Gbps, can process 1.5TB of data per second and has a 50% power efficiency improvement compared to the previous generation.

Intel launched its new Lunar Lake Ultra processors. The long awaited chips will be included in more than 80 laptop designs and has more than 40 NPU tera operations per second as well as over 60 GPU TOPS delivering more than 100 platform TOPS.

Brewer Science achieved recertification as a Certified B Corporation, reaffirming its commitment to sustainable and ethical business practices.

Panasonic adopted Siemens’ Teamcenter X cloud product lifecycle management solution, citing Teamcenter X’s Mendix low-code platform, improved operational efficiency and flexibility for its choice.

Keysight validated its 5G NR FR1 1024-QAM demodulation test cases for the first time. The 5G NR radio access technology supports eMBB and was validated on the 3GPP TS 38.521-4 test specification.


Research

In a 47-page deep-dive report, the Center for Security and Emerging Technology delved into all of the scientific breakthroughs from 1980 to present that brought EUV lithography to commercialization, including lessons learned for the next emerging technologies.

Researchers at the Paul Scherrer Institute developed a high-performance X-ray tomography technique using burst ptychography, achieving a resolution of 4nm. This method allows for non-destructive imaging of integrated circuits, providing detailed views of nanostructures in materials like silicon and metals.

MIT signed a four-year agreement with the Novo Nordisk Foundation Quantum Computing Programme at University of Copenhagen, focused on accelerating quantum computing hardware research.

MIT’s Research Laboratory of Electronics (RLE) developed a mechanically flexible wafer-scale integrated photonics fabrication platform. This enables the creation of flexible photonic circuits that maintain high performance while being bendable and stretchable. It offers significant potential for integrating photonic circuits into various flexible substrate applications in wearable technology, medical devices, and flexible electronics.

The Naval Research Lab identified a new class of semiconductor nanocrystals with bright ground-state excitons, emphasizing an important advancement in optoelectronics.

Researchers from National University of Singapore developed a novel method, known as tension-driven CHARM3D,  to fabricate 3D self-healing circuits, enabling the 3D printing of free-standing metallic structures without the need for support materials and external pressure.

Find more research in our Technical Papers library.


Events and Further Reading

Find upcoming chip industry events here, including:

Event Date Location
Atomic Layer Deposition (ALD 2024) Aug 4 – 7 Helsinki
Flash Memory Summit Aug 6 – 8 Santa Clara, CA
USENIX Security Symposium Aug 14 – 16 Philadelphia, PA
SPIE Optics + Photonics 2024 Aug 18 – 22 San Diego, CA
Cadence Cloud Tech Day Aug 20 San Jose, CA
Hot Chips 2024 Aug 25- 27 Stanford University/ Hybrid
Optica Online Industry Meeting: PIC Manufacturing, Packaging and Testing (imec) Aug 27 Online
SEMICON Taiwan Sep 4 -6 Taipei
DVCON Taiwan Sep 10 – 11 Hsinchu
AI HW and Edge AI Summit Sep 9 – 12 San Jose, CA
GSA Executive Forum Sep 26 Menlo Park, CA
SPIE Photomask Technology + EUVL Sep 29 – Oct 3 Monterey, CA
Strategic Materials Conference: SMC 2024 Sep 30 – Oct 2 San Jose, CA
Find All Upcoming Events Here

Upcoming webinars are here, including topics such as quantum safe cryptography, analytics for high-volume manufacturing, and mastering EMC simulations for electronic design.

Find Semiconductor Engineering’s latest newsletters here:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials

 

The post Chip Industry Week in Review appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Chip Industry Week in ReviewThe SE Staff
    Okinawa Institute of Science and Technology proposed a new EUV litho technology using only four reflective mirrors and a new method of illumination optics that it claims will use 1/10 the power and cost half as much as existing EUV technology from ASML. Applied Materials may not receive expected U.S. funding to build a $4 billion research facility in Sunnyvale, CA, due to internal government disagreements over how to fund chip R&D, according to Bloomberg. SEMI published a position paper this
     

Chip Industry Week in Review

2. Srpen 2024 v 09:01

Okinawa Institute of Science and Technology proposed a new EUV litho technology using only four reflective mirrors and a new method of illumination optics that it claims will use 1/10 the power and cost half as much as existing EUV technology from ASML.

Applied Materials may not receive expected U.S. funding to build a $4 billion research facility in Sunnyvale, CA, due to internal government disagreements over how to fund chip R&D, according to Bloomberg.

SEMI published a position paper this week cautioning the European Union against imposing additional export controls to allow companies, encouraging them to  be “as free as possible in their investment decisions to avoid losing their agility and relevance across global markets.” SEMI’s recommendations on outbound investments are in response to the European Economic Security Strategy and emphasize the need for a transparent and predictable regulatory framework.

The U.S. may restrict China’s access to HBM chips and the equipment needed to make them, reports Bloomberg. Today those chips are manufactured by two Korean-based companies, Samsung and SK hynix, but U.S.-based Micron expects to begin shipping 12-high stacks of HBM3E in 2025, and is currently working on HBM4.

Synopsys executive chair and founder Dr. Aart de Geus was named the winner of the Semiconductor Industry Association’s Robert N. Noyce Award. De Geus was selected due to his contributions to EDA technology over a career spanning more than four decades.

The top three foundries plan to implement high-NA EUV lithography as early as 2025 for the 18 angstrom generation, but the replacement of single exposure high-NA (0.55) over double patterning with standard EUV (NA = 0.33) depends on whether it provides better results at a reasonable cost per wafer.

Quick links to more news:

Global
In-Depth
Market Reports and Earnings
Education and Training
Security
Product News
Research
Events and Further Reading


Global

Belgium-based Imec released part 2 of its chiplets series, addressing testing strategies and standardization efforts, as well as guidelines and research “towards efficient ESD protection strategies for advanced 3D systems-on-chip.”

Also in Belgium, BelGan, maker of GaN chips, filed for bankruptcy according to the Brussels Times.

TSMC‘s Dresden, Germany, plant will break ground this month.

The UK will dole out more than £100 million (~US $128 million) in funding to develop five new quantum research hubs in Glasgow, Edinburgh, Birmingham, Oxford, and London.

MassPhoton is opening Hong Kong‘s first ultra-high vacuum GaN epitaxial wafer pilot line and will establish a GaN research center.

Infineon completed the sale of its manufacturing sites in the Philippines and South Korea to ASE.

Israel-based RAAAM Memory Technologies received a €5.25 million grant from the European Innovation Council (EIC) to support the development and commercialization of its innovative memory solutions. This funding will enable RAAAM to advance its research in high-performance and energy-efficient memory technologies, accelerating their integration into various applications and markets.


In-Depth

Semiconductor Engineering published its Automotive, Security and Pervasive Computing newsletter this week, featuring these top stories and video:

And:


Market Reports and Earnings

The semiconductor equipment industry is on a positive trajectory in 2024, with moderate revenue growth observed in Q2 after a subdued Q1, according to a new report from Yole Group. Wafer Fab Equipment revenue is projected to grow by 1.3% year-on-year, despite a 12% drop in Q1. Test equipment lead times are normalizing, improving order conditions. Key areas driving growth include memory and logic capital expenditures and high-bandwidth memory demand.

Worldwide silicon wafer shipments increased by 7% in Q2 2024, according to SEMI‘s latest report. This growth is attributed to robust demand from multiple semiconductor sectors, driven by advancements in AI, 5G, and automotive technologies.

The RF GaN market is projected to grow to US $2 billion by 2029, a 10% CAGR, according to Yole Group.

Counterpoint released their Q2 smartphone top 10 report.

Renesas completed their acquisition of EDA firm Altium, best known for its EDA platform and freeware CircuitMaker package.

It’s earnings season and here are recently released financials in the chip industry:

AMD  Advantest   Amkor   Ansys  Arteris   Arm   ASE   ASM   ASML
Cadence  IBM   Intel   Lam Research   Lattice   Nordson   NXP   Onsemi 
Qualcomm   Rambus  Samsung    SK Hynix   STMicro   Teradyne    TI  
Tower  TSMC    UMC  Western Digital

Industry stock price impacts are here.


Education and Training

Rochester Institute of Technology is leading a new pilot program to prepare community college students in areas such as cleanroom operations, new materials, simulation, and testing processes, with the intent of eventual transfer into RIT’s microelectronic engineering program.

Purdue University inked a deal with three research institutions — University of Piraeus, Technical University of Crete, and King’s College London —to develop joint research programs for semiconductors, AI and other critical technology fields.

The European Chips Skills Academy formed the Educational Leaders Board to help bridge the talent gap in Europe’s microelectronics sector.  The Board includes representatives from universities, vocational training providers, educators and research institutions who collaborate on strategic initiatives to strengthen university networks and build academic expertise through ECSA training programs.


Security

The Cybersecurity and Infrastructure Security Agency (CISA) is encouraging Apple users to review and apply this week’s recent security updates.

Microsoft Azure experienced a nearly 10 hour DDoS attack this week, leading to global service disruption for many customers.  “While the initial trigger event was a Distributed Denial-of-Service (DDoS) attack, which activated our DDoS protection mechanisms, initial investigations suggest that an error in the implementation of our defenses amplified the impact of the attack rather than mitigating it,” stated Microsoft in a release.

NIST published:

  • “Recommendations For Increasing U.S. Participation and Leadership in Standards Development,” a report outlining cybersecurity recommendations and mitigation strategies.
  • Final guidance documents and software to help improve the “safety, security and trustworthiness of AI systems.”
  • Cloud Computing Forensic Reference Architecture guide.

Delta Air Lines plans to seek damages after losing $500 million in lost revenue due to security company CrowdStrike‘s software update debacle.  And shareholders are also angry.

Recent security research:

  • Physically Secure Logic Locking With Nanomagnet Logic (UT Dallas)
  • WBP: Training-time Backdoor Attacks through HW-based Weight Bit Poisoning (UCF)
  • S-Tune: SOT-MTJ Manufacturing Parameters Tuning for Secure Next Generation of Computing ( U. of Arizona, UCF)
  • Diffie Hellman Picture Show: Key Exchange Stories from Commercial VoWiFi Deployments (CISPA, SBA Research, U. of Vienna)

Product News

Lam Research introduced a new version of its cryogenic etch technology designed to enhance the manufacturing of 3D NAND for AI applications. This technology allows for the precise etching of high aspect ratio features, crucial for creating 1,000-layer 3D NAND.


Fig.1: 3D NAND etch. Source: Lam Research

Alphawave Semi launched its Universal Chiplet Interconnect Express Die-toDie IP. The subsystem offers 8 Tbps/mm bandwidth density and supports operation at 24 Gbps for D2D connectivity.

Infineon introduced a new MCU series for industrial and consumer motor controls, as well as power conversion system applications. The company also unveiled its new GoolGaN Drive product family of integrated single switches and half-bridges with integrated drivers.

Rambus released its DDR5 Client Clock Driver for next-gen, high-performance desktops and notebooks. The chips include Gen1 to Gen4 RCDs, power management ICs, Serial Presence Detect Hubs, and temperature sensors for leading-edge servers.

SK hynix introduced its new GDDR7 graphics DRAM. The product has an operating speed of 32Gbps, can process 1.5TB of data per second and has a 50% power efficiency improvement compared to the previous generation.

Intel launched its new Lunar Lake Ultra processors. The long awaited chips will be included in more than 80 laptop designs and has more than 40 NPU tera operations per second as well as over 60 GPU TOPS delivering more than 100 platform TOPS.

Brewer Science achieved recertification as a Certified B Corporation, reaffirming its commitment to sustainable and ethical business practices.

Panasonic adopted Siemens’ Teamcenter X cloud product lifecycle management solution, citing Teamcenter X’s Mendix low-code platform, improved operational efficiency and flexibility for its choice.

Keysight validated its 5G NR FR1 1024-QAM demodulation test cases for the first time. The 5G NR radio access technology supports eMBB and was validated on the 3GPP TS 38.521-4 test specification.


Research

In a 47-page deep-dive report, the Center for Security and Emerging Technology delved into all of the scientific breakthroughs from 1980 to present that brought EUV lithography to commercialization, including lessons learned for the next emerging technologies.

Researchers at the Paul Scherrer Institute developed a high-performance X-ray tomography technique using burst ptychography, achieving a resolution of 4nm. This method allows for non-destructive imaging of integrated circuits, providing detailed views of nanostructures in materials like silicon and metals.

MIT signed a four-year agreement with the Novo Nordisk Foundation Quantum Computing Programme at University of Copenhagen, focused on accelerating quantum computing hardware research.

MIT’s Research Laboratory of Electronics (RLE) developed a mechanically flexible wafer-scale integrated photonics fabrication platform. This enables the creation of flexible photonic circuits that maintain high performance while being bendable and stretchable. It offers significant potential for integrating photonic circuits into various flexible substrate applications in wearable technology, medical devices, and flexible electronics.

The Naval Research Lab identified a new class of semiconductor nanocrystals with bright ground-state excitons, emphasizing an important advancement in optoelectronics.

Researchers from National University of Singapore developed a novel method, known as tension-driven CHARM3D,  to fabricate 3D self-healing circuits, enabling the 3D printing of free-standing metallic structures without the need for support materials and external pressure.

Find more research in our Technical Papers library.


Events and Further Reading

Find upcoming chip industry events here, including:

Event Date Location
Atomic Layer Deposition (ALD 2024) Aug 4 – 7 Helsinki
Flash Memory Summit Aug 6 – 8 Santa Clara, CA
USENIX Security Symposium Aug 14 – 16 Philadelphia, PA
SPIE Optics + Photonics 2024 Aug 18 – 22 San Diego, CA
Cadence Cloud Tech Day Aug 20 San Jose, CA
Hot Chips 2024 Aug 25- 27 Stanford University/ Hybrid
Optica Online Industry Meeting: PIC Manufacturing, Packaging and Testing (imec) Aug 27 Online
SEMICON Taiwan Sep 4 -6 Taipei
DVCON Taiwan Sep 10 – 11 Hsinchu
AI HW and Edge AI Summit Sep 9 – 12 San Jose, CA
GSA Executive Forum Sep 26 Menlo Park, CA
SPIE Photomask Technology + EUVL Sep 29 – Oct 3 Monterey, CA
Strategic Materials Conference: SMC 2024 Sep 30 – Oct 2 San Jose, CA
Find All Upcoming Events Here

Upcoming webinars are here, including topics such as quantum safe cryptography, analytics for high-volume manufacturing, and mastering EMC simulations for electronic design.

Find Semiconductor Engineering’s latest newsletters here:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials

 

The post Chip Industry Week in Review appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • RISC-V Heralds New Era Of CooperationBrian Bailey
    RISC-V is paving the way for open source to become accepted within the hardware community, creating a level of industry collaboration never seen in the past, while revitalizing the connection between academia and industry. The big question is whether this arrangement is just a placeholder while the industry re-learns how to develop processors, or whether this processor architecture is something very different. In either case, there is a clear and pressing need for more flexible processor archite
     

RISC-V Heralds New Era Of Cooperation

30. Květen 2024 v 09:05

RISC-V is paving the way for open source to become accepted within the hardware community, creating a level of industry collaboration never seen in the past, while revitalizing the connection between academia and industry.

The big question is whether this arrangement is just a placeholder while the industry re-learns how to develop processors, or whether this processor architecture is something very different. In either case, there is a clear and pressing need for more flexible processor architectures, and at least for now, RISC-V has filled a void.

“RISC-V was born out of academia and has had strong collaboration within universities from day one,” says Loren Hobbs, vice president of product and business development at Bluespec. “This collaboration continues today, with many of the most popular open-source RISC-V processors having come from universities. Organizations such as OpenHW Group and CHIPS Alliance serve a central and critical role in driving the collaboration, which is bi-directional between the academic community and industry.”

Collaboration of this type has not existed with the industrial community in the past. “We are learning from each other,” says Florian Wohlrab, CEO at OpenHW. “We are learning best practices for verification. At the same time, we are learning what things to avoid. It is growing where people say, ‘Yes, I really get benefit from sharing ideas.'”

The need for processor flexibility exists within industry as well as academia. “There is a need within the industry for diversification on the processor front,” says Neil Hand, director of marketing at Siemens EDA. “In the past, this led to a fragmented set of companies that couldn’t work together. They didn’t see the value of working together. But RISC-V has a cohesive central organization where anyone who wants to get into the processor space can collaborate. They don’t have to expose their secret sauce, but they get to benefit from each other. A rising tide lifts all boats, and that’s really the situation we’re at with RISC-V.”

Longevity
Whether the industry can build upon this success, or whether it fizzles out over time, remains to be seen. But at least for now, RISC-V’s momentum is growing. “We are at the beginning of a revolution in hardware design,” says OpenHW’s Wohlrab. “We saw the same thing for software when Linux came out 20 or so years ago. No one was really thinking about sharing software or collaboratively developing software. There were some small open-source ventures, but working together on a big project took a long time to develop. Now we are all sharing software, all co-working. But for hardware, we’re just at the beginning of this new concept, and a lot of people need to understand that we can do the same for hardware as we did for software.”

Underlying RISC-V’s success is widespread collaboration. “One of the pillars sustaining the success of RISC-V is customization that works with the ecosystem and leverages a well-defined process,” says Sergio Marchese, vice president of application engineering at SmartDV. “RISC-V vendors face the challenge of showing how their processor customization capabilities serve an application and demonstrating the complete process on real hardware. Without strategic partnerships, RISC-V vendors must walk a much more challenging, time-consuming, and resource-intensive road.”

That framework is what makes it unique. “RISC-V has formed this framework for collaboration, and it fixes everything,” says Siemens’ Hand. “Now, when a university has a really cool idea for memory tagging in a processor design, they don’t have to build the compilers, they don’t have to build the reference platform. They already exist. Maybe a compiler optimization startup has this great idea for handling code optimization. They don’t have to build the rest of the ecosystem. When a processor IP company has this great idea, they can become focused within this bigger picture. That’s the unique nature of it. It’s not just a processor specification.”

Historically, one of the problems associated with open-source hardware was quality, because finding bugs in silicon is expensive. OpenHW is an important piece of the puzzle. “Why should everyone reinvent the wheel by themselves?” asks Wohlrab. “Why can’t we get the basic building blocks, some basic chips, take some design from academia, which has reasonably good quality, and build on them, verify them together. We are verifying with different tools, making sure we get a high coverage, and then everyone can go off and use them in their own chips for mass production, for volume shipment.”

This benefits companies both large and small. “There are several processor vendors that have switched to RISC-V,” says Hand. “Synopsys has moved to RISC-V. Andes has moved to RISC-V. MIPS has moved to RISC-V. Why? Because they can leverage the whole ecosystem. The downside of it is commoditization, which as a customer is really beneficial because you can delay choosing a processor till later in the design flow. Your early decision is to use the Arm ecosystem or RISC-V, and then you can work through it. That creates an interesting set of dynamics. You can start to create new opportunities for companies that develop and deliver IP, because you can benchmark them, swap them in and out, and see which one works. On the flip side, it makes it awful from a lock-in perspective once you’re in that socket.”

Fragmentation
Of course, there will be some friction in the system. “In the early days of RISC-V there was nearly a 1:1 balance between contributors and consumers of the technology,” says Geir Eide, director, product management for Siemens EDA. “Today there are thousands of RISC-V consumers, but only a small percentage of those will be contributors. There is a risk that there will be a disconnect between them. If, for instance, a particular market or regional segment is growing at a higher pace than others, or other market segments and regions are more conservative, they tend to stick to established solutions longer. That increases the risk that it could lead to fragmentation.”

Is that likely to impact development long term? “We do not believe that RISC-V will become regionally concentrated, although though there may be regional concentrations of focus within the broad set of implementation choices provided by RISC-V,” says Bluespec’s Hobbs. “A prime example of this is the Barcelona Supercomputer Center, creating a regional focus area for high-performance computing using RISC-V. However, while there may be regional focus areas, this does not mean that the RISC-V standard is, or will become, fragmented. In fact, one of the key tenets of the creation and foundation of RISC-V was preventing fragmentation of the ISA, and it continues to be a key function of RISC-V international.”

China may be a different story. “A lot of companies in China are creating RISC-V cores for internal consumption — for political reasons mostly,” says John Min, vice president of customer service at Arteris. “I think China will go 100% RISC-V for embedded, but it’s a one-way street. They will keep leveraging what the Western companies do and enhance it. China will continue sucking all advancements, such as vectorization, or the special domain-specific acceleration enhancements. They will create their own and make it their own internally, but they will give nothing back.”

Such splits have occurred in the past. “Design languages are the most recent example of that,” says Hand. “There was a regional split, and you had Europe focus on VHDL while America went with Verilog. With RISC-V, there will be that regional split where people will go off and do their things regionally. Europe has focused projects, India has theirs, but they’re still doing it within this framework. It’s this realization that everyone benefits. They’re not doing it to benefit the other people. They’re doing it ultimately to save themselves effort, to save themselves cost, but they realize that by doing it in that framework it is a net benefit to everyone.”

Bi-directionality
An important element is that everyone benefits, and that has to stretch across the academic/commercial boundary. “RISC-V has propelled a new degree of collaboration between academia and commercial organizations,” says Dave Kelf, CEO at Breker. “It’s noticeable that institutions such as Harvey Mudd College in Claremont, California, and ETH in Zurich, Switzerland, to name two, have produced advanced processor designs as a teaching aid, and have collaborated with multiple companies on their verification and design. This has been further advanced by OpenHW Group, which has made these designs accessible to the industry. This bi-directional collaboration benefits the tool providers to further enhance their offerings working on advanced, open devices, while also enabling academia to improve their designs to a commercial quality level. The virtuous circle created is essential if we are to see RISC-V established as a mainstream, industry-wide capability.”

Academia has a lot to offer in hardware advancement. “Researchers in universities are developing innovative new software and hardware to push the limits of RISC-V innovation,” says Dave Miller, head of corporate communications at SiFive. “Many of the RISC-V projects in academia are focused on optimizing performance and energy efficiency for AI workloads, and are open source so the entire ecosystem can benefit. Researchers are also actively contributing to RISC-V working groups to share their knowledge and collaborate with industry participants. These working groups are split evenly between representatives from APAC, Europe, and North America, all working together towards common goals.”

In many cases, industry is willing to fund such projects. “It makes it easy to have research topics that don’t need to boil the ocean,” says Hand. “If you’re a PhD student and you have a great idea, you can go do it. It’s easy for an industry partner to say, ‘I’ll sponsor that. That’s an interesting thing, and I am not required to allocate ridiculous amounts of money into an open-ended project. It’s like I can see the connection of how that research will go into a commercial product later.'”

This feeds back into academia. “The academics have been jumping on board with OpenHW,” says Wohlrab. “By taking their cores and productizing them, they get a chip back that could be shipped in high volume. Then they can do their research on a real commercial product and can see if their idea would fly in real life. They get real numbers and can see real figures for the benefits of a new branch predictor.”

It can also have a long-term benefit for tools. “There are areas where they want to collaborate with us, especially around security,” says Kiran Vittal, executive director for alliances marketing management at Synopsys. “They are building RISC-V based sub-systems using open-source RISC-V processors, and then academia wants to look at not only the AI part, but the security part. There are post-doc students or PhD students looking into using our tools to verify or to implement whatever they’re doing on security.”

That provides an incentive for EDA to offer better tools for use in universities. “Although there has always been collaboration between universities and the industry, where industry provides the universities with access to EDA tools, IP cores, etc., there’s often a bit of a lag,” says Siemens’ Eide. “In many situations (especially outside of the core area of a particular project), universities have access to older versions of the commercial solutions. If you for instance look at a new grad’s resume, where you in the past would see references to old tech, now you see a lot of references to relatively sophisticated use of RISC-V.”

Moving Forward
This collaboration needs to keep pushing forward. “We had an initiative to create a standardized interface for accelerators,” says Wohlrab. “RISC-V International standardized how to add custom instructions in the ISA, but there was no standard for the hardware interface. So we built this. It was a cool discussion. There were people from Silicon Labs, people from NXP, people from Thales, plus several startups. They all came together and asked, ‘How can we make it future proof and put the accelerators inside?'”

The application space for RISC-V is changing. “The big inflection point is Linux and Android,” says Arteris’ Min. “Android already has some support, but when both Android and Linux are really supported, it will change the mobile apps processor game. The number of designs will proliferate. The number of high-end designs will explode. It will take the whole industry to enable that because RISC-V companies are not big enough to create this by themselves. All the RISC-V companies are partners, because we enable this high-end design at the processor levels.”

That would deepen the software community’s engagement. “An embedded software developer needs to understand the underlying hardware if they want to run Linux on a RISC-V processor that uses custom instructions/ accelerators,” says Bluespec’s Hobbs. “To develop complex embedded hardware/software systems, both embedded software developers and embedded hardware developments must possess contextual understanding of the interoperability of hardware and software. The developer must understand how the customized processor is leveraging the custom instructions in hardware for Linux to efficiently manage and execute the accelerated workloads.”

This collaboration could reinvigorate research into EDA, as well. “With AI you can build predictive models,” says Hand. “Could that be used to identify the change effects from making an extension? What does that mean? There’s a cloud of influence — not directly gate-wise, because that immediately explodes — but perhaps based on test suites. ‘I know that something that touches that logic touches this downstream, which touches the rest of the design.’ That’s where AI plays a big role, and it is one of the interesting areas because in verification there are so many unknowns. When AI comes along, any guidance or any visibility that you can give is incredibly powerful. Even if it is not right 100% of the time, that’s okay, as long as it generates false negatives and not false positives.”

There is a great opportunity for EDA companies. “We collaborate with many of the open-source providers, with OpenHW group, with ETH in Zurich,” says Synopsys’ Vittal. “We want to promote our solutions when it comes to any processor design and you need standard tools like synthesis, place and route, simulation. But then there are also other kinds of unique solutions because RISC-V is so customizable, you can build your own custom instructions. You need something specific to verify these custom instructions and that’s why the Imperas golden models are important. We also collaborated with Bluespec to develop a verification methodology to take you through functional verification and debug.”

There are still some wrinkles to be worked out for customizations. “RISC-V gives us predictability,” says Hand. “We can create a compliance test suite, give you a processor optimization package if you’re on the implementation side. We can create analytics and testing solutions because we know what it’s going to look like. But for non-standard processors, it is effectively as a service, because everyone’s processor is a little bit different. The reason you see a lot of focus on the verification, from platform architecture exploration all the way through, is because if you change one little thing, such as an addressing mode, it impacts pretty much 100% of your processor verification. You’ve got to retest the whole processor. Most people aren’t set up like an Arm or an Intel with huge processor verification teams and the infrastructure, and so they need automation to do it for them.”

Conclusion
RISC-V has enabled the industry to create a framework for collaboration, which enables everyone to work together for selfish reasons. It is a symbiotic relationship that continues to build, and it is creating a wider sphere of influence over time.

“It’s unique in the modern era of semiconductor,” says Hand. “You have such a wide degree of collaboration, where you have processor manufacturers, the software industry leaders, EDA companies, all working on a common infrastructure.”

Related Reading
RISC-V Micro-Architectural Verification
Verifying a processor is much more than making sure the instructions work, but the industry is building from a limited knowledge base and few dedicated tools.
RISC-V Wants All Your Cores
It is not enough to want to dominate the world of CPUs. RISC-V has every core in its sights, and it’s starting to take steps to get there.

The post RISC-V Heralds New Era Of Cooperation appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Turbocharging Cost-Conscious SoCs With CacheJohn Min
    Some design teams creating system-on-chip (SoC) devices are fortunate to work with the latest and greatest technology nodes coupled with a largely unconstrained budget for acquiring intellectual property (IP) blocks from trusted third-party vendors. However, many engineers are not so privileged. For every “spare no expense” project, there are a thousand “do the best you can with a limited budget” counterparts. One way to squeeze the most performance out of lower-cost, earlier generation, mid-ran
     

Turbocharging Cost-Conscious SoCs With Cache

Od: John Min
30. Květen 2024 v 09:04

Some design teams creating system-on-chip (SoC) devices are fortunate to work with the latest and greatest technology nodes coupled with a largely unconstrained budget for acquiring intellectual property (IP) blocks from trusted third-party vendors. However, many engineers are not so privileged. For every “spare no expense” project, there are a thousand “do the best you can with a limited budget” counterparts.

One way to squeeze the most performance out of lower-cost, earlier generation, mid-range processor and accelerator cores is to employ the judicious application of caches.

Cutting costs

A simplified example of a typical cost-conscious SoC scenario is illustrated in figure 1. Although the SoC may be composed of many IPs, only three are shown here for clarity.

Fig. 1: Portion of a cost-conscious, non-cache-coherent SoC. (Source: Arteris)

The predominant technology for connecting the IPs inside an SoC is network-on-chip (NoC) interconnect IP. This may be thought of as an IP that spans the entire device. The example shown in figure 1 may be assumed to reflect a non-cache-coherent scenario. In this case, any coherency requirements will be handled in software.

Let’s assume the SoC’s clock is running at 1GHz. Suppose a central processing unit (CPU) based on a reduced instruction set computer (RISC) architecture running a typical instruction will consume a single clock cycle. However, access to external DRAM memory can take anywhere between 100 and 200 processor clock cycles (we’ll average this out to be 150 cycles for the purposes of this article). This means that if the CPU lacked a Level 1 (L1) cache and was connected directly to the DRAM via the NoC and DDR memory controller, each instruction would consume 150 processor clock cycles, resulting in a CPU utilization of only 1/150 = 0.67%.

This is why CPUs, along with some accelerators and other IPs, employ cache memories to increase processor utilization and application performance. The underlying premise upon which the cache concept is based is the principle of locality. The idea is that only a small amount of the main memory is being employed at any given time and that locations in that space are being accessed multiple times. Mainly due to loops, nested loops and subroutines, instructions and their associated data experience temporal, spatial and sequential locality. This means that once a block of instructions and data have been copied from the main memory into an IP’s cache, the IP will typically access them repeatedly.

Today’s high-end CPU IPs usually have a minimum of a Level 1 (L1) and Level 2 (L2) cache, and they often have a Level 3 (L3) cache. Also, some accelerator IPs, like graphics processing units (GPUs) often have their own internal caches. However, these latest-generation high-end IPs often have a 5X to 10X price compared to their previous-generation mid-range counterparts. As a result, as illustrated in figure 1, the CPU in a cost-conscious SoC may come equipped with only an L1 cache.

Let’s consider the CPU and its L1 cache in a little more depth. When the CPU requests something in its cache, the result is called a cache hit. Since the L1 cache typically runs at the same speed as the processor core, a cache hit will be processed in a single processor clock cycle. By comparison, if the requested data is not in the cache, the result, called a cache miss, will require access to the main memory, which will consume 150 processor clock cycles.

Now consider running 1,000,000 instructions. If the cache were large enough to contain the whole program, then this would consume only 1,000,000 clock cycles, resulting in a CPU efficiency of 1,000,000 instructions/1,000,000 clock cycles = 100%.

Unfortunately, the L1 cache in a mid-range CPU will typically be only 16KB to 64KB in size. If we assume a 95% cache hit rate, then 950,000 of our 1,000,000 instructions will take one processor clock cycle. The remaining 50,000 instructions will each consume 150 clock cycles. Thus, the CPU efficiency in this case can be calculated as 1,000,000/((950,000 * 1) + (50,000 * 150)) = ~12%.

Turbocharging performance

A cost-effective way of turbocharging the performance of a cost-conscious SoC is to add cache IPs. For example, CodaCache from Arteris is a configurable, standalone non-coherent cache IP. Each CodaCache instance can be up to 8MB in size, and multiple copies can be instantiated in the same SoC, as demonstrated in figure 2.

Fig. 2: Portion of a turbocharged, non-cache-coherent SoC. (Source: Arteris)

It is not the intention of this article to suggest that every IP should be equipped with a CodaCache. Figure 2 is intended only to provide examples of potential CodaCache deployments.

If a CodaCache instance is associated with an IP, it’s known as a dedicated cache (DC). Alternatively, if a CodaCache instance is associated with a DDR memory controller, it’s referred to as a last-level cache (LLC). A DC will accelerate the performance of the IP with which it is associated, while an LLC will enhance the performance of the entire SoC.

As an example of the type of performance boost we might expect, consider the CPU shown in figure 2. Let’s assume the CodaCache DC instance associated with this IP is running at half the processor speed and that any accesses to this cache consume 20 processor clock cycles. If we also assume a 95% cache hit rate for this DC, then—for 1,000,000 instructions—our overall CPU+L1+DC efficiency can be calculated as 1,000,000/((950,000 * 1) + (47,500 * 20) + (2,500 * 150)) = ~44%. That’s a performance boost of ~273%!

Conclusion

In the past, embedded programmers relished the challenge of squeezing the highest performance possible out of small processors with low clock speeds and limited memory resources. In fact, it was common for computer magazines to issue challenges to their readers along the lines of, “Who can perform task X on processor Y in the minimum number of clock cycles using the smallest amount of memory?”

Today, many SoC developers enjoy the challenge of squeezing the highest performance possible out of their designs, especially if they are constrained to use lower-performing mid-range IPs. Deploying CodaCache IPs as dedicated and last-level caches provides an affordable way for engineers to turbocharge their cost-conscious SoCs. To learn more about CodaCache from Arteris, visit arteris.com.

The post Turbocharging Cost-Conscious SoCs With Cache appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • AI For Data ManagementAdam Kovac
    Data management is becoming a significant new challenge for the chip industry, as well as a brand new opportunity, as the amount of data collected at every step of design through manufacturing continues to grow. Exacerbating the problem is the rising complexity of designs, many of which are highly customized and domain-specific at the leading edge, as well as increasing demands for reliability and traceability. There also is a growing focus on chiplets developed using different processes, includ
     

AI For Data Management

30. Květen 2024 v 09:03

Data management is becoming a significant new challenge for the chip industry, as well as a brand new opportunity, as the amount of data collected at every step of design through manufacturing continues to grow.

Exacerbating the problem is the rising complexity of designs, many of which are highly customized and domain-specific at the leading edge, as well as increasing demands for reliability and traceability. There also is a growing focus on chiplets developed using different processes, including some from different foundries, and new materials such as glass substrates and ruthenium interconnects. On the design side, EDA and verification tools can generate terabytes of data on a weekly or even a daily basis, unlike in the past when this was largely done on a per-project basis.

While more data can be used to provide insights into processes and enable better designs, it’s an ongoing challenge to manage the current volumes being generated. The entire industry must rethink some well-proven methodologies and processes, as well as invest in a variety of new tools and approaches. At the same time, these changes are generating concern in an industry used to proceeding cautiously, one step at a time, based on silicon- and field-proven strategies. Increasingly, AI/ML is being added into design tools to identify anomalies and patterns in large data sets, and many of those tools are being regularly updated as algorithms are updated and new features are added, making it difficult to know exactly when and where to invest, which data to focus on, and with whom to share it.

“Every company has its own design flow, and almost every company has its own methodology around harvesting that data, or best practices about what reports should or should not be written out at what point,” said Rob Knoth, product management director in Cadence’s Digital & Signoff group. “There’s a death by 1,000 cuts that can happen in terms of just generating titanic volumes of data because, in general, disk space is cheap. People don’t think about it a lot, and they’ll just keep generating reports. The problem is that just because you’re generating reports doesn’t mean you’re using them.”

Fig. 1: Rising design complexity is driving increased need for data management. Source: IEEE Rising Stars 2022/Cadence

As with any problem in chip design, there is opportunity in figuring out a path forward. “You can always just not use the data, and then you’re back where you started,” said Tony Chan Carusone, CTO at Alphawave Semi. “The reason it becomes a problem for organizations is because they haven’t architected things from the beginning to be scalable, and therefore, to be able to handle all this data. Now, there’s an opportunity to leverage data, and it’s a different way. So it’s disruptive because you have to tear things apart, from re-architecting systems and processes to how you collect and store data, and organize it in order to take advantage of the opportunity.”

Buckets of data, buckets of problems
The challenges that come with this influx of data can be divided into three buckets, said Jim Schultz, senior staff product manager at Synopsys. The first is figuring out what information is actually critical to keep. “If you make a run, designers tend to save that run because if they need to do a follow up run, they have some data there and they may go, ‘Okay, well, what’s the runtime? How long did that run take, because my manager is going to ask me what I think the runtime is going to be on the next project or the next iteration of the block. While that data may not be necessary, designers and engineers have a tendency to hang onto it anyway, just in case.”

The second challenge is that once the data starts to pour in, it doesn’t stop, raising questions about how to manage collection. And third, once the data is collected, how can it be put to best use?

“Data analytics have been around with other types of companies exploring different types of data analytics, but the differences are those are can be very generic solutions,” said Schultz. “What we need for our industry is going to be very specific data analytics. If I have a timing issue, I want you to help me pinpoint what the cause of that timing violation is. That’s very specific to what we do in EDA. When we talk about who is cutting through the noise, we don’t want data that’s just presented. We want the data that is what the designer most cares about.”

Data security
The sheer number of tools being used and companies and people involved along the design pathway raises another challenge — security.

“There’s a lot of thought and investment going into the security aspect of data, and just as much as the problem of what data to save and store is the type of security we have to have without hindering the user day-to-day,” said Simon Rance, director of product management at Keysight. “That’s becoming a bigger challenge. Things like the CHIPS Act and the geopolitical scenarios we have at the moment are compounding that problem because a lot of the companies that used to create all these devices by themselves are having to collaborate, even with companies in different regions of the globe.”

This requires a balancing act. “It’s almost like a recording studio where you have all these knobs and dials to fine tune it, to make sure we have security of the data,” said Rance. “But we’re also able to get the job done as smoothly and as easily as we can.”

Further complicating the security aspect is that designing chips is not a one-man job. As leading-edge chips become increasingly complex and heterogeneous, they can involve hundreds of people in multiple companies.

“An important thing to consider when you’re talking about big data and analytics is what you’re going to share and with whom you’re going to share it,” said Synopsys’ Schultz. “In particular, when you start bringing in and linking data from different sources, if you start bringing in data related to silicon performance, you don’t want everybody to have access to that data. So the whole security protocol is important.”

Even the mundane matters — having a ton of data makes it likely, at some point, that data will be moved.

“The more places the data has to be transferred to, the more delays,” said Rance. “The bigger the data set, the longer it takes to go from A to B. For example, a design team in the U.S. may be designing during the day. Then, another team in Singapore or Japan will pick up on that design in their time zone, but they’re across the world. So you’re going to have to sync the data back and forth between these kinds of design sites. The bigger the data, the harder to sync.”

Solutions
The first step toward solving the issue of too much data is figuring out what data is actually needed. Rance said his team has found success using smart algorithms that help figure out which data is essential, which in turn can help optimize storage and transfer times.

There are less technical problems that can rear their heads, as well. Gina Jacobs, head of global communications and brand marketing at Arteris, said that engineers who use a set methodology — particularly those who are used to working on a problem by themselves and “brute forcing” a solution – also can find themselves overwhelmed by data.

“Engineers and designers can also switch jobs, taking with them institutional knowledge,” Jacobs said. “But all three problems can be solved with a single solution — having data stored in a standardized way that is easily accessible and sortable. It’s about taking data and requirements and specifications in different forms and then having it in the one place so that the different teams have access to it, and then being able to make changes so there is a single source of truth.”

Here, EDA design and data management tools are increasingly relying on artificial intelligence to help. Schultz forecasted a future where generative AI will touch every facet of chip development. “Along with that is the advanced data analytics that is able to mine all of that data you’ve been collecting, instead of going beyond the simple things that people have been doing, like predicting how long runtime is going to be or getting an idea what the performance is going to be,” he said. “Tools are going to be able to deal with all of that data and recognize trends much faster.”

Still, those all-encompassing AI tools, capable of complex analysis, are still years away. Cadence’s Knoth said he’s already encountered clients that are reluctant to bring it into the mix due to fears over the costs involved in disk space, compute resources, and licenses. Others, however, have been a bit more open-minded.

“Initially, AI can use a lot of processors to generate a lot of data because it’s doing a lot of things in parallel when it’s doing the inferencing, but it usually gets to the result faster and more predictably,” he said. So while a machine learning algorithm may generate even more vast amounts of data, on top of the piles currently available, “a good machine learning algorithm could be watching and smartly killing or restarting jobs where needed.”

As for the humans who are still an essential component to chip design, Alphawave’s Carusone said hardware engineers should take a page from lessons learned years ago from their counterparts in the software development world.

These include:

  • Having an organized and automated way to collect data, file it in a repository, and not do anything manually;
  • Developing ways to run verification and lab testing and everything in between in parallel, but with the data organized in a way that can be mined; and
  • Creating methods for rigorously checking in and out of different test cases that you want to consider.

“The big thing is you’ve got all this data collected, but then what is each of each of those files, each of those collections of data?” said Carusone. “What does that correspond to? What test conditions was that collected in? The software community dealt with that a while ago, and the hardware community also needs to have this under its belt, taking it to the next level and recognizing we really need to be able to do this en masse. We need to be able to have dozens of people work in parallel, collecting data and have it all on there. We can test a big collection of our designs in the lab without anyone having to touch a thing, and then also try refinements of the firmware, scale them out, then have all the data come in and be analyzed. Being able to have all that done in an automated way lets you track down and fix problems a lot more quickly.”

Conclusion
The influx of new tools used to analyze and test chip designs has increased productivity, but those designs come with additional considerations. Institutions and individual engineers and designers have never had access to so much data, but that data is of limited value if it’s not used effectively.

Strategies to properly store and order that data are essential. Some powerful tools are already in place to help do that, and the AI revolution promises to make even more powerful resources available to quickly cut down on the time needed to run tests and analyze the results.

For now, handling all that data remains a tricky balance, according to Cadence’s Knoth. “If this was an easy problem, it wouldn’t be a problem. Being able to communicate effectively, hierarchically — not just from a people management perspective, but also hierarchically from a chip and project management perspective — is difficult. The teams that do this well invest resources into that process, specifically the communication of top-down tightening of budgets or top-down floorplan constraints. These are important to think about because every engineer is looking at chip-level timing reports, but the problem that they’re trying to solve might not ever be visible. But if they have a report that says, ‘Here is your view of what your problems are to solve,’ you can make some very effective work.”

Further Reading
EDA Pushes Deeper Into AI
AI is both evolutionary and revolutionary, making it difficult to assess where and how it will be used, and what problems may crop up.
Optimizing EDA Cloud Hardware And Workloads
Algorithms written for GPUs can slice simulation time from weeks to hours, but not everything is optimized or benefits equally.

The post AI For Data Management appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Will Domain-Specific ICs Become Ubiquitous?Brian Bailey
    Questions are surfacing for all types of design, ranging from small microcontrollers to leading-edge chips, over whether domain-specific design will become ubiquitous, or whether it will fall into the historic pattern of customization first, followed by lower-cost, general-purpose components. Custom hardware always has been a double-edged sword. It can provide a competitive edge for chipmakers, but often requires more time to design, verify, and manufacture a chip, which can sometimes cost a mar
     

Will Domain-Specific ICs Become Ubiquitous?

16. Květen 2024 v 09:05

Questions are surfacing for all types of design, ranging from small microcontrollers to leading-edge chips, over whether domain-specific design will become ubiquitous, or whether it will fall into the historic pattern of customization first, followed by lower-cost, general-purpose components.

Custom hardware always has been a double-edged sword. It can provide a competitive edge for chipmakers, but often requires more time to design, verify, and manufacture a chip, which can sometimes cost a market window. In addition, it’s often too expensive for all but the most price-resilient applications. This is a well-understood equation at the leading edge of design, particularly where new technologies such as generative AI are involved.

But with planar scaling coming to an end, and with more features tailored to specific domains, the chip industry is struggling to figure out whether the business/technical equation is undergoing a fundamental and more permanent change. This is muddied further by the fact that some 30% to 35% of all design tools today are being sold to large systems companies for chips that will never be sold commercially. In those applications, the collective savings from improved performance per watt may dwarf the cost of designing, verifying, and manufacturing a highly optimized multi-chip/multi-chiplet package across a large data center, leaving the debate about custom vs. general-purpose more uncertain than ever.

“If you go high enough in the engineering organization, you’re going to find that what people really want to do is a software-defined whatever it is,” says Russell Klein, program director for high-level synthesis at Siemens EDA. “What they really want to do is buy off-the-shelf hardware, put some software on it, make that their value-add, and ship that. That paradigm is breaking down in a number of domains. It is breaking down where we need either extremely high performance, or we need extreme efficiency. If we need higher performance than we can get from that off-the-shelf system, or we need greater efficiency, we need the battery to last longer, or we just can’t burn as much power, then we’ve got to start customizing the hardware.”

Even the selection of processing units can make a solution custom. “Domain-specific computing is already ubiquitous,” says Dave Fick, CEO and cofounder of Mythic. “Modern computers, whether in a laptop, phone, security camera, or in farm equipment, consist of a mix of hardware blocks co-optimized with software. For instance, it is common for a computer to have video encode or decode hardware units to allow a system to connect to a camera efficiently. It is common to have accelerators for encryption so that we can safely communicate. Each of these is co-optimized with software algorithms to make commonly used functions highly efficient and flexible.”

Steve Roddy, chief marketing officer at Quadric, agrees. “Heterogeneous processing in SoCs has been de rigueur in the vast majority of consumer applications for the past two decades or more.  SoCs for mobile phones, tablets, televisions, and automotive applications have long been required to meet a grueling combination of high-performance plus low-cost requirements, which has led to the proliferation of function-specific processors found in those systems today.  Even low-cost SoCs for mobile phones today have CPUs for running Android, complex GPUs to paint the display screen, audio DSPs for offloading audio playback in a low-power mode, video DSPs paired with NPUs in the camera subsystem to improve image capture (stabilization, filters, enhancement), baseband DSPs — often with attached NPUs — for high speed communications channel processing in the Wi-Fi and 5G subsystems, sensor hub fusion DSPs, and even power-management processors that maximize battery life.”

It helps to separate what you call general-purpose and what is application-specific. “There is so much benefit to be had from running your software on dedicated hardware, what we call bespoke silicon, because it gives you an advantage over your competitors,” says Marc Swinnen, director of product marketing in Ansys’ Semiconductor Division. “Your software runs faster, lower power, and is designed to run specifically what you want to run. It’s hard for a competitor with off-the-shelf hardware to compete with you. Silicon has become so central to the business value, the business model, of many companies that it has become important to have that optimized.”

There is a balance, however. “If there is any cost justification in terms of return on investment and deployment costs, power costs, thermal costs, cooling costs, then it always makes sense to build a custom ASIC,” says Sharad Chole, chief scientist and co-founder of Expedera. “We saw that for cryptocurrency, we see that right now for AI. We saw that for edge computing, which requires extremely ultra-low power sensors and ultra-low power processes. But there also has been a push for general-purpose computing hardware, because then you can easily make the applications more abstract and scalable.”

Part of the seeming conflict is due to the scope of specificity. “When you look at the architecture, it’s really the scope that determines the application specificity,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “Domain-specific computing is ubiquitous now. The important part is the constant moving up of the domain specificity to something more complex — from the original IP, to configurable IP, to subsystems that are configurable.”

In the past, it has been driven more by economics. “There’s an ebb and a flow to it,” says Paul Karazuba, vice president of marketing at Expedera. “There’s an ebb and a flow to putting everything into a processor. There’s an ebb and a flow to having co-processors, augmenting functions that are inside of that main processor. It’s a natural evolution of pretty much everything. It may not necessarily be cheaper to design your own silicon, but it may be more expensive in the long run to not design your own silicon.”

An attempt to formalize that ebb and flow was made by Tsugio Makimoto in the 1990s, when he was Sony’s CTO. He observed that electronics cycled between custom solutions and programmable ones approximately every 10 years. What’s changed is that most custom chips from the time of his observation contained highly programmable standard components.

Technology drivers
Today, it would appear that technical issues will decide this. “The industry has managed to work around power issues and push up the thermal envelope beyond points I personally thought were going to be reasonable, or feasible,” says Elad Alon, co-founder and CEO of Blue Cheetah. “We’re hitting that power limit, and when you hit the power limit it drives you toward customization wherever you can do it. But obviously, there is tension between flexibility, scalability, and applicability to the broadest market possible. This is seen in the fast pace of innovation in the AI software world, where tomorrow there could be an entirely different algorithm, and that throws out almost all the customizations one may have done.”

The slowing of Moore’s Law will have a fundamental influence on the balance point. “There have been a number of bespoke silicon companies in the past that were successful for a short period of time, but then failed,” says Ansys’ Swinnen. “They had made some kind of advance, be it architectural or addressing a new market need, but then the general-purpose chips caught up. That is because there’s so much investment in them, and there’s so many people using them, there’s an entire army of people advancing, versus your company, just your team, that’s advancing your bespoke solution. Inevitably, sooner or later, they bypass you and the general-purpose hardware just gets better than the specific one. Right now, the pendulum has swung toward custom solutions being the winner.”

However, general-purpose processors do not automatically advance if companies don’t keep up with adoption of the latest nodes, and that leads to even more opportunities. “When adding accelerators to a general-purpose processor starts to break down, because you want to go faster or become more efficient, you start to create truly customized implementations,” says Siemens’ Klein. “That’s where high-level synthesis starts to become really interesting, because you’ve got that software-defined implementation as your starting point. We can take it through high-level synthesis (HLS) and build an accelerator that’s going to do that one specific thing. We could leave a bunch of registers to define its behavior, or we can just hard code everything. The less general that system is, the more specific it is, usually the higher performance and the greater efficiency that we’re going to take away from it. And it almost always is going to be able to beat a general-purpose accelerator or certainly a general-purpose processor in terms of both performance and efficiency.”

At the same time, IP has become massively configurable. “There used to be IP as the building blocks,” says Arteris’ Schirrmeister. “Since then, the industry has produced much larger and more complex IP that takes on the role of sub-systems, and that’s where scope comes in. We have seen Arm with what they call the compute sub-systems (CSS), which are an integration and then hardened. People care about the chip as a whole, and then the chip and the system context with all that software. Application specificity has become ubiquitous in the IP space. You either build hard cores, you use a configurable core, or you use high-level synthesis. All of them are, by definition, application-specific, and the configurability plays in there.”

Put in perspective, there is more than one way to build a device, and an increasing number of options for getting it done. “There’s a really large market for specialized computing around some algorithm,” says Klein. “IP for that is going to be both in the form of discrete chips, as well as IP that could be built into something. Ultimately, that has to become silicon. It’s got to be hardened to some degree. They can set some parameters and bake it into somebody’s design. Consider an Arm processor. I can configure how many CPUs I want, I can configure how big I want the caches, and then I can go bake that into a specific implementation. That’s going to be the thing that I build, and it’s going to be more targeted. It will have better efficiency and a better cost profile and a better power profile for the thing that I’m doing. Somebody else can take it and configure it a little bit differently. And to the degree that the IP works, that’s a great solution. But there will always be algorithms that don’t have a big enough market for IP to address. And that’s where you go in and do the extreme customization.”

Chiplets
Some have questioned if the emerging chiplet industry will reverse this trend. “We will continue to see systems composed of many hardware accelerator blocks, and advanced silicon integration technologies (i.e., 3D stacking and chiplets) will make that even easier,” says Mythic’s Fick. “There are many companies working on open standards for chiplets, enabling communication bandwidth and energy efficiency that is an order of magnitude greater than what can be built on a PCB. Perhaps soon, the advanced system-in-package will overtake the PCB as the way systems are designed.”

Chiplets are not likely to be highly configurable. “Configuration in the chiplet world might become just a function of switching off things you don’t need,” says Schirrmeister. “Configuration really means that you do not use certain things. You don’t get your money back for those items. It’s all basically applying math and predicting what your volumes are going to be. If it’s an incremental cost that has one more block on it to support another interface, or making the block the Ethernet block with time triggered stuff in it for automotive, that gives you an incremental effort of X. Now, you have to basically estimate whether it also gives you a multiple of that incremental effort as incremental profit. It works out this way because chips just become very configurable. Chiplets are just going in the direction or finding the balance of more generic usage so that you can apply them in more chiplet designs.”

The chiplet market is far from certain today. “The promise of chiplets is that you use only the function that you want from the supplier that you want, in the right node, at the right location,” says Expedera’s Karazuba. “The idea of specialization and chiplets are at arm’s length. They’re actually together, but chiplets have a long way to go. There’s still not that universal agreement of the different things around a chiplet that have to be in order to make the product truly mass market.”

While chiplets have been proven to work, nearly all of the chiplets in use today are proprietary. “To build a viable [commercial] chiplet company, you have to be going after a broad enough market, large enough from a dollar perspective, then you can make all the investment, have success and get everything back accordingly,” says Blue Cheetah’s Alon. “There’s a similar tension where people would like to build a general-purpose chiplet that can be used anywhere, by anyone. That is the plug-and-play discussion, but you could finish up with something that becomes so general-purpose, with so much overhead, that it’s just not attractive in any particular market. In the chiplet case, for technical reasons, it might not actually really work that way at all. You might try to build it for general purpose, and it turns out later that it doesn’t plug into particular sockets that are of interest.”

The economics of chiplet viability have not yet been defined. “The thing about chiplets is they can be small,” says Klein. “Being small means that we don’t need as big a market for them as we would for a very large chip. We can also build them on different technologies. We can have some that are on older technologies, where transistors are cheaper, and we can combine those with other chiplets that might be leading-edge nodes where we could have general-purpose CPUs or NPU accelerators. There’s a mix-and-match, and we can do chiplets smaller than we can general-purpose chips. We can do smaller runs of them. We can take that IP and customize it for a particular market vertical and create some chiplets for that, change the configuration a bit, and do another run for something else. There’s a level of customization that can be deployed and supported by the market that’s a little bit more than we’ve seen in full-size chips, where the entire thing has to be built into one package.

Conclusion
What it means for a design to be general-purpose or custom is changing. All designs will contain some of each. Some companies will develop novel architectures using general-purpose processors, and these will be better than a fully general-purpose solution. Others will create highly customized hardware for some functions that are known to be stable, and general purpose for things that are likely to change. One thing has never changed, however. A company is not likely to add more customization than necessary to satisfy the needs of the market they are targeting.

Further Reading
Challenges With Chiplets And Power Delivery
Benefits and challenges in heterogeneous integration.
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.

The post Will Domain-Specific ICs Become Ubiquitous? appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Commercial Chiplet Ecosystem May Be A Decade AwayAnn Mutschler
    Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packagi
     

Commercial Chiplet Ecosystem May Be A Decade Away

29. Únor 2024 v 09:08

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make it reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplet should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to be able to pick and choose which IP, see what’s been taped out before, how successful it’s been so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to just become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood because that’s the way that this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs — and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust and traceability and the province tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, how to you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller cheaper ones. So, there are now a lot of thoughts about how to go about doing almost like unit tests on individual chiplets first, but then you want to do some form of system test as you go along. That’s definitely something we need to think about. On the business front,  who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then definitely it makes sense to think about using chiplets and breaking it up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together. The ones that have been successful so far — the big companies like Intel, AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.

Bhatnagar: From a business perspective, what is really important is the standardization. Inside of the chiplets is fine, but how it impacts other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing and that adds extra costs. To really justify a business case for any chiplet or, or any sort of IP with the chiplet, the standardization is key for the electrical interconnect, packaging, and all other aspects of a system.

Fig. 1:  A chiplet design. Source: Cadence. 

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

The post Commercial Chiplet Ecosystem May Be A Decade Away appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Accellera Preps New Standard For Clock-Domain CrossingBrian Bailey
    Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long will it take before users see any benefit. At the register transfer level (RTL), when a data signal passes between two flip flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving those adjacent flops. That mak
     

Accellera Preps New Standard For Clock-Domain Crossing

29. Únor 2024 v 09:06

Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long will it take before users see any benefit.

At the register transfer level (RTL), when a data signal passes between two flip flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving those adjacent flops. That makes timing sign-off difficult, but at least the clocks are still synchronous.

But if the clocks come from different sources, are at different frequencies, or a design boundary exists between the flip flops — which would happen with the integration of IP blocks — it’s impossible to guarantee that no clock edges will arrive when the data is unstable. That can cause the output to become unknown for a period of time. This phenomenon, known as metastability, cannot be eliminated, and the verification of those boundaries is known as clock-domain crossing (CDC) analysis.

Special care is required on those boundaries. “You have to compensate for metastability by ensuring that the CDC crossings follow a specific set of logic design principles,” says Prakash Narain, president and CEO of Real Intent. “The general process in use today follows a hierarchical approach and requires that the clock-domain crossing internal to an IP is protected and safe. At the interface of the IP, where the system connects with the IP, two different teams share the problem. An IP provider may recommend an integration methodology, which often is captured in an abstraction model. That abstraction model enables the integration boundary to be verified while the internals of it will not be checked for CDC. That has already been verified.”

In the past, those abstract models differentiated the CDC solutions from veracious vendors. That’s no longer the case. Every IP and tool vendor has different formats, making it costly for everyone. “I don’t know that there’s really anything new or differentiating coming down the pipe for hierarchical modeling,” says Kevin Campbell, technical product manager at Siemens Digital Industries Software. “The creation of the standard will basically deliver much faster results with no loss of quality. I don’t know how much more you can differentiate in that space other than just with performance increases.”

While this has been a problem for the whole industry for quite some time, Intel decided it was time for a solution. The company pushed Accellera to take up the issue, and helped facilitate the creation of the standard by chairing the committee. “I’m going to describe three methods of building a product,” says Iredamola “Dammy” Olopade, chair of the Accellera working group, and a principal engineer at Intel. “Method number one is where you build everything in a monolithic fashion. You own every line of code, you know the architecture, you use the tool of your choice. That is a thing of the past. The second method uses some IP. It leverages reuse and enables the quick turnaround of new SoCs. There used to be a time when all IPs came from the same source, and those were integrating into a product. You could agree upon the tools. We are quickly moving to a world where I need to source IPs wherever I can get them. They don’t use the same tools as I do. In that world, common standards are critical to integrating quickly.”

In some cases, there is a hierarchy of IP. “Clock-domain crossings are a central part of our business,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “A network-on-chip (NoC) can be considered as ‘CDC central’ because most blocks connected to the NoC have different clocks. Also, our SoC integration tools see all of the blocks to be integrated, and those touch various clock domains and therefore need to deal with the CDC code that is inserted.”

This whole thing can become very messy. “While every solution supports hierarchical modeling, every tool has its own model solution and its own model representation,” says Siemens’ Campbell. “Vendors, or users, are stuck with a CDC solution, because the models were created within a certain solution. There’s no real transportability between any of the hierarchical modeling solutions unless they want to go regenerate models for another solution.”

That creates a lot of extra work. “Today, when dealing with customer CDC issues, we have to consider the customer’s specific environment, and for CDC, a potential mix of in-house flows and commercial tools from various vendors,” says Arteris’ Schirrmeister. “The compatibility matrix becomes very complex, very fast. If adopted, the new Accellera CDC standard bears the potential to make it easier for IP vendors, like us, to ensure compatibility and reduce the effort required to validate IP across multiple customer toolsets. The intent, as specified in the requirements is that ‘every IP provider can run its tool of choice to verify and produce collateral and generate the standard format for SoCs that use a different tool.'”

Everyone benefits. “IP providers will not need to provide extra documentation of clock domains for the SoC integrator to use in their CDC analysis,” says Ahmed Nasr, digital design manager at Mixel. “The standard CDC attributes generated by the EDA tool will be self-contained.”

The use model is relatively simple. “An IP developer signs off on CDC and then exports the abstract model,” says Real Intent’s Narain. “It is likely they will write this out in both the Accellera format and the native format to provide backward compatibility. At the next level of hierarchy, you read in the abstract model instead of reading in the full view of the design. They have various views of the IP, including the CDC view of the IP, which today is on the basis of whatever tool they use for CDC sign-off.”

The potential is significant. “If done right and adopted, the industry may arrive at a common language to describe CDC aspects that can streamline the validation process across various tools and environments used by different users,” says Schirrmeister. “As a result, companies will be able to integrate and validate IP more efficiently than before, accelerating development cycles and reducing the complexity associated with SoC integration.”

The standard
Intel’s Olopade describes the approach that was taken during the creation of the standard. “You take the most complex situations you are likely to find, you box them, and you co-design them in order to reduce the risk of bugs,” he said. “The boundaries you create are supposed to be simple boundaries. We took that concept, and we brought it into our definition to say the following: ‘We will look at all kinds of crossings, we will figure out the simple common uses, and we will cover that first.’ That is expected to cover 95% to 98% of the community. We are not trying to handle 700 different exceptions. It is common. It is simple. It is what guarantees production quality, not just from a CDC standpoint, but just from a divide-and-conquer standpoint.”

That was the starting point. “Then we added elements to our design document that says, ‘This is how we will evaluate complexity, and this is how we’ll determine what we cover first,'” he says. “We broke things down into three steps. Step one is clock-domain crossing. Everyone suffers from this problem. Step two is reset-domain crossing (RDC). As low power is getting into more designs, there are a lot more reset domains, and there is risk between these reset domains. Some companies care, but many companies don’t because they are not in a power-aware environment. It became a secondary consideration. Beyond the basic CDC in phase one, and RDC in phase two, all other interesting, small usage complexities will be handled in phase three as extensions to the standard. We are not going to get bogged down supporting everything under the sun.”

Within the standards group there are two sub-groups — a mapping team and a format team. Common standards, such as AMBA, UCIe, and PCIe have been looked at to make sure that these are fully covered by the standard. That means that the concepts should be useful for future markets.

“The concepts contained in the standard are extensible to hardened chiplets,” says Mixel’s Nasr. “By providing an accurate standard CDC view for the chiplet, it will enable integration with other chiplets.”

Some of those issues have yet to be fully explored. “The standard’s current documentation primarily focuses on clock-domain crossing within an SoC itself,” says Schirrmeister. “Its direct applicability to the area of chiplets would depend on further developments. The interfaces between fully hardened IP blocks on chiplets would communicate through standard interfaces like UCIe, BoW, or XSR, so the synchronization issues between chiplets on substrates would appear to be elevated to the protocol levels.”

Reset-domain crossings have yet to appear in the standard. “The genesis of CDC is asynchronous clocks,” says Narain. “But the genesis for reset-domain crossing is asynchronous resets. While the destination is due to the clock, the source of the problem is somewhere else. And as a result, the nature of the problem, the methodology that people use to manage that problem, are very different. The kind of information that you need to retain, and the kind of information that you can throw away, is different for every problem. Hence, abstractions are actually very customized for the application.”

Does the standard cover enough ground? That is part of the purpose of the review period that was used to collect information. “I can see some room for future improvement — for example, making some attributes mandatory like logic, associated_clocks, clock_period for clock ports,” says Nasr. “Another proposed improvement is adding reconvergence information, to be able to detect reconverging outputs of parallel synchronizers.”

The impact of all of this, if realized, is enormous. “If you truly run a collaborative, inclusive, development cycle, two things will happen,” says Olopade. “One, you are going to be able to find multiple ways to solve each problem. You need to understand the pros and cons against the real problems you are trying to solve and agree on the best way we should do it together. For each of those, we record the options, the pros and cons, and the reason one was selected. In a public review, those that couldn’t be part of that discussion get to weigh in. We weigh it against what they are suggesting versus why did we choose it. In the cases where it is part of what we addressed, and we justified it, we just respond, and we do not make a change. If you’re truly inclusive, you do allow that feedback to cause you to change your mind. We received feedback on about three items that we had debated, where the feedback challenged the decisions and got us to rehash things.”

The big challenge
Still, the creation of a standard is just the first step. Unless a standard is fully adopted, its value becomes diminished. “It’s a commendable objective and a worthy endeavor,” says Schirrmeister. “It will make interoperability easier and eventually allow us, and the whole industry, to reduce the compatibility matrix we maintain to deal with vendor tools individually. It all will depend on adoption by the vendors, though.”

It is off to a good start. “As with any standard, good intentions sometimes get severed by reality,” says Campbell. “There has been significant collaboration and agreements on how the standard is being pushed forward. We did not see self-submarining, or some parties playing nice just to see what’s going on but not really supporting it. This does seem like good collaboration and good decision making across the board.”

Implementation is another hurdle. “Will it actually provide the benefit that it is supposed to provide?” asks Narain. “That will depend upon how completely and how quickly EDA tool vendors provide support for the standard. From our perception, the engineering challenge for implementing this is not that large. When this is standardized, we will provide support for it as soon as we can.”

Even then, adoption isn’t a slam dunk. “There are short- and long-term problems,” warns Campbell. “IP vendors already have to support multiple formats, but now you have to add Accellera on top of that. There’s going to be some pain both for the IP vendors and for EDA vendors. We are going to have to be backward-compatible and some programs go on for decades. There’s a chance that some of these models will be around for a very long time. That’s the short-term pain. But the biggest hurdle to overcome for a third-party IP vendor, and EDA vendor, is quality assurance. The whole point of a hierarchical development methodology is faster CDC closure with no loss in quality. The QA load here is going to be big, because no customer is going to want to take the risk if they’ve got a solution that is already working well.”

Some of those issues and fears are expected to be addressed at the upcoming DVCon conference. “We will be providing a tutorial on CDC,” says Olopade. “The first 30 minutes covers the basics of CDC for those who haven’t been doing this for the last 10 years. The next hour will talk about the Accellera solution. It will concentrate on those topics which were hotly debated, and we need to help people understand, or carry people along with what we recommend. Then it may become more acceptable and more adoptive.”

Related Reading
Design And Verification Methodologies Breaking Down
As chips become more complex, existing tools and methodologies are stretched to the breaking point.

The post Accellera Preps New Standard For Clock-Domain Crossing appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • NoC Development – Make Or Buy?Frank Schirrmeister
    In the selection and qualification process for semiconductor IP, design teams often consider the cost of in-house development. Network-on-Chip (NoC) IP is no different. In “When Does My SoC Design Need A NoC?” Michael Frank and I argued that most of today’s designs – even less complex ones – can benefit from NoCs. In the blog “Balancing Memory And Coherence: Navigating Modern Chip Architectures,” I discussed the complexity that coherency adds to on-chip interconnect. After I described some of th
     

NoC Development – Make Or Buy?

In the selection and qualification process for semiconductor IP, design teams often consider the cost of in-house development. Network-on-Chip (NoC) IP is no different. In “When Does My SoC Design Need A NoC?” Michael Frank and I argued that most of today’s designs – even less complex ones – can benefit from NoCs. In the blog “Balancing Memory And Coherence: Navigating Modern Chip Architectures,” I discussed the complexity that coherency adds to on-chip interconnect. After I described some of the steps of NoC development based on what ChatGPT 3.5 recommended in “Shortening Network-On-Chip Development Schedules Using Physical Awareness,” it’s time to look at more detail at the development efforts that design teams would have to invest to develop coherent NoCs from scratch.

ChatGPT, here we go again!

The prompt “Tell me how to develop an optimized network-on-chip for semiconductor design, considering the aspects of cache coherency” gives an excellent starting point in ChatGPT 4.0.

Understanding Protocols: First, one needs to understand cache coherency protocols. The recommendation is to study existing protocols before selecting one. Specifically, understand existing cache coherency protocols like MESI (Modified, Exclusive, Shared, Invalid), MOESI (Modified, Owned, Exclusive, Shared, Invalid), and directory-based protocols. Analyze their strengths and weaknesses in terms of scalability, latency, and bandwidth requirements. Then, choose a protocol that aligns with your performance goals and the scale of your NoC. Directory-based protocols are often preferred for larger-scale systems due to their scalability.

ChatGPT’s recommendation for the first step is a good start. I previously discussed the complexity of specific protocols like AMBA AXI, APB, ACE, CHI, OCP, CXL, and TileLink in “Design Complexity In The Golden Age Of Semiconductors.” One must read several thousand pages of documentation to understand the options here. And – by the way – these are orthogonal to the MESI/MOESI commentary from ChatGPT above, as these are implementation choices. In a practical scenario, many of these aspects depend on the building blocks the design team wants to license, like processors from the Arm, RISC-V, Arc, Tensilica, CEVA, and other ecosystems, as well as the protocol support in design IP blocks (think PCIe, UCIe, LPDDR) and accelerators for AI/ML.

NoC Architecture Design: Second, ChatGPT recommends focusing on NoC architecture design. Decide on the NoC topology (e.g., mesh, torus, tree, or custom designs) based on the expected traffic pattern and the scalability requirements. Each topology has its specific advantages, as my colleague Andy Nightingale recently explained here. Furthermore, teams must design efficient routers to handle the chosen cache coherency protocol with minimal latency, implementing features like virtual channels to avoid deadlock and increase throughput. The final part of this step involves optimizing the network for bandwidth and latency by tuning the buffer sizes, employing efficient routing algorithms, and optimizing link widths and speeds.

Cache Coherency Mechanism Integration: Next up, ChatGPT recommends integrating the actual mechanisms of cache coherency. Integrating the cache coherency mechanism with the NoC involves efficient propagation of coherency messages (e.g., invalidate, update) across the network with minimal latency. Designing an efficient directory structure for directory-based protocols that can scale with the network and minimize the coherency traffic requires careful considerations of the placement of directories and the granularity of coherence (e.g., block-level vs. cache-line level).

By the way, for my query, it leaves out the option to handle coherency fully in software.

Simulation and Analysis: At this point, ChatGPT correctly recommends using simulation tools to model your NoC design and evaluate its performance under various workloads. Tools like Gem5, NS-3, or custom simulators can be helpful. I would add SystemC models to the arsenal of tools design teams working on this from scratch could use. Teams need to analyze key performance metrics such as latency, throughput, and energy consumption and pay special attention to the performance of the cache coherency mechanisms.

The last bit is indeed critical as for coherent interconnects, the cost of a cache miss is drastically different from a cache hit.

Optimization and Scaling: This recommendation includes implementing adaptive routing algorithms and dynamic power management techniques to optimize performance and energy efficiency under varying workloads and ensuring the design can scale by adding more cores. This might involve modular design principles or hierarchical NoC structures.

Correct. But, in all practicality, at this point during the project, a lot of time has passed without writing a single line of RTL. Management will ask, “What’s up here?” So, some RTL coding has already happened at this point. Iterations happen fast. Engineers will blame marketing quickly for iterative feature change requests like adding/removing interfaces, changing user bits, Quality of Service (QoS) requirements, address maps, safety needs, buffering, probes, interrupts, modules, etc. All of these can cause significant changes to the RTL. At this point, the consideration has sometimes not started yet that the floorplan can cause more issues because of interface location, blockages, and fences.

Prototyping and Testing: Next, the recommendation is to use FPGA-based prototyping to validate your NoC design in hardware and to test the NoC in the context of the entire system, including the processor cores, memory hierarchy, and peripherals, to identify any issues with the cache coherency mechanism or other components.

True. Emulation and FPGA-based prototyping have become standard steps for most complex designs today. And especially the aspects of cache coherency in the context of the overall system and its software require very long test sequences.

Iterative Design and Feedback: The last recommendation is to use feedback from the simulation, prototyping, and testing phases to refine the NoC design iteratively and benchmark the final design using standard benchmark suites relevant to your target application domain to ensure that it meets the desired performance and efficiency goals.

The cost of “make”

Bar hiring a team of architects with relevant NoC development experience, the first five steps of Understanding Protocols, NoC Architecture Design, Cache Coherency Mechanism Integration, Simulation & Analysis, and Optimization & Scaling will take significant learning time, and writing the implementation specs is far from trivial.

Then, teams will spend most of the effort on RTL development and verification. Just imagine writing RTL protocol adapters for AMBA CHI-E, CHI-B, ACE, ACE-LITE, and AXI – connecting tens of IP blocks coherently – to address coherent and IO coherent use models. Even if you can reuse VIP from EDA vendors to check the protocol correctness, the effort is significant just for unit verification, as you will run thousands of tests.

For the actual interconnect, whether you use a heterogenous, ring, or mesh topology, the effort for development is significant. The logic that deals with directories to enable cache coherency can be complicated. And any change requests require, of course, re-coding!

Finally, when integrating everything in the system context, the effort to validate integration issues, including bring-up in emulation and associate debug, consumes another chunk of effort.

Our customers tell us that when it is all said and done, they would easily spend over 50 person-years just on coherent NoC development for complex designs.

Network-on-chip automation for productivity and configurability.

Automation potential: What to expect from coherent NoC IP

There is a lot of automation potential in the seven steps above!

  • The various relevant protocols can be captured in a library of protocol converters, reducing the need to internalize and implement all the protocols the reused IP blocks speak of. Ideally, they would already be pre-validated with popular IP blocks from leading IP vendors – think providers of Arm and RISC-V ISAs and vendors for interface blocks like LPDDR, PCIe, UCIe, etc., or graphics and AI/ML accelerators.
  • Graphical user interfaces and scripting APIs increase productivity in developing NoC architectures and topologies.
  • Like protocol converters, reusable blocks for directory and cache coherency management can increase development productivity. Their verification is especially critical, so ideally, your vendor has pre-verified them using VIP from EDA vendors and pre-validated the system integration with the ecosystem (think processor providers).
  • The refinement loop is probably the most critical one to optimize. Refinement can be deadly in manual scenarios. Besides reusable building blocks, you should look for configuration tools to automatically create performance models for architectural analysis, export new RTL configurations, and directly connect to digital implementation flows.

The verdict: Make or buy?

The illustration above summarizes some of the automation potential for NoCs. Saving more than 50 person-years is an attractive option compared to developing NoC IP from scratch. Check out what Arteris does in this domain with its Ncore Cache Coherent Interconnect IP and FlexNoC 5 Interconnect IP for non-coherent designs.

The post NoC Development – Make Or Buy? appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Why Chiplets Are So Critical In AutomotiveJohn Koon
    Chiplets are gaining renewed attention in the automotive market, where increasing electrification and intense competition are forcing companies to accelerate their design and production schedules. Electrification has lit a fire under some of the biggest and best-known carmakers, which are struggling to remain competitive in the face of very short market windows and constantly changing requirements. Unlike in the past, when carmakers typically ran on five- to seven-year design cycles, the latest
     

Why Chiplets Are So Critical In Automotive

Od: John Koon
20. Únor 2024 v 09:10

Chiplets are gaining renewed attention in the automotive market, where increasing electrification and intense competition are forcing companies to accelerate their design and production schedules.

Electrification has lit a fire under some of the biggest and best-known carmakers, which are struggling to remain competitive in the face of very short market windows and constantly changing requirements. Unlike in the past, when carmakers typically ran on five- to seven-year design cycles, the latest technology in vehicles today may well be considered dated within several years. And if they cannot keep up, there is a whole new crop of startups producing cheap vehicles with the ability to update or change out features as quickly as a software update.

But software has speed, security, and reliability limitations, and being able to customize the hardware is where many automakers are now putting their efforts. This is where chiplets fit in, and the focus now is on how to build enough interoperability across large ecosystems to make this a plug-and-play market. The key factors to enable automotive chiplet interoperability include standardization, interconnect technologies, communication protocols, power and thermal management, security, testing, and ecosystem collaboration.

Similar to non-automotive applications at the board level, many design efforts are focusing on a die-to-die approach, which is driving a number of novel design considerations and tradeoffs. At the chip level, the interconnects between various processors, chips, memory, and I/O are becoming more complex due to increased design performance requirements, spurring a flurry of standards activities. Different interconnect and interface types have been proposed to serve varying purposes, while emerging chiplet technologies for dedicated functions — processors, memories, and I/Os, to name a few — are changing the approach to chip design.

“There is a realization by automotive OEMs that to control their own destiny, they’re going to have to control their own SoCs,” said David Fritz, vice president of virtual and hybrid systems at Siemens EDA. “However, they don’t understand how far along EDA has come since they were in college in 1982. Also, they believe they need to go to the latest process node, where a mask set is going to cost $100 million. They can’t afford that. They also don’t have access to talent because the talent pool is fairly small. With all that together comes the realization by the OEMs that to control their destiny, they need a technology that’s developed by others, but which can be combined however needed to have a unique differentiated product they are confident is future-proof for at least a few model years. Then it becomes economically viable. The only thing that fits the bill is chiplets.”

Chiplets can be optimized for specific functions, which can help automakers meet reliability, safety, security requirements with technology that has been proven across multiple vehicle designs. In addition, they can shorten time to market and ultimately reduce the cost of different features and functions.

Demand for chips has been on the rise for the past decade. According to Allied Market Research, global automotive chip demand will grow from $49.8 billion in 2021 to $121.3 billion by 2031. That growth will attract even more automotive chip innovation and investment, and chiplets are expected to be a big beneficiary.

But the marketplace for chiplets will take time to mature, and it will likely roll out in phases.  Initially, a vendor will provide different flavors of proprietary dies. Then, partners will work together to supply chiplets to support each other, as has already happened with some vendors. The final stage will be universally interoperable chiplets, as supported by UCIe or some other interconnect scheme.

Getting to the final stage will be the hardest, and it will require significant changes. To ensure interoperability, large enough portions of the automotive ecosystem and supply chain must come together, including hardware and software developers, foundries, OSATs, and material and equipment suppliers.

Momentum is building
On the plus side, not all of this is starting from scratch. At the board level, modules and sub-systems always have used onboard chip-to-chip interfaces, and they will continue to do so. Various chip and IP providers, including Cadence, Diode, Microchip, NXP, Renesas, Rambus, Infineon, Arm, and Synopsys, provide off-the-shelf interface chips or IP to create the interface silicon.

The Universal Chiplet Interconnect Express (UCIe) Consortium is the driving force behind the die-to-die, open interconnect standard. The group released its latest UCIe 1.1 specification in August 2023. Board members include Alibaba, AMD, Arm, ASE, Google Cloud, Intel, Meta, Microsoft, NVIDIA, Qualcomm, Samsung, and others. Industry partners are showing widespread support. AIB and Bunch of Wires (BoW) also have been proposed. In addition, Arm just released its own Chiplet System Architecture, along with an updated AMBA spec to standardize protocols for chiplets.

“Chiplets are already here, driven by necessity,” said Arif Khan, senior product marketing group director for design IP at Cadence. “The growing processor and SoC sizes are hitting the reticle limit and the diseconomies of scale. Incremental gains from process technology advances are lower than rising cost per transistor and design. The advances in packaging technology (2.5D/3D) and interface standardization at a die-to-die level, such as UCIe, will facilitate chiplet development.”

Nearly all of the chiplets used today are developed in-house by big chipmakers such as Intel, AMD, and Marvell, because they can tightly control the characteristics and behavior of those chiplets. But there is work underway at every level to open this market to more players. When that happens, smaller companies can begin capitalizing on what the high-profile trailblazers have accomplished so far, and innovating around those developments.

“Many of us believe the dream of having an off-the-shelf, interoperable chiplet portfolio will likely take years before becoming a reality,” said Guillaume Boillet, senior director strategic marketing at Arteris, adding that interoperability will emerge from groups of partners who are addressing the risk of incomplete specifications.

This also is raising the attractiveness of FPGAs and eFPGAs, which can provide a level of customization and updates for hardware in the field. “Chiplets are a real thing,” said Geoff Tate, CEO of Flex Logix. “Right now, a company building two or more chiplets can operate much more economically than a company building near-reticle-size die with almost no yield. Chiplet standardization still appears to be far away. Even UCIe is not a fixed standard yet. Not all agree on UCIe, bare die testing, and who owns the problem when the integrated package doesn’t work, etc. We do have some customers who use or are evaluating eFPGA for interfaces where standards are in flux like UCIe. They can implement silicon now and use the eFPGA to conform to standards changes later.”

There are other efforts supporting chiplets, as well, although for somewhat different reasons — notably, the rising cost of device scaling and the need to incorporate more features into chips, which are reticle-constrained at the most advanced nodes. But those efforts also pave the way for chiplets in automotive, and there is strong industry backing to make this all work. For example, under the sponsorship of SEMI, ASME, and three IEEE Societies, the new Heterogeneous Integration Roadmap (HIR) looks at various microelectronics design, materials, and packaging issues to come up with a roadmap for the semiconductor industry. Their current focus includes 2.5D, 3D-ICs, wafer-level packaging, integrated photonics, MEMS and sensors, and system-in-package (SiP), aerospace, automotive, and more.

At the recent Heterogeneous Integration Global Summit 2023, representatives from AMD, Applied Materials, ASE, Lam Research, MediaTek, Micron, Onto Innovation, TSMC, and others demonstrated strong support for chiplets. Another group that supports chiplets is the Chiplet Design Exchange (CDX) working group , which is part of the Open Domain Specific Architecture (ODSA) and the Open Compute Project Foundation (OCP). The Chiplet Design Exchange (CDX) charter focuses on the various characteristics of chiplet and chiplet integration, including electrical, mechanical, and thermal design exchange standards of the 2.5D stacked, and 3D Integrated Circuits (3D-ICs). Its representatives include Ansys, Applied Materials, Arm, Ayar Labs, Broadcom, Cadence, Intel, Macom, Marvell, Microsemi, NXP, Siemens EDA, Synopsys, and others.

“The things that automotive companies want in terms of what each chiplet does in terms of functionality is still in an upheaval mode,” Siemens’ Fritz noted. “One extreme has these problems, the other extreme has those problems. This is the sweet spot. This is what’s needed. And these are the types of companies that can go off and do that sort of work, and then you could put them together. Then this interoperability thing is not a big deal. The OEM can make it too complex by saying, ‘I have to handle that whole spectrum of possibilities.’ The alternative is that they could say, ‘It’s just like a high speed PCIe. If I want to communicate from one to the other, I already know how to do that. I’ve got drivers that are running my operating system. That would solve an awful lot of problems, and that’s where I believe it’s going to end up.”

One path to universal chiplet development?

Moving forward, chiplets are a focal point for both the automotive and chip industries, and that will involve everything from chiplet IP to memory interconnects and customization options and limitations.

For example, Renesas Electronics announced in November 2023 plans for its next-generation SoCs and MCUs. The company is targeting all major applications across the automotive digital domain, including advance information about its fifth-generation R-Car SoC for high-performance applications with advanced in-package chiplet integration technology, which is meant to provide automotive engineers greater flexibility to customize their designs.

Renesas noted that if more AI performance is required in Advanced Driver Assistance Systems (ADAS), engineers will have the capability to integrate AI accelerators into a single chip. The company said this roadmap comes after years of collaboration and discussions with Tier 1 and OEM customers, which have been clamoring for a way to accelerate development without compromising quality, including designing and verifying the software even before the hardware is available.

“Due to the ever increasing needs to increase compute on demand, and the increasing need for higher levels of autonomy in the cars of tomorrow, we see challenges in monolithic solutions scaling and providing the performance needs of the market in the upcoming years,” said Vasanth Waran, senior director for SoC Business & Strategies at Renesas. “Chiplets allows for the compute solutions to scale above and beyond the needs of the market.”

Renesas announced plans to create a chiplet-based product family specifically targeted at the automotive market starting in 2025.

Standard interfaces allow for SoC customization
It is not entirely clear how much overlap there will be between standard processors, which is where most chiplets are used today, and chiplets developed for automotive applications. But the underlying technologies and developments certainly will build off each other as this technology shifts into new markets.

“Whether it is an AI accelerator or ADAS automotive application, customers need standard interface IP blocks,” noted David Ridgeway, senior product manager, IP accelerated solutions group at Synopsys. “It is important to provide fully verified IP subsystems around their IP customization requirements to support the subsystem components used in the customers’ SoCs. When I say customization, you might not realize how customizable IP has become over the course of the last 10 to 20 years, on the PHY side as well as the controller side. For example, PCI Express has gone from PCIe Gen 3 to Gen 4 to Gen 5 and now Gen 6. The controller can be configured to support multiple bifurcation modes of smaller link widths, including one x16, two x8, or four x4. Our subsystem IP team works with customers to ensure all the customization requirements are met. For AI applications, signal and power integrity is extremely important to meet their performance requirements. Almost all our customers are seeking to push the envelope to achieve the highest memory bandwidth speeds possible so that their TPU can process many more transactions per second. Whenever the applications are cloud computing or artificial intelligence, customers want the fastest response rate possible.”

Fig 1: IP blocks including processor, digital, PHY, and verification help developers implement the entire SoC. Source: Synopsys

Fig 1: IP blocks including processor, digital, PHY, and verification help developers implement the entire SoC. Source: Synopsys

Optimizing PPA serves the ultimate goal of increasing efficiency, and this makes chiplets particularly attractive in automotive applications. When UCIe matures, it is expected to improve overall performance exponentially. For example, UCIe can deliver a shoreline bandwidth of 28 to 224 GB/s/mm in a standard package, and 165 to 1317 GB/s/mm in an advanced package. This represents a performance improvement of 20- to 100-fold. Bringing latency down from 20ns to 2ns represents a 10-fold improvement. Around 10 times greater power efficiency, at 0.5 pJ/b (standard package) and 0.25 pJ/b (advanced package), is another plus. The key is shortening the interface distance whenever possible.

To optimize chiplet designs, the UCIe Consortium provides some suggestions:

  • Careful planning consideration of architectural cut-lines (i.e. chiplet boundaries), optimizing for power, latency, silicon area, and IP reuse. For example, customizing one chiplet that needs a leading-edge process node while re-using other chiplets on older nodes may impact cost and time.
  • Thermal and mechanical packaging constraints need to be planned out for package thermal envelopes, hot spots, chiplet placements and I/O routing and breakouts.
  • Process nodes need to be carefully selected, particularly in the context of the associated power delivery scheme.
  • Test strategy for chiplets and packaged/assembled parts need to be developed up front to ensure silicon issues are caught at the chiplet-level testing phase rather than after they are assembled into a package.

Conclusion
The idea of standardizing die-to-die interfaces is catching on quickly but the path to get there will take time, effort, and a lot of collaboration among companies that rarely talk with each other. Building a vehicle takes one determine carmaker. Building a vehicle with chiplets requires an entire ecosystem that includes the developers, foundries, OSATs, and material and equipment suppliers to work together.

Automotive OEMs are experts at putting systems together and at finding innovative ways to cut costs. But it remains to seen how quickly and effectively they can build and leverage an ecosystem of interoperable chiplets to shrink design cycles, improve customization, and adapt to a world in which the leading edge technology may be outdated by the time it is fully designed, tested, and available to consumers.

— Ann Mutschler contributed to this report.

Related Reading
Automotive Relationships Shifting With Chiplets
As the automotive ecosystem balances the best approaches for designing in increasingly advanced features, how companies interact is still evolving.

The post Why Chiplets Are So Critical In Automotive appeared first on Semiconductor Engineering.

❌
❌