Ars Technica - All content
AMD signs $4.9 billion deal to challenge Nvidia’s AI infrastructure lead

By: Financial Times

August 19, 2024 at 22:32

Visitors walk past the AMD booth at the 2024 Mobile World Congress. (credit: CFOTO/Future Publishing via Getty Images)

AMD has agreed to buy artificial intelligence infrastructure group ZT Systems in a $4.9 billion cash and stock transaction, extending a run of AI investments by the chip company as it seeks to challenge market-leader Nvidia.

The California-based group said the acquisition would help accelerate the adoption of its Instinct line of AI data center chips, which compete with Nvidia’s popular graphics processing units (GPUs).

ZT Systems, a private company founded three decades ago, builds custom computing infrastructure for the biggest AI “hyperscalers.” While the company does not disclose its customers, the hyperscalers include the likes of Microsoft, Meta, and Amazon.

Ars Technica - All content
Data centers demand a massive amount of energy. Here’s how some states are tackling the industry’s impact.

By: ProPublica

August 4, 2024 at 13:01

A Google data center in Douglas County, Georgia. (credit: Google)

This article was produced for ProPublica’s Local Reporting Network in partnership with The Seattle Times. Sign up for Dispatches to get stories like this one as soon as they are published.

When lawmakers in Washington set out to expand a lucrative tax break for the state’s data center industry in 2022, they included what some considered an essential provision: a study of the energy-hungry industry’s impact on the state’s electrical grid.

Gov. Jay Inslee vetoed that provision but let the tax break expansion go forward. As The Seattle Times and ProPublica recently reported, the industry has continued to grow and now threatens Washington’s effort to eliminate carbon emissions from electricity generation.

Semiconductor Engineering
Chip Industry Week In Review

By: The SE Staff

June 21, 2024 at 09:01

BAE Systems and GlobalFoundries are teaming up to strengthen the supply of chips for national security programs, aligning technology roadmaps and collaborating on innovation and manufacturing. Focus areas include advanced packaging, GaN-on-silicon chips, silicon photonics, and advanced technology process development.

Onsemi plans to build a $2 billion silicon carbide production plant in the Czech Republic. The site would produce smart power semiconductors for electric vehicles, renewable energy technology, and data centers.

The global chip manufacturing industry is projected to boost capacity by 6% in 2024 and 7% in 2025, reaching 33.7 million 8-inch (200mm) wafers per month, according to SEMI’s latest World Fab Forecast report. Leading-edge capacity for 5nm nodes and below is expected to grow by 13% in 2024, driven by AI demand for data center applications. Additionally, Intel, Samsung, and TSMC will begin producing 2nm chips using gate-all-around (GAA) FETs next year, boosting leading-edge capacity by 17% in 2025.

At the IEEE Symposium on VLSI Technology & Circuits, imec introduced:

Functional CMOS-based CFETs with stacked bottom and top source/drain contacts.
CMOS-based 56Gb/s zero-IF D-band beamforming transmitters to support next-gen short-range, high-speed wireless services at frequencies above 100GHz.
ADCs for base stations and handsets, a key step toward scalable, high-performance beyond-5G solutions, such as cloud-based AI and extended reality apps.

Quick links to more news:

Global
In-Depth
Market Reports
Education and Training
Security
Product News
Research
Events and Further Reading

Global

Wolfspeed postponed plans to construct a $3 billion chip plant in Germany, underscoring the EU’s challenges in boosting semiconductor production, reports Reuters. The North Carolina-based company cited reduced capital spending due to a weakened EV market, saying it now aims to start construction in mid-2025, two years later than originally planned.

Micron is building a pilot production line for high-bandwidth memory (HBM) in the U.S., and considering HBM production in Malaysia to meet growing AI demand, according to a Nikkei report. The company is expanding HBM R&D facilities in Boise, Idaho, and eyeing production capacity in Malaysia, while also enhancing its largest HBM facility in Taichung, Taiwan.

Kioxia restored its Yokkaichi and Kitakami plants in Japan to full capacity, ending production cuts as the memory market recovers, according to Nikkei. The company, which is focusing on NAND flash production, has secured new bank credit support, including refinancing a ¥540 billion loan and establishing a ¥210 billion credit line. Kioxia had reduced output by more than 30% in October 2022 due to weak smartphone demand.

Europe’s NATO Innovation Fund announced its first direct investments, which include semiconductor materials. Twenty-three NATO allies co-invested in this fund of more than $1B, devoted to addressing critical defense and security challenges.

The second meeting of the U.S.–India Initiative on Critical and Emerging Technology (iCET) was held in New Delhi, with various funding and initiatives announced to support semiconductor technology, next-gen telecommunications, connected and autonomous vehicles, ML, and more.

Amazon announced investments of €10 billion in Germany to drive innovation and support the expansion of its logistics network and cloud infrastructure.

Quantum Machines opened the Israeli Quantum Computing Center (IQCC) research facility, backed by the Israel Innovation Authority and located at Tel Aviv University. Also, Israel-based Classiq is collaborating with NVIDIA and BMW, using quantum computing to find the optimal automotive architecture of electrical and mechanical systems.

Global data center vacancy rates are at historic lows, and power is becoming less available, according to a Siemens report featured on Broadband Breakfast. The company called for an influx of financing to find new ways to optimize data center technology and sustainability.

In-Depth

Semiconductor Engineering published its Manufacturing, Packaging & Materials newsletter this week, featuring these top stories:

Single Vs. Multi-Patterning Advancements For EUV
Precise Control Of Copper Plating And CMP
Ruthenium Interconnects On Tap
Opportunities Grow For GPU Acceleration

More reporting this week:

IC Industry’s Growing Role In Sustainability
What’s Missing In Test

Market Reports

Renesas completed its acquisition of Transphorm and will immediately start offering GaN-based power products and reference designs to meet the demand for wide-bandgap (WBG) chips.

Revenues for the top five wafer fab equipment (WFE) companies fell 9% YoY in Q1 2024, according to Counterpoint. This was offset partially by increased demand for NAND and DRAM, which increased 33% YoY, and strong growth in sales to China, which were up 116% YoY.

The SiC power devices industry saw robust growth in 2023, primarily driven by the BEV market, according to TrendForce. The top five suppliers, led by ST with a 32.6% market share and onsemi in second place, accounted for 91.9% of total revenue. However, the anticipated slowdown in BEV sales and weakening industrial demand are expected to significantly decelerate revenue growth in 2024.

About 30% of vehicles produced globally will have E/E architectures with zonal controllers by 2032, according to McKinsey & Co. The market for automotive micro-components and logic semiconductors is predicted to reach $60 billion in 2032, and the overall automotive semiconductor market is expected to grow from $60 billion to $140 billion in the same period, at a 10% CAGR.

The automotive processor market generated US$20 billion in revenue in 2023, according to Yole. US$7.8 billion was from APUs and FPGAs and $12.2 billion was from MCUs. The ADAS and infotainment processors market was worth US$7.8 billion in 2023 and is predicted to grow to $16.4 billion by 2029 at a 13% CAGR. The market for ADAS sensing is expected to grow at a 7% CAGR.
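As a quick sanity check on the ADAS and infotainment figures above, the compound-growth formula can be applied directly. This is a minimal sketch, not Yole's methodology: the function name `project` is ours, and treating 2023 to 2029 as six compounding years is an assumption about how the forecast counts periods.

```python
def project(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


YEARS = 2029 - 2023  # assumed number of compounding periods

# ADAS and infotainment processors: US$7.8B in 2023, quoted 13% CAGR.
adas_2029 = project(7.8, 0.13, YEARS)
# Rate implied by the quoted endpoints (US$7.8B to US$16.4B).
rate = implied_cagr(7.8, 16.4, YEARS)

print(f"Projected 2029 market at 13%: ${adas_2029:.1f}B")
print(f"CAGR implied by the endpoints: {rate:.1%}")
```

The projection lands slightly below the quoted $16.4 billion, which is what one would expect if the 13% figure is a rounded rate.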

Security

The CHERI Alliance was established to drive adoption of memory safety and scalable software compartmentalization via the security technology CHERI, or Capability Hardware Enhanced RISC Instructions. Founding members include Capabilities Limited, Codasip, the FreeBSD Foundation, lowRISC, SCI Semiconductor, and the University of Cambridge.

In security research:

Japan and China researchers explored a NAND-XOR ring oscillator structure to design an entropy source architecture for a true random number generator (TRNG).
University of Toronto and Carleton University researchers presented a survey examining how hardware is applied to achieve security and how reported attacks have exploited certain defects in hardware.
University of North Texas and Texas Woman’s University researchers explored the potential of hardware security primitive Physical Unclonable Functions (PUF) for mitigation of visual deepfakes.
Villanova University researchers proposed the Boolean DERIVativE attack, which generalizes Boolean domain leakage.

Post-quantum cryptography firm PQShield raised $37 million in Series B funding.

Former OpenAI executive, Ilya Sutskever, who quit over safety concerns, launched Safe Superintelligence Inc. (SSI).

EU industry groups warned the European Commission that its proposed cybersecurity certification scheme (EUCS) for cloud services should not discriminate against Amazon, Google, and Microsoft, reported Reuters.

Cyber Europe tested EU cyber preparedness in the energy sector by simulating a series of large-scale cyber incidents in an exercise organized by the European Union Agency for Cybersecurity (ENISA).

The Cybersecurity and Infrastructure Security Agency (CISA) issued a number of alerts/advisories.

Education and Training

New York non-profit NY CREATES and South Korea’s National Nano Fab Center partnered to develop a hub for joint research, aligned technology services, testbed support, and an engineer exchange program to bolster chips-centered R&D, workforce development, and each nation’s high-tech ecosystem.

New York and the Netherlands agreed on a partnership to promote sustainability within the semiconductor industry, enhance workforce development, and boost semiconductor R&D.

Rapidus is set to send 200 engineers to AI chip developer Tenstorrent in the U.S. for training over the next five years, reports Nikkei. This initiative, led by Japan’s Leading-edge Semiconductor Technology Center (LSTC), aims to bolster Japan’s AI chip industry.

Product News

UMC announced its 22nm embedded high voltage (eHV) technology platform for premium smartphone and mobile device displays. The 22eHV platform reduces core device power consumption by up to 30% compared to previous 28nm processes. Die area is reduced by 10% with the industry’s smallest SRAM bit cells.

Alphawave Semi announced a new 9.2 Gbps HBM3E sub-system silicon platform capable of 1.2 terabytes per second. Based on the HBM3E IP, the sub-system is aimed at addressing the demand for ultra-high-speed connectivity in high-performance compute applications.
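The two numbers in the Alphawave Semi item can be related with simple arithmetic. The sketch below assumes a 1,024-bit-wide stack interface, which is the usual width for HBM3-class memory but is not stated in the announcement; the constant and function names are ours.

```python
PIN_RATE_GBPS = 9.2          # per-pin data rate quoted above, in Gbit/s
INTERFACE_WIDTH_BITS = 1024  # assumed HBM3-class interface width (not in the source)


def stack_bandwidth_gbytes(pin_rate_gbps: float, width_bits: int) -> float:
    """Aggregate bandwidth in GB/s: per-pin rate times pin count, divided by 8 bits per byte."""
    return pin_rate_gbps * width_bits / 8.0


bw = stack_bandwidth_gbytes(PIN_RATE_GBPS, INTERFACE_WIDTH_BITS)
print(f"{bw:.1f} GB/s, or about {bw / 1000:.2f} TB/s per stack")
```

Under that assumption the result rounds to the 1.2 terabytes per second quoted for the platform.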

Movellus introduced the Aeonic Power product family for on-die voltage regulation, targeting the challenging area of power delivery.

Cadence partnered with Semiwise and sureCore to develop new cryogenic CMOS circuits with possible quantum computing applications. The circuits are based on modified transistors found in the Cadence Spectre Simulation Platform and are capable of processing analog, mixed-signal, and digital circuit simulation and verification at cryogenic temperatures.

Renesas launched R-Car Open Access (RoX), an integrated development platform for software-defined vehicles (SDVs), designed for Renesas R-Car SoCs and MCUs with tools for deployment of AI applications, reducing complexity and saving time and money for car OEMs and Tier 1s.

Infineon released industry-first radiation-hardened 1 and 2 Mb parallel interface ferroelectric-RAM (F-RAM) nonvolatile memory devices, with up to 120 years of data retention at 85 degrees Celsius, along with random access and full memory write at bus speeds. The company also released a CoolGaN Transistor 700 V G4 product family for efficient power conversion up to 700 V, ideal for consumer chargers and notebook adapters, data center power supplies, renewable energy inverters, and more.

Ansys adopted NVIDIA’s Omniverse application programming interfaces for its multi-die chip designers. Those APIs will be used for 5G/6G, IoT, AI/ML, cloud computing, and autonomous vehicle applications. The company also announced ConceptEV, a SaaS solution for automotive concept design for EVs.

Fig. 1: Field visualization of 3D-IC with Omniverse. Source: Ansys

QP Technologies announced a new dicing saw for its manufacturing line that can process a full cassette of 300mm wafers 7% faster than existing tools, improving throughput and productivity.

NXP introduced its SAF9xxx family of audio DSPs to support the demand for AI-based audio in software-defined vehicles (SDVs) by using Cadence’s Tensilica HiFi 5 DSPs combined with dedicated neural-network engines and hardware-based accelerators.

Avionyx, a provider of software lifecycle engineering in the aerospace and safety-critical systems sector, partnered with Siemens and will leverage its Polarion application lifecycle management (ALM) tool. Also, Dovetail Electric Aviation adopted Siemens Xcelerator to support sustainable aviation.

Research

Researchers from imec and KU Leuven released a 70+ page paper, “Selecting Alternative Metals for Advanced Interconnects,” addressing interconnect resistance and reliability.

A comprehensive review article — “Future of plasma etching for microelectronics: Challenges and opportunities” — was created by a team of experts from the University of Maryland, Lam Research, IBM, Intel, and many others.

Researchers from the Institut Polytechnique de Paris’s Laboratory of Condensed Matter for Physics developed an approach to investigate defects in semiconductors. The team “determined the spin-dependent electronic structure linked to defects in the arrangement of semiconductor atoms,” the first time this structure has been measured, according to a release.

Lawrence Berkeley National Laboratory-led researchers developed a small enclosed chamber that can hold all the components of an electrochemical reaction, which can be paired with transmission electron microscopy (TEM) to generate precise views of a reaction at atomic scale, and can be frozen to stop the reaction at specific time points. They used the technique to study a copper catalyst.

The Food and Drug Administration (FDA) approved a clinical trial to test a device with 1,024 nanoscale sensors that records brain activity during surgery, developed by engineers at the University of California San Diego (UC San Diego).

Events and Further Reading

Find upcoming chip industry events here, including:

Event	Date	Location
Standards for Chiplet Design with 3DIC Packaging (Part 2)	Jun 21	Online
DAC 2024	Jun 23 – 27	San Francisco
RISC-V Summit Europe 2024	Jun 24 – 28	Munich
Leti Innovation Days 2024	Jun 25 – 27	Grenoble, France
ISCA 2024	Jun 29 – Jul 3	Buenos Aires, Argentina
SEMICON West	Jul 9 – 11	San Francisco
Flash Memory Summit	Aug 6 – 8	Santa Clara, CA
USENIX Security Symposium	Aug 14 – 16	Philadelphia, PA
Hot Chips 2024	Aug 25 – 27	Stanford University

Semiconductor Engineering’s latest newsletters:

Automotive, Security and Pervasive Computing
Systems and Design
Low Power-High Performance
Test, Measurement and Analytics
Manufacturing, Packaging and Materials

IEEE Spectrum
How to Put a Data Center in a Shoebox

By: Anna Herr

May 15, 2024 at 17:00

Scientists have predicted that by 2040, almost 50 percent of the world’s electric power will be used in computing. What’s more, this projection was made before the sudden explosion of generative AI. The amount of computing resources used to train the largest AI models has been doubling roughly every 6 months for more than the past decade. At this rate, by 2030 training a single artificial-intelligence model would take one hundred times as much computing resources as the combined annual resources of the current top ten supercomputers. Simply put, computing will require colossal amounts of power, soon exceeding what our planet can provide.
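To get a feel for what a six-month doubling time implies, the growth factor can be computed directly. This is a minimal sketch: the function name is ours, and taking 2024 (the year of publication) as the starting point is an assumption, since the paragraph does not name a baseline year.

```python
def growth_factor(years: float, doubling_period_years: float = 0.5) -> float:
    """Multiplicative growth after `years` when the quantity doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Assumed baseline: 2024. Six years at two doublings per year is twelve doublings.
factor = growth_factor(2030 - 2024)
print(f"Training compute in 2030 relative to 2024: {factor:,.0f}x")
```

Twelve doublings multiply the requirement several thousand times over, which is why a steady exponential trend overtakes any fixed pool of computing resources within a few years.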

One way to manage the unsustainable energy requirements of the computing sector is to fundamentally change the way we compute. Superconductors could let us do just that.

Superconductors offer the possibility of drastically lowering energy consumption because they do not dissipate energy when passing current. True, superconductors work only at cryogenic temperatures, requiring some cooling overhead. But in exchange, they offer virtually zero-resistance interconnects, digital logic built on ultrashort pulses that require minimal energy, and the capacity for incredible computing density due to easy 3D chip stacking.

Are the advantages enough to overcome the cost of cryogenic cooling? Our work suggests they most certainly are. As the scale of computing resources gets larger, the marginal cost of the cooling overhead gets smaller. Our research shows that starting at around 10¹⁶ floating-point operations per second (tens of petaflops) the superconducting computer handily becomes more power efficient than its classical cousin. This is exactly the scale of typical high-performance computers today, so the time for a superconducting supercomputer is now.

At Imec, we have spent the past two years developing superconducting processing units that can be manufactured using standard CMOS tools. A processor based on this work would be one hundred times as energy efficient as the most efficient chips today, and it would lead to a computer that fits a data-center’s worth of computing resources into a system the size of a shoebox.

The Physics of Energy-Efficient Computation

Superconductivity—that superpower that allows certain materials to transmit electricity without resistance at low enough temperatures—was discovered back in 1911, and the idea of using it for computing has been around since the mid-1950s. But despite the promise of lower power usage and higher compute density, the technology couldn’t compete with the astounding advance of CMOS scaling under Moore’s Law. Research has continued through the decades, with a superconducting CPU demonstrated by a group at Yokohama National University as recently as 2020. However, as an aid to computing, superconductivity has stayed largely confined to the laboratory.

To bring this technology out of the lab and toward a scalable design that stands a chance of being competitive in the real world, we had to change our approach here at Imec. Instead of inventing a system from the bottom up—that is, starting with what works in a physics lab and hoping it is useful—we designed it from the top down—starting with the necessary functionality, and working directly with CMOS engineers and a full-stack development team to ensure manufacturability. The team worked not only on a fabrication process, but also software architectures, logic gates, and standard-cell libraries of logic and memory elements to build a complete technology.

The foundational ideas behind energy-efficient computation, however, were developed as far back as 1991. In conventional processors, much of the power consumed and heat dissipated comes from moving information among logic units, or between logic and memory elements, rather than from actual operations. Interconnects made of superconducting material, however, do not dissipate any energy. The wires have zero electrical resistance, and therefore, little energy is required to move bits within the processor. This property of having extremely low energy losses holds true even at very high communication frequencies, where losses would skyrocket in ordinary interconnects.

Further energy savings come from the way logic is done inside the superconducting computer. Instead of the transistor, the basic element in superconducting logic is the Josephson junction.

A Josephson junction is a sandwich—a thin slice of insulating material squeezed between two superconductors. Connect the two superconductors, and you have yourself a Josephson-junction loop.

Under normal conditions, the insulating “meat” in the sandwich is so thin that it does not deter a supercurrent; the whole sandwich just acts as a superconductor. However, if you ramp up the current past a threshold known as a critical current, the superconducting “bread slices” around the insulator get briefly knocked out of their superconducting state. In this transition period, the junction emits a tiny voltage pulse, lasting just a picosecond and dissipating just 2 × 10⁻²⁰ joules, a hundred-billionth of what it takes to write a single bit of information into conventional flash memory.

A single flux quantum develops in a Josephson-junction loop via a three-step process. First, a current just above the critical value is passed through the junction. The junction then emits a single-flux-quantum voltage pulse. The voltage pulse passes through the inductor, creating a persistent current in the loop. A Josephson junction is indicated by an x on circuit diagrams. (credit: Chris Philpot)

The key is that, due to a phenomenon called magnetic flux quantization in the superconducting loop, this pulse is always exactly the same. It is known as a “single flux quantum” (SFQ) of magnetic flux, and it is fixed to have a value of 2.07 millivolt-picoseconds. Put an inductor inside the Josephson-junction loop, and the voltage pulse drives a current. Since the loop is superconducting, this current will continue going around the loop indefinitely, without using any further energy.
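The 2.07 millivolt-picosecond figure follows from fundamental constants: the magnetic flux quantum is Planck's constant divided by twice the elementary charge. The sketch below reproduces it and, as a rough cross-check, applies the common textbook estimate that one switching event dissipates about the critical current times the flux quantum. That estimate and the resulting critical current are our assumptions, not figures from the article.

```python
PLANCK_H = 6.62607015e-34            # J*s, exact by SI definition
ELEMENTARY_CHARGE = 1.602176634e-19  # C, exact by SI definition

# Flux quantum in webers (1 Wb = 1 V*s).
flux_quantum_wb = PLANCK_H / (2.0 * ELEMENTARY_CHARGE)

# 1 mV*ps = 1e-3 V * 1e-12 s = 1e-15 V*s.
flux_quantum_mv_ps = flux_quantum_wb / 1e-15
print(f"Flux quantum: {flux_quantum_mv_ps:.2f} mV*ps")

# Rough estimate (assumption): energy per pulse ~ critical current * flux quantum.
PULSE_ENERGY_J = 2e-20  # per-pulse dissipation quoted in the article
implied_critical_current_a = PULSE_ENERGY_J / flux_quantum_wb
print(f"Implied critical current: {implied_critical_current_a * 1e6:.1f} microamps")
```

Under that estimate, the quoted pulse energy corresponds to a critical current on the order of ten microamps.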

Logical operations inside the superconducting computer are made by manipulating these tiny, quantized voltage pulses. A Josephson-junction loop with an SFQ’s worth of persistent current acts as a logical 1, while a current-free loop is a logical 0.

To store information, the Josephson-junction-based version of SRAM in CPU cache also uses single flux quanta. To store one bit, two Josephson-junction loops need to be placed next to each other. An SFQ’s worth of persistent current in the left-hand loop is a memory element storing a logical 0, whereas no current in the left but a current in the right loop is a logical 1.
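The two encodings described above can be written down as a small model. This is only an illustrative sketch of the bit conventions, not a circuit simulation: the `Loop` class and function names are hypothetical, and treating "both loops" or "neither loop" holding a flux quantum as invalid is our assumption, since the text defines only the two valid states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    """A Josephson-junction loop; has_flux is True when it holds one SFQ of persistent current."""
    has_flux: bool


def logic_value(loop: Loop) -> int:
    """Logic convention: a loop with persistent current is 1, a current-free loop is 0."""
    return 1 if loop.has_flux else 0


def write_memory(bit: int) -> tuple[Loop, Loop]:
    """Memory convention: flux in the left loop stores 0, flux in the right loop stores 1."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (Loop(bit == 0), Loop(bit == 1))


def read_memory(left: Loop, right: Loop) -> int:
    """Decode a two-loop cell; states with zero or two flux quanta are treated as invalid (assumption)."""
    if left.has_flux == right.has_flux:
        raise ValueError("invalid cell: exactly one loop should hold a flux quantum")
    return 1 if right.has_flux else 0


print(logic_value(Loop(True)), logic_value(Loop(False)))
print(read_memory(*write_memory(0)), read_memory(*write_memory(1)))
```

Note the difference between the two conventions: a logic 0 is the absence of a flux quantum, while a memory cell always holds exactly one flux quantum and encodes the bit in its position.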

Designing a superconductor-based data center required full-stack innovation. Imec’s board design contains three main elements: the input and output, leading data to the room-temperature world; the conventional DRAM, stacked high and cooled to 77 kelvins; and the superconducting processing units, also stacked, and cooled to 4 K. Inside the superconducting processing unit, basic logic and memory elements are laid out to perform computations. A magnification of the chip shows the basic building blocks: For logic, a Josephson-junction loop without a persistent current indicates a logical 0, while a loop with one single flux quantum’s worth of current represents a logical 1. For memory, two Josephson-junction loops are connected together. An SFQ’s worth of persistent current in the left loop is a memory 0, and a current in the right loop is a memory 1. (credit: Chris Philpot)

Progress Through Full-Stack Development

To go from a lab curiosity to a chip prototype ready for fabrication, we had to innovate the full stack of hardware. This came in three main layers: engineering the basic materials used, circuit development, and architectural design. The three layers had to go together—a new set of materials requires new circuit designs, and new circuit designs require novel architectures to incorporate them. Codevelopment across all three stages, with a strict adherence to CMOS manufacturing capabilities, was the key to success.

At the materials level, we had to step away from the previous lab-favorite superconducting material: niobium. While niobium is easy to model and behaves very well under predictable lab conditions, it is very difficult to scale down. Niobium is sensitive to both process temperature and its surrounding materials, so it is not compatible with standard CMOS processing. Therefore, we switched to the related compound niobium titanium nitride for our basic superconducting material. Niobium titanium nitride can withstand temperatures used in CMOS fabrication without losing its superconducting capabilities, and it reacts much less with its surrounding layers, making it a much more practical choice.

The basic building block of superconducting logic and memory is the Josephson junction. At Imec, these junctions have been manufactured using a new set of materials, allowing the team to scale down the technology without losing functionality. Here, a transmission electron microscope image shows a Josephson junction made with alpha-silicon insulator sandwiched between niobium titanium nitride superconductors, achieving a critical dimension of 210 nanometers. (credit: Imec)

Additionally, we employed a new material for the meat layer of the Josephson-junction sandwich—amorphous, or alpha, silicon. Conventional Josephson-junction materials, most notably aluminum oxide, didn’t scale down well. Aluminum was used because it “wets” the niobium, smoothing the surface, and the oxide was grown in a well-controlled manner. However, to get to the ultrahigh densities that we are targeting, we would have to make the oxide too thin to be practically manufacturable. Alpha silicon, in contrast, allowed us to use a much thicker barrier for the same critical current.

We also had to devise a new way to power the Josephson junctions that would scale down to the size of a chip. Previously, lab-based superconducting computers used transformers to deliver current to their circuit elements. However, having a bulky transformer near each circuit element is unworkable. Instead, we designed a way to deliver power to all the elements on the chip at once by creating a resonant circuit, with specialized capacitors interspersed throughout the chip.

At the circuit level, we had to redesign the entire logic and memory structure to take advantage of the new materials’ capabilities. We designed a novel logic architecture that we call pulse-conserving logic. The key requirement for pulse-conserving logic is that the elements have as many inputs as outputs and that the total number of single flux quanta is conserved. The logic is performed by routing the SFQs through a combination of Josephson-junction loops and inductors to the appropriate outputs, resulting in logical ORs and ANDs. To complement the logic architecture, we also redesigned a compatible Josephson-junction-based SRAM.
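One way to read the pulse-conservation requirement is through a two-input, two-output element that emits the OR of its inputs on one output and the AND on the other: the number of pulses going in always equals the number coming out. The sketch below checks that property at the Boolean level. It is an illustration of the stated rule, not Imec's actual cell design, and the function name is hypothetical.

```python
from itertools import product


def pulse_conserving_gate(a: int, b: int) -> tuple[int, int]:
    """Two inputs, two outputs: returns (a OR b, a AND b); each 1 stands for one SFQ pulse."""
    return (a | b, a & b)


for a, b in product((0, 1), repeat=2):
    out_or, out_and = pulse_conserving_gate(a, b)
    # Conservation check: pulses in equals pulses out for every input combination.
    assert a + b == out_or + out_and
    print(f"in=({a}, {b})  OR={out_or}  AND={out_and}  pulses in/out={a + b}/{out_or + out_and}")
```

The check passes for all four input combinations because a + b = (a OR b) + (a AND b) holds for single bits, which is why OR and AND arise naturally as a pair when pulses are routed rather than created or destroyed.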

Lastly, we had to make architectural innovations to take full advantage of the novel materials and circuit designs. Among these was cooling conventional silicon DRAM down to 77 kelvins and designing a glass bridge between the 77-K section and the main superconducting section. The bridge houses thin wires that allow communication without thermal mixing. We also came up with a way of stacking chips on top of each other and are developing vertical superconducting interconnects to link between circuit boards.

A Data Center the Size of a Shoebox

The result is a superconductor-based chip design that’s optimized for AI processing. A zoom in on one of its boards reveals many similarities with a typical 3D CMOS system-on-chip. The board is populated by computational chips, which we call superconductor processing units (SPUs), with embedded superconducting SRAM, DRAM memory stacks, and switches, all interconnected on silicon interposer or on glass-bridge advanced packaging technologies.

But there are also some striking differences. First, most of the chip is to be submerged in liquid helium for cooling to a mere 4 K. This includes the SPUs and SRAM, which depend on superconducting logic rather than CMOS, and are housed on an interposer board. Next, there is a glass bridge to a warmer area, a balmy 77 K that hosts the DRAM. The DRAM technology is not superconducting, but conventional silicon cooled down from room temperature, making it more efficient. From there, bespoke connectors lead data to and from the room-temperature world.

(illustration credit: Davide Comai)

Moore’s law relies on fitting progressively more computing resources into the same space. As scaling down transistors gets more and more difficult, the semiconductor industry is turning toward 3D stacking of chips to keep up the density gains. In classical CMOS-based technology, it is very challenging to stack computational chips on top of each other because of the large amount of power, and therefore heat, that is dissipated within the chips. In superconducting technology, the little power that is dissipated is easily removed by the liquid helium. Logic chips can be directly stacked using advanced 3D integration technologies resulting in shorter and faster connections between the chips, and a smaller footprint.

It is also straightforward to stack multiple boards of 3D superconducting chips on top of each other, leaving only a small space between them. We modeled a stack of 100 such boards, all operating within the same cooling environment and contained in a 20- by 20- by 12-centimeter volume, roughly the size of a shoebox. We calculated that this stack can perform 20 exaflops (in BF16 number format), 20 times the capacity of the largest supercomputer today. What’s more, the system promises to consume only 500 kilowatts of total power. This translates to energy efficiency one hundred times as high as the most efficient supercomputer today.
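The headline figures in this paragraph can be turned into a few derived quantities with simple arithmetic. The sketch below uses only the numbers quoted above; the variable names are ours, and splitting performance evenly across boards is an assumption.

```python
TOTAL_FLOPS = 20e18                 # 20 exaflops (BF16), as quoted
TOTAL_POWER_W = 500e3               # 500 kilowatts, as quoted
NUM_BOARDS = 100                    # boards in the modeled stack
DIMENSIONS_CM = (20.0, 20.0, 12.0)  # enclosure dimensions in centimeters

# Energy efficiency in floating-point operations per second per watt.
flops_per_watt = TOTAL_FLOPS / TOTAL_POWER_W

# Enclosure volume: 1 liter = 1,000 cubic centimeters.
volume_liters = DIMENSIONS_CM[0] * DIMENSIONS_CM[1] * DIMENSIONS_CM[2] / 1000.0

# Per-board share, assuming performance is spread evenly (assumption).
flops_per_board = TOTAL_FLOPS / NUM_BOARDS

print(f"Efficiency: {flops_per_watt / 1e12:.0f} teraflops per watt")
print(f"Volume: {volume_liters:.1f} liters")
print(f"Per board: {flops_per_board / 1e15:.0f} petaflops")
```

In other words, the modeled system packs 20 exaflops into under five liters, with each board contributing on the order of a couple hundred petaflops.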

So far, we’ve scaled down Josephson junctions and interconnect dimensions over three succeeding generations. Going forward, Imec’s road map includes tackling 3D superconducting chip-integration and cooling technologies. For the first generation, the road map envisions the stacking of about 100 boards to obtain the target performance of 20 exaflops. Gradually, more and more logic chips will be stacked, and the number of boards will be reduced. This will further increase performance while reducing complexity and cost.

The Superconducting Vision

We don’t envision that superconducting digital technology will replace conventional CMOS computing, but we do expect it to complement CMOS for specific applications and fuel innovations in new ones. For one, this technology would integrate seamlessly with quantum computers that are also built upon superconducting technology. Perhaps more significantly, we believe it will support the growth in AI and machine learning processing and help provide cloud-based training of big AI models in a much more sustainable way than is currently possible.

In addition, with this technology we can engineer data centers with much smaller footprints. Drastically smaller data centers can be placed close to their target applications, rather than being in some far-off football-stadium-size facility.

Such transformative server technology is a dream for scientists. It opens doors to online training of AI models on real data that are part of an actively changing environment. Take potential robotic farms as an example. Today, training these would be a challenging task, where the required processing capabilities are available only in far-away, power-hungry data centers. With compact, nearby data centers, the data could be processed at once, allowing an AI to learn from current conditions on the farm.

Similarly, these miniature data centers can be interspersed in energy grids, learning right away at each node and distributing electricity more efficiently throughout the world. Imagine smart cities, mobile health care systems, manufacturing, farming, and more, all benefiting from instant feedback from adjacent AI learners, optimizing and improving decision making in real time.

This article appears in the June 2024 print issue as “A Data Center in a Shoebox.”

Securing AI In The Data Center

Semiconductor Engineering

From: Bart Stevens

May 9, 2024 at 09:07

AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and inference deployment.

High-value datasets containing sensitive information, such as personal health records or financial transactions, must be shielded from unauthorized access. Robust encryption mechanisms, such as the Advanced Encryption Standard (AES), coupled with secure key management practices, form the foundation of data confidentiality in the data center. Each encryption key must be unique and used only in a secure environment. Because data is encrypted and decrypted constantly, these operations must be performed in a way that prevents key leakage. Should a compromise occur, it should be possible to renew the key securely and re-encrypt the data with the new key.

The encryption key must also be stored securely in a location that unauthorized processes or individuals cannot access. Keys must be protected from attempts to read them from the device and from attempts to steal them using techniques such as side-channel attacks (SCA) or fault injection attacks (FIA). The multitenancy of modern data centers calls for robust SCA protection of key data.

Hardware-level security plays a pivotal role in safeguarding AI within the data center, offering built-in protections against a wide range of threats. Trusted Platform Modules (TPMs), secure enclaves, and Hardware Security Modules (HSMs) provide secure storage and processing environments for sensitive data and cryptographic keys, shielding them from unauthorized access or tampering. By leveraging hardware-based security features, organizations can enhance the resilience of their AI infrastructure and mitigate the risk of attacks targeting software vulnerabilities.

Ideally, secure cryptographic processing is handled by a Root of Trust core. The AI service provider manages the Root of Trust firmware, but it can also load secure applications that customers write to implement their own cryptographic key management and storage. The Root of Trust can be integrated in the host CPU that orchestrates the AI operations, decrypting the AI model and its specific parameters before those are fed to AI or network accelerators (GPUs or NPUs). It can also be directly integrated with the GPUs and NPUs to perform encryption/decryption at that level. These GPUs and NPUs may also choose to store AI workloads and inference models in encrypted form in their local memory banks and decrypt the data on the fly when access is required. Dedicated low-latency inline memory decryption engines based on the AES-XTS algorithm can keep up with the memory bandwidth, ensuring that the process is not slowed down.

AI training workloads are often distributed among dozens of devices connected via PCIe or high-speed networking technology such as 800G Ethernet. An efficient confidentiality and integrity protocol such as MACsec using the AES-GCM algorithm can protect the data in motion over high-speed Ethernet links. AES-GCM engines integrated with the server SoC and the PCIe acceleration boards ensure that traffic is authenticated and optionally encrypted.
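To make the idea of authenticated but unencrypted traffic concrete, here is a minimal software sketch. It is not MACsec and not AES-GCM: it swaps in HMAC-SHA256 from the Python standard library as a stand-in for the authentication tag that a hardware AES-GCM engine would compute. The frame layout, the 16-byte tag length, the counter-based replay check, and the names `protect` and `Receiver` are all assumptions made for illustration.

```python
import hashlib
import hmac
import struct
from typing import Optional

TAG_LEN = 16      # truncated tag length; an arbitrary choice for this sketch
HEADER_LEN = 8    # 64-bit packet number, big-endian (hypothetical layout)


def _tag(key: bytes, header: bytes, payload: bytes) -> bytes:
    """HMAC-SHA256 over header and payload, used here in place of a GCM tag."""
    return hmac.new(key, header + payload, hashlib.sha256).digest()[:TAG_LEN]


def protect(key: bytes, packet_number: int, payload: bytes) -> bytes:
    """Build a frame: packet number, cleartext payload, authentication tag."""
    header = struct.pack(">Q", packet_number)
    return header + payload + _tag(key, header, payload)


class Receiver:
    """Accepts frames only if the tag verifies and the packet number is new."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.highest_pn = 0

    def accept(self, frame: bytes) -> Optional[bytes]:
        if len(frame) < HEADER_LEN + TAG_LEN:
            return None
        header = frame[:HEADER_LEN]
        payload = frame[HEADER_LEN:-TAG_LEN]
        tag = frame[-TAG_LEN:]
        # Constant-time comparison avoids leaking how many tag bytes matched.
        if not hmac.compare_digest(tag, _tag(self.key, header, payload)):
            return None
        (packet_number,) = struct.unpack(">Q", header)
        if packet_number <= self.highest_pn:
            return None  # replayed or stale frame
        self.highest_pn = packet_number
        return payload
```

A receiver created with `Receiver(key)` returns the payload for a valid frame and `None` for a modified or replayed one. The payload travels in the clear here, which corresponds to the authenticate-only case; adding confidentiality would require an actual cipher such as AES-GCM.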

Rambus offers a broad portfolio of security IP covering the key security elements needed to protect AI in the data center. Rambus Root of Trust IP cores ensure a secure boot protocol that protects the integrity of its firmware. This can be combined with Rambus inline memory encryption engines, as well as dedicated solutions for MACsec up to 800G.

The post Securing AI In The Data Center appeared first on Semiconductor Engineering.
