FreshRSS

Zobrazení pro čtení

Jsou dostupné nové články, klikněte pro obnovení stránky.

NIST Announces Post-Quantum Cryptography Standards



Today, almost all data on the Internet, including bank transactions, medical records, and secure chats, is protected with an encryption scheme called RSA (named after its creators Rivest, Shamir, and Adleman). This scheme is based on a simple fact—it is virtually impossible to calculate the prime factors of a large number in a reasonable amount of time, even on the world’s most powerful supercomputer. Unfortunately, large quantum computers, if and when they are built, would find this task a breeze, thus undermining the security of the entire Internet.

Luckily, quantum computers are only better than classical ones at a select class of problems, and there are plenty of encryption schemes where quantum computers don’t offer any advantage. Today, the U.S. National Institute of Standards and Technology (NIST) announced the standardization of three post-quantum cryptography encryption schemes. With these standards in hand, NIST is encouraging computer system administrators to begin transitioning to post-quantum security as soon as possible.

“Now our task is to replace the protocol in every device, which is not an easy task.” —Lily Chen, NIST

These standards are likely to be a big element of the Internet’s future. NIST’s previous cryptography standards, developed in the 1970s, are used in almost all devices, including Internet routers, phones, and laptops, says Lily Chen, head of the cryptography group at NIST who lead the standardization process. But adoption will not happen overnight.

“Today, public key cryptography is used everywhere in every device,” Chen says. “Now our task is to replace the protocol in every device, which is not an easy task.”

Why we need post-quantum cryptography now

Most experts believe large-scale quantum computers won’t be built for at least another decade. So why is NIST worried about this now? There are two main reasons.

First, many devices that use RSA security, like cars and some IoT devices, are expected to remain in use for at least a decade. So they need to be equipped with quantum-safe cryptography before they are released into the field.

“For us, it’s not an option to just wait and see what happens. We want to be ready and implement solutions as soon as possible.” —Richard Marty, LGT Financial Services

Second, a nefarious individual could potentially download and store encrypted data today, and decrypt it once a large enough quantum computer comes online. This concept is called “harvest now, decrypt later“ and by its nature, it poses a threat to sensitive data now, even if that data can only be cracked in the future.

Security experts in various industries are starting to take the threat of quantum computers seriously, says Joost Renes, principal security architect and cryptographer at NXP Semiconductors. “Back in 2017, 2018, people would ask ‘What’s a quantum computer?’” Renes says. “Now, they’re asking ‘When will the PQC standards come out and which one should we implement?’”

Richard Marty, chief technology officer at LGT Financial Services, agrees. “For us, it’s not an option to just wait and see what happens. We want to be ready and implement solutions as soon as possible, to avoid harvest now and decrypt later.”

NIST’s competition for the best quantum-safe algorithm

NIST announced a public competition for the best PQC algorithm back in 2016. They received a whopping 82 submissions from teams in 25 different countries. Since then, NIST has gone through 4 elimination rounds, finally whittling the pool down to four algorithms in 2022.

This lengthy process was a community-wide effort, with NIST taking input from the cryptographic research community, industry, and government stakeholders. “Industry has provided very valuable feedback,” says NIST’s Chen.

These four winning algorithms had intense-sounding names: CRYSTALS-Kyber, CRYSTALS-Dilithium, Sphincs+, and FALCON. Sadly, the names did not survive standardization: The algorithms are now known as Federal Information Processing Standard (FIPS) 203 through 206. FIPS 203, 204, and 205 are the focus of today’s announcement from NIST. FIPS 206, the algorithm previously known as FALCON, is expected to be standardized in late 2024.

The algorithms fall into two categories: general encryption, used to protect information transferred via a public network, and digital signature, used to authenticate individuals. Digital signatures are essential for preventing malware attacks, says Chen.

Every cryptography protocol is based on a math problem that’s hard to solve but easy to check once you have the correct answer. For RSA, it’s factoring large numbers into two primes—it’s hard to figure out what those two primes are (for a classical computer), but once you have one it’s straightforward to divide and get the other.

“We have a few instances of [PQC], but for a full transition, I couldn’t give you a number, but there’s a lot to do.” —Richard Marty, LGT Financial Services

Two out of the three schemes already standardized by NIST, FIPS 203 and FIPS 204 (as well as the upcoming FIPS 206), are based on another hard problem, called lattice cryptography. Lattice cryptography rests on the tricky problem of finding the lowest common multiple among a set of numbers. Usually, this is implemented in many dimensions, or on a lattice, where the least common multiple is a vector.

The third standardized scheme, FIPS 205, is based on hash functions—in other words, converting a message to an encrypted string that’s difficult to reverse

The standards include the encryption algorithms’ computer code, instructions for how to implement it, and intended uses. There are three levels of security for each protocol, designed to future-proof the standards in case some weaknesses or vulnerabilities are found in the algorithms.

Lattice cryptography survives alarms over vulnerabilities

Earlier this year, a pre-print published to the arXiv alarmed the PQC community. The paper, authored by Yilei Chen of Tsinghua University in Beijing, claimed to show that lattice-based cryptography, the basis of two out of the three NIST protocols, was not, in fact, immune to quantum attacks. On further inspection, Yilei Chen’s argument turned out to have a flaw—and lattice cryptography is still believed to be secure against quantum attacks.

On the one hand, this incident highlights the central problem at the heart of all cryptography schemes: There is no proof that any of the math problems the schemes are based on are actually “hard.” The only proof, even for the standard RSA algorithms, is that people have been trying to break the encryption for a long time, and have all failed. Since post-quantum cryptography standards, including lattice cryptogrphay, are newer, there is less certainty that no one will find a way to break them.

That said, the failure of this latest attempt only builds on the algorithm’s credibility. The flaw in the paper’s argument was discovered within a week, signaling that there is an active community of experts working on this problem. “The result of that paper is not valid, that means the pedigree of the lattice-based cryptography is still secure,” says NIST’s Lily Chen (no relation to Tsinghua University’s Yilei Chen). “People have tried hard to break this algorithm. A lot of people are trying, they try very hard, and this actually gives us confidence.”

NIST’s announcement is exciting, but the work of transitioning all devices to the new standards has only just begun. It is going to take time, and money, to fully protect the world from the threat of future quantum computers.

“We’ve spent 18 months on the transition and spent about half a million dollars on it,” says Marty of LGT Financial Services. “We have a few instances of [PQC], but for a full transition, I couldn’t give you a number, but there’s a lot to do.”

The Saga of AD-X2, the Battery Additive That Roiled the NBS



Senate hearings, a post office ban, the resignation of the director of the National Bureau of Standards, and his reinstatement after more than 400 scientists threatened to resign. Who knew a little box of salt could stir up such drama?

What was AD-X2?

It all started in 1947 when a bulldozer operator with a 6th grade education, Jess M. Ritchie, teamed up with UC Berkeley chemistry professor Merle Randall to promote AD-X2, an additive to extend the life of lead-acid batteries. The problem of these rechargeable batteries’ dwindling capacity was well known. If AD-X2 worked as advertised, millions of car owners would save money.

Black and white photo of a man in a suit holding an object in his hands and talking. Jess M. Ritchie demonstrates his AD-X2 battery additive before the Senate Select Committee on Small Business.National Institute of Standards and Technology Digital Collections

A basic lead-acid battery has two electrodes, one of lead and the other of lead dioxide, immersed in dilute sulfuric acid. When power is drawn from the battery, the chemical reaction splits the acid molecules, and lead sulfate is deposited in the solution. When the battery is charged, the chemical process reverses, returning the electrodes to their original state—almost. Each time the cell is discharged, the lead sulfate “hardens” and less of it can dissolve in the sulfuric acid. Over time, it flakes off, and the battery loses capacity until it’s dead.

By the 1930s, so many companies had come up with battery additives that the U.S. National Bureau of Standards stepped in. Its lab tests revealed that most were variations of salt mixtures, such as sodium and magnesium sulfates. Although the additives might help the battery charge faster, they didn’t extend battery life. In May 1931, NBS (now the National Institute of Standards and Technology, or NIST) summarized its findings in Letter Circular No. 302: “No case has been found in which this fundamental reaction is materially altered by the use of these battery compounds and solutions.”

Of course, innovation never stops. Entrepreneurs kept bringing new battery additives to market, and the NBS kept testing them and finding them ineffective.

Do battery additives work?

After World War II, the National Better Business Bureau decided to update its own publication on battery additives, “Battery Compounds and Solutions.” The publication included a March 1949 letter from NBS director Edward Condon, reiterating the NBS position on additives. Prior to heading NBS, Condon, a physicist, had been associate director of research at Westinghouse Electric in Pittsburgh and a consultant to the National Defense Research Committee. He helped set up MIT’s Radiation Laboratory, and he was also briefly part of the Manhattan Project. Needless to say, Condon was familiar with standard practices for research and testing.

Meanwhile, Ritchie claimed that AD-X2’s secret formula set it apart from the hundreds of other additives on the market. He convinced his senator, William Knowland, a Republican from Oakland, Calif., to write to NBS and request that AD-X2 be tested. NBS declined, not out of any prejudice or ill will, but because it tested products only at the request of other government agencies. The bureau also had a longstanding policy of not naming the brands it tested and not allowing its findings to be used in advertisements.

Photo of a product box with directions printed on it. AD-X2 consisted mainly of Epsom salt and Glauber’s salt.National Institute of Standards and Technology Digital Collections

Ritchie cried foul, claiming that NBS was keeping new businesses from entering the marketplace. Merle Randall launched an aggressive correspondence with Condon and George W. Vinal, chief of NBS’s electrochemistry section, extolling AD-X2 and the testimonials of many users. In its responses, NBS patiently pointed out the difference between anecdotal evidence and rigorous lab testing.

Enter the Federal Trade Commission. The FTC had received a complaint from the National Better Business Bureau, which suspected that Pioneers, Inc.—Randall and Ritchie’s distribution company—was making false advertising claims. On 22 March 1950, the FTC formally asked NBS to test AD-X2.

By then, NBS had already extensively tested the additive. A chemical analysis revealed that it was 46.6 percent magnesium sulfate (Epsom salt) and 49.2 percent sodium sulfate (Glauber’s salt, a horse laxative) with the remainder being water of hydration (water that’s been chemically treated to form a hydrate). That is, AD-X2 was similar in composition to every other additive on the market. But, because of its policy of not disclosing which brands it tests, NBS didn’t immediately announce what it had learned.

The David and Goliath of battery additives

NBS then did something unusual: It agreed to ignore its own policy and let the National Better Business Bureau include the results of its AD-X2 tests in a public statement, which was published in August 1950. The NBBB allowed Pioneers to include a dissenting comment: “These tests were not run in accordance with our specification and therefore did not indicate the value to be derived from our product.”

Far from being cowed by the NBBB’s statement, Ritchie was energized, and his story was taken up by the mainstream media. Newsweek’s coverage pitted an up-from-your-bootstraps David against an overreaching governmental Goliath. Trade publications, such as Western Construction News and Batteryman, also published flattering stories about Pioneers. AD-X2 sales soared.

Then, in January 1951, NBS released its updated pamphlet on battery additives, Circular 504. Once again, tests by the NBS found no difference in performance between batteries treated with additives and the untreated control group. The Government Printing Office sold the circular for 15 cents, and it was one of NBS’s most popular publications. AD-X2 sales plummeted.

Ritchie needed a new arena in which to challenge NBS. He turned to politics. He called on all of his distributors to write to their senators. Between July and December 1951, 28 U.S. senators and one U.S. representative wrote to NBS on behalf of Pioneers.

Condon was losing his ability to effectively represent the Bureau. Although the Senate had confirmed Condon’s nomination as director without opposition in 1945, he was under investigation by the House Committee on Un-American Activities for several years. FBI Director J. Edgar Hoover suspected Condon to be a Soviet spy. (To be fair, Hoover suspected the same of many people.) Condon was repeatedly cleared and had the public backing of many prominent scientists.

But Condon felt the investigations were becoming too much of a distraction, and so he resigned on 10 August 1951. Allen V. Astin became acting director, and then permanent director the following year. And he inherited the AD-X2 mess.

Astin had been with NBS since 1930. Originally working in the electronics division, he developed radio telemetry techniques, and he designed instruments to study dielectric materials and measurements. During World War II, he shifted to military R&D, most notably development of the proximity fuse, which detonates an explosive device as it approaches a target. I don’t think that work prepared him for the political bombs that Ritchie and his supporters kept lobbing at him.

Mr. Ritchie almost goes to Washington

On 6 September 1951, another government agency entered the fray. C.C. Garner, chief inspector of the U.S. Post Office Department, wrote to Astin requesting yet another test of AD-X2. NBS dutifully submitted a report that the additive had “no beneficial effects on the performance of lead acid batteries.” The post office then charged Pioneers with mail fraud, and Ritchie was ordered to appear at a hearing in Washington, D.C., on 6 April 1952. More tests were ordered, and the hearing was delayed for months.

Back in March 1950, Ritchie had lost his biggest champion when Merle Randall died. In preparation for the hearing, Ritchie hired another scientist: Keith J. Laidler, an assistant professor of chemistry at the Catholic University of America. Laidler wrote a critique of Circular 504, questioning NBS’s objectivity and testing protocols.

Ritchie also got Harold Weber, a professor of chemical engineering at MIT, to agree to test AD-X2 and to work as an unpaid consultant to the Senate Select Committee on Small Business.

Life was about to get more complicated for Astin and NBS.

Why did the NBS Director resign?

Trying to put an end to the Pioneers affair, Astin agreed in the spring of 1952 that NBS would conduct a public test of AD-X2 according to terms set by Ritchie. Once again, the bureau concluded that the battery additive had no beneficial effect.

However, NBS deviated slightly from the agreed-upon parameters for the test. Although the bureau had a good scientific reason for the minor change, Ritchie had a predictably overblown reaction—NBS cheated!

Then, on 18 December 1952, the Senate Select Committee on Small Business—for which Ritchie’s ally Harold Weber was consulting—issued a press release summarizing the results from the MIT tests: AD-X2 worked! The results “demonstrate beyond a reasonable doubt that this material is in fact valuable, and give complete support to the claims of the manufacturer.” NBS was “simply psychologically incapable of giving Battery AD-X2 a fair trial.”

Black and white photo of a man standing next to a row of lead-acid batteries. The National Bureau of Standards’ regular tests of battery additives found that the products did not work as claimed.National Institute of Standards and Technology Digital Collections

But the press release distorted the MIT results.The MIT tests had focused on diluted solutions and slow charging rates, not the normal use conditions for automobiles, and even then AD-X2’s impact was marginal. Once NBS scientists got their hands on the report, they identified the flaws in the testing.

How did the AD-X2 controversy end?

The post office finally got around to holding its mail fraud hearing in the fall of 1952. Ritchie failed to attend in person and didn’t realize his reports would not be read into the record without him, which meant the hearing was decidedly one-sided in favor of NBS. On 27 February 1953, the Post Office Department issued a mail fraud alert. All of Pioneers’ mail would be stopped and returned to sender stamped “fraudulent.” If this charge stuck, Ritchie’s business would crumble.

But something else happened during the fall of 1952: Dwight D. Eisenhower, running on a pro-business platform, was elected U.S. president in a landslide.

Ritchie found a sympathetic ear in Eisenhower’s newly appointed Secretary of Commerce Sinclair Weeks, who acted decisively. The mail fraud alert had been issued on a Friday. Over the weekend, Weeks had a letter hand-delivered to Postmaster General Arthur Summerfield, another Eisenhower appointee. By Monday, the fraud alert had been suspended.

What’s more, Weeks found that Astin was “not sufficiently objective” and lacked a “business point of view,” and so he asked for Astin’s resignation on 24 March 1953. Astin complied. Perhaps Weeks thought this would be a mundane dismissal, just one of the thousands of political appointments that change hands with every new administration. That was not the case.

More than 400 NBS scientists—over 10 percent of the bureau’s technical staff— threatened to resign in protest. The American Academy for the Advancement of Science also backed Astin and NBS. In an editorial published in Science, the AAAS called the battery additive controversy itself “minor.” “The important issue is the fact that the independence of the scientist in his findings has been challenged, that a gross injustice has been done, and that scientific work in the government has been placed in jeopardy,” the editorial stated.

Two black and white portrait photos of men in suits. National Bureau of Standards director Edward Condon [left] resigned in 1951 because investigations into his political beliefs were impeding his ability to represent the bureau. Incoming director Allen V. Astin [right] inherited the AD-X2 controversy, which eventually led to Astin’s dismissal and then his reinstatement after a large-scale protest by NBS researchers and others. National Institute of Standards and Technology Digital Collections

Clearly, AD-X2’s effectiveness was no longer the central issue. The controversy was a stand-in for a larger debate concerning the role of government in supporting small business, the use of science in making policy decisions, and the independence of researchers. Over the previous few years, highly respected scientists, including Edward Condon and J. Robert Oppenheimer, had been repeatedly investigated for their political beliefs. The request for Astin’s resignation was yet another government incursion into scientific freedom.

Weeks, realizing his mistake, temporarily reinstated Astin on 17 April 1953, the day the resignation was supposed to take effect. He also asked the National Academy of Sciences to test AD-X2 in both the lab and the field. By the time the academy’s report came out in October 1953, Weeks had permanently reinstated Astin. The report, unsurprisingly, concluded that NBS was correct: AD-X2 had no merit. Science had won.

NIST makes a movie

On 9 December 2023, NIST released the 20-minute docudrama The AD-X2 Controversy. The film won the Best True Story Narrative and Best of Festival at the 2023 NewsFest Film Festival. I recommend taking the time to watch it.

The AD-X2 Controversy www.youtube.com

Many of the actors are NIST staff and scientists, and they really get into their roles. Much of the dialogue comes verbatim from primary sources, including congressional hearings and contemporary newspaper accounts.

Despite being an in-house production, NIST’s film has a Hollywood connection. The film features brief interviews with actors John and Sean Astin (of Lord of The Rings and Stranger Things fame)—NBS director Astin’s son and grandson.

The AD-X2 controversy is just as relevant today as it was 70 years ago. Scientific research, business interests, and politics remain deeply entangled. If the public is to have faith in science, it must have faith in the integrity of scientists and the scientific method. I have no objection to science being challenged—that’s how science moves forward—but we have to make sure that neither profit nor politics is tipping the scales.

Part of a continuing series looking at historical artifacts that embrace the boundless potential of technology.

An abridged version of this article appears in the August 2024 print issue as “The AD-X2 Affair.”

References


I first heard about AD-X2 after my IEEE Spectrum editor sent me a notice about NIST’s short docudrama The AD-X2 Controversy, which you can, and should, stream online. NIST held a colloquium on 31 July 2018 with John Astin and his brother Alexander (Sandy), where they recalled what it was like to be college students when their father’s reputation was on the line. The agency has also compiled a wonderful list of resources, including many of the primary source government documents.

The AD-X2 controversy played out in the popular media, and I read dozens of articles following the almost daily twists and turns in the case in the New York Times, Washington Post, and Science.

I found Elio Passaglia’s A Unique Institution: The National Bureau of Standards 1950-1969 to be particularly helpful. The AD-X2 controversy is covered in detail in Chapter 2: Testing Can Be Troublesome.

A number of graduate theses have been written about AD-X2. One I consulted was Samuel Lawrence’s 1958 thesis “The Battery AD-X2 Controversy: A Study of Federal Regulation of Deceptive Business Practices.” Lawrence also published the 1962 book The Battery Additive Controversy.


The Sneaky Standard



A version of this post originally appeared on Tedium, Ernie Smith’s newsletter, which hunts for the end of the long tail.

Personal computing has changed a lot in the past four decades, and one of the biggest changes, perhaps the most unheralded, comes down to compatibility. These days, you generally can’t fry a computer by plugging in a joystick that the computer doesn’t support. Simply put, standardization slowly fixed this. One of the best examples of a bedrock standard is the peripheral component interconnect, or PCI, which came about in the early 1990s and appeared in some of the decade’s earliest consumer machines three decades ago this year. To this day, PCI slots are used to connect network cards, sound cards, disc controllers, and other peripherals to computer motherboards via a bus that carries data and control signals. PCI’s lessons gradually shaped other standards, like USB, and ultimately made computers less frustrating. So how did we get it? Through a moment of canny deception.

Commercial - Intel Inside Pentium Processor (1994) www.youtube.com

Embracing standards: the computing industry’s gift to itself

In the 1980s, when you used the likes of an Apple II or a Commodore 64 or an MS-DOS machine, you were essentially locked into an ecosystem. Floppy disks often weren’t compatible. The peripherals didn’t work across platforms. If you wanted to sell hardware in the 1980s, you were stuck building multiple versions of the same device.

For example, the KoalaPad was a common drawing tool sold in the early 1980s for numerous platforms, including the Atari 800, the Apple II, the TRS-80, the Commodore 64, and the IBM PC. It was essentially the same device on every platform, and yet, KoalaPad’s manufacturer, Koala Technologies, had to make five different versions of this device, with five different manufacturing processes, five different connectors, five different software packages, and a lot of overhead. It was wasteful, made being a hardware manufacturer more costly, and added to consumer confusion.

Drawing on a 1983 KoalaPad (Apple IIe) www.youtube.com

This slowly began to change in around 1982, when the market of IBM PC clones started taking off. It was a happy accident—IBM’s decision to use a bunch of off-the-shelf components for its PC accidentally turned them into a de facto standard. Gradually, it became harder for computing platforms to become islands unto themselves. Even when IBM itself tried and failed to sell the computing world on a bunch of proprietary standards in its PS/2 line, it didn’t work. The cat was already out of the bag. It was too late.

So how did we end up with the standards that we have today, and the PCI expansion card standard specifically? PCI wasn’t the only game in town—you could argue, for example, that if things played out differently, we’d all be using NuBus or Micro Channel architecture. But it was a standard seemingly for the long haul, far beyond other competing standards of its era.

Who’s responsible for spearheading this standard? Intel. While PCI was a cross-platform technology, it proved to be an important strategy for the chipmaker to consolidate its power over the PC market at a time when IBM had taken its foot off the gas, choosing to focus on its own PowerPC architecture and narrower plays like the ThinkPad instead, and was no longer shaping the architecture of the PC.

The vision of PCI was simple: an interconnect standard that was not intended to be limited to one line of processors or one bus. But don’t mistake standardization for cooperation. PCI was a chess piece—a part of a different game than the one PC manufacturers were playing.

Close up of a board showing several black raised PCIe interconnects. The PCI standard and its derivatives have endured for over three decades. Modern computers with a GPU often use a PCIe interconnect. Alamy

In the early 1990s, Intel needed a win

In the years before Intel’s Pentium chipset came out in 1993, there seemed to be some skepticism about whether Intel could maintain its status at the forefront of the desktop-computing field.

In lower-end consumer machines, players like Advanced Micro Devices (AMD) and Cyrix were starting to shake their weight around. At the high end of the professional market, workstation-level computing from the likes of Sun Microsystems, Silicon Graphics, and Digital Equipment Corporation suggested there wasn’t room for Intel in the long run. And laterally, the company suddenly found itself competing with a triple threat of IBM, Motorola, and Apple, whose PowerPC chip was about to hit the market.

A Bloomberg piece from the period painted Intel as being boxed in between these various extremes:

If its rivals keep gaining, Intel could eventually lose ground all around.

This is no idle threat. Cyrix Corp. and Chips & Technologies Inc. have re-created—and improved—Intel’s 386 without, they say, violating copyrights or patents. AMD has at least temporarily won the right in court to make 386 clones under a licensing deal that Intel canceled in 1985. In the past 12 months, AMD has won 40% of a market that since 1985 has given Intel $2 billion in profits and a $2.3 billion cash hoard. The 486 may suffer next. Intel has been cutting its prices faster than for any new chip in its history. And in mid-May, it chopped 50% more from one model after Cyrix announced a chip with some similar features. Although the average price of a 486 is still four times that of a 386, analysts say Intel’s profits may grow less than 5% this year, to about $850 million.

Intel’s chips face another challenge, too. Ebbing demand for personal computers has slowed innovation in advanced PCs. This has left a gap at the top—and most profitable—end of the desktop market that Sun, Hewlett-Packard Co., and other makers of powerful workstations are working to fill. Thanks to microprocessors based on a technology known as RISC, or reduced instruction-set computing, workstations have dazzling graphics and more oomph—handy for doing complex tasks and moving data faster over networks. And some are as cheap as high-end PCs. So the workstation makers are now making inroads among such PC buyers as stock traders, banks, and airlines.

This was a deep underestimation of Intel’s market position, it turned out. The company was actually well-positioned to shape the direction of the industry through standardization. They had a direct say on what appeared on the motherboards of millions of computers, and that gave them impressive power to wield. If Intel didn’t want to support a given standard, that standard would likely be dead in the water.

How Intel crushed a standards body on the way to giving us an essential technology

The Video Electronics Standards Association, or VESA, is perhaps best known today for its mounting system for computer monitors and its DisplayPort technology. But in the early 1990s, it was working on a video-focused successor to the Industry Standard Architecture (ISA) internal bus, widely used in IBM PC clones.

A bus, the physical wiring that lets a CPU talk to internal and external peripheral devices, is something of a bedrock of computing—and in the wrong setting, a bottleneck. The ISA expansion card slot, which had become a de facto standard in the 1980s, had given the IBM PC clone market something to build against during its first decade. But by the early 1990s, for high-bandwidth applications, particularly video, it was holding back innovation. It just wasn’t fast enough to keep up, even after it had been upgraded from being able to handle 8 bits of data at once to 16.

That’s where the VESA Local Bus (VL-Bus) came into play. Built to work only with video cards, the standard offered a faster connection, and could handle 32 bits of data. It was targeted at the Super VGA standard, which offered higher resolution (up to 1280 x 1024 pixels) and richer colors at a time when Windows was finally starting to take hold in the market. To overcome the limitations of the ISA bus, graphics card and motherboard manufacturers started collaborating on proprietary interfaces, creating an array of incompatible graphics buses. The lack of a consistent experience around Super VGA led to VESA’s formation. The new VESA slot, which extended the existing 16-bit ISA bus with an additional 32-bit video-specific connector, was an attempt to fix that.

It wasn’t a massive leap—more like a stopgap improvement on the way to better graphics.

And it looked like Intel was going to go for the VL-BUS. But there was one problem—Intel actually wasn’t feeling it, and Intel didn’t exactly make that point clear to the companies supporting the VESA standards body until it was too late for them to react.

Intel revealed its hand in an interesting way, according to The San Francisco Examiner tech reporter Gina Smith:

Until now, virtually everyone expected VESA’s so-called VL-Bus technology to be the standard for building local bus products. But just two weeks before VESA was planning to announce what it came up with, Intel floored the VESA local bus committee by saying it won’t support the technology after all. In a letter sent to VESA local bus committee officials, Intel stated that supporting VESA’s local bus technology “was no longer in Intel’s best interest.” And sources say it went on to suggest that VESA and Intel should work together to minimize the negative press impact that might arise from the decision.

Good luck, Intel. Because now that Intel plans to announce a competing group that includes hardware heavyweights like IBM, Compaq, NCR and DEC, customers and investors (and yes, the press) are going to wonder what in the world is going on.

Not surprisingly, the people who work for VESA are hurt, confused and angry. “It’s a political nightmare. We’re extremely surprised they’re doing this,” said Ron McCabe, chairman for the committee and a product manager at VESA member Tseng Labs. “We’ll still make money and Intel will still make money, but instead of one standard, there will now be two. And it’s the customer who’s going to get hurt in the end.”

But Intel had seen an opportunity to put its imprint on the computing industry. That opportunity came in the form of PCI, a technology that the firm’s Intel Architecture Labs started developing around 1990, two years before the fateful rejection of VESA. Essentially, Intel had been playing both sides on the standards front.

Why PCI

Why make such a hard shift, screwing over a trusted industry standards body out of nowhere? Beyond wanting to put its mark on the standard, Intel also saw an opportunity to build something more future-proof; something that could benefit not just graphic cards but every expansion card in the machine.

As John R. Quinn wrote in PC Magazine in 1992:

Intel’s PCI bus specification requires more work on the part of peripheral chip-makers, but offers several theoretical advantages over the VL-Bus. In the first place, the specification allows up to ten peripherals to work on the PCI bus (including the PCI controller and an optional expansion-bus controller for ISA, EISA, or MCA). It, too, is limited to 33 MHz, but it allows the PCI controller to use a 32-bit or a 64-bit data connection to the CPU.

In addition, the PCI specification allows the CPU to run concurrently with bus-mastering peripherals—a necessary capability for future multimedia tasks. And the Intel approach allows a full burst mode for reads and writes (Intel’s 486 only allows bursts on reads).

Essentially, the PCI architecture is a CPU-to-local bus bridge with FIFO (first in, first out) buffers. Intel calls it an “intermediate” bus because it is designed to uncouple the CPU from the expansion bus while maintaining a 33-MHz 32-bit path to peripheral devices. By taking this approach, the PCI controller makes it possible to queue writes and reads between the CPU and PCI peripherals. In theory, this would enable manufacturers to use a single motherboard design for several generations of CPUs. It also means more sophisticated controller logic is necessary for the PCI interface and peripheral chips.

To put that all another way, VESA came up with a slightly faster bus standard for the next generation of graphics cards, one just fast enough to meet the needs of Intel’s recent i486 microprocessor users. Intel came up with an interface designed to reshape the next decade of computing, one that it would let its competitors use. This bus would allow people to upgrade their processor across generations without needing to upgrade their motherboard. Intel brought a gun to a knife fight, and it made the whole debate about VL-Bus seem insignificant in short order.

The result was that, no matter how miffed the VESA folks were, Intel had consolidated power for itself by creating an open standard that would eventually win the next generation of computers. Sure, Intel let other companies use the PCI standard, even companies like Apple that weren’t directly doing business with Intel on the CPU side. But Intel, by pushing forth PCI, suddenly made itself relevant to the entire next generation of the computing industry in a way that ensured it would have a second foothold in hardware. The “Intel Inside” marketing label was not limited to the processors, as it turned out.

The influence of Intel’s introduction of PCI is still felt: Thirty-two years later, and three decades after PCI became a major consumer standard, we’re still using PCI derivatives in modern computing devices.

PCI and other standards

Looking at PCI, and its successor PCI express, less as ways that we connect the peripherals we use with our computers, and more as a way for Intel to maintain its dominance over the PC industry, highlights something fascinating about standardization.

It turns out that perhaps Intel’s greatest investment in computing in the 1990s was not the Pentium chipset, but its investment in Intel Architecture Labs, which quietly made the entire computing industry better by working on the things that frustrated consumers and manufacturers alike.

Essentially, as IBM had begun to take its eye off the massive clone market it unwittingly built during this period, Intel used standardization to fill the power void. It worked pretty well, and made the company integral to computer hardware beyond the CPU. In fact, devices you use daily—that Intel played zero part in creating—have benefited greatly from the company’s standards work. If you’ve ever used a device with a USB or Bluetooth connection, you can thank Intel for that.

Five offshoots of the original PCI standard that you may be familiar with


Accelerated Graphics Port. Effectively a PCI-first approach to the VL-Bus standard, a slot dedicated especially to graphics, this port was a way to offer access to faster graphics cards at a time when 3D graphics were starting to hit the market in a big way. Its first appearance came not long after the original PCI standard.

PCI-X. Despite the name, Intel was less involved in this standard, which was intended for high-end workstations and server environments. Instead, the standard was developed by IBM, Compaq, and Hewlett-Packard, doubling the bandwidth of the existing PCI standard—and released in the wild not long before HP and Compaq merged in 2002. But the slot standard was effectively a dead end: It did not see wide use with PCs, likely because Intel chose not to give the technology its blessing, but was briefly utilized by the Power Macintosh G5 line of computers.

PCIe. This is the upgrade to PCI that Intel did choose to bless, and it’s the one used by desktop computers today, in part because it was developed to allow for a huge increase in flexibility compared to PCI, in exchange for somewhat more complexity. Key to PCIe’s approach is the use of “lanes” of data transfer speed, allowing high-speed cards like graphics adapters more bandwidth (up to 16 lanes) and slower technologies like network adapters or audio adapters less. This has given PCIe unparalleled backwards compatibility—it’s technically possible to run a modern card on a first-gen PCIe port in exchange for lower speed—while allowing the standard to continue improving. To give you an idea of how far it’s come: A one-lane fifth-generation PCIe slot is roughly as fast as a 16-lane first-generation slot.

Thunderbolt. Thunderbolt can best be thought of as a way to access PCIe lanes through a cable. First used by Apple in 2011, it has become common on laptops of all stripes in recent years. Unlike PCI and PCIe, which are open to all manufacturers, Thunderbolt is closely associated with Intel. This has meant its competitor AMD had traditionally not offered Thunderbolt ports until USB4—a reworked form of the Thunderbolt 3 standard—emerged.

Non-Volatile Memory Express (NVMe). This popular Intel-backed standard, dating to 2011, has completely rewritten the way we think about storage in computers. Once a technology built around mechanical parts, NVMe has allowed for ever-faster solid-state storage communication speeds that take advantage of innovations in the PCIe spec. Modern NVMe drives, which can reach speeds above 6,000 megabytes per second, are roughly 10 times the speed of comparable SATA solid state drives, which top out at 600 MB/s. And, thanks to the corresponding M.2 expansion card standard, they’re far smaller and significantly easier to install.

Craig Kinnie, the director of Intel Architecture Labs in the 1990s, said it best in 1995, upon coming to an agreement with Microsoft on a 3D graphics architecture for the PC platform. “What’s important to us is we move in the same direction,” he said. “We are working on convergent paths now.”

That was about collaborating with Microsoft. But really, it has been Intel’s modus operandi for decades—what’s good for the technology field is good for Intel. Innovations developed or invented by Intel—like Thunderbolt, Ultrabooks, and Next Unit Computers (NUCs)—have done much to shape the way we buy and use computers.

For all the talk of Moore’s Law as a driving factor behind Intel’s success, the true story might be its sheer cat-herding capabilities. The company that builds the standards builds the industry. Even as Intel faces increasing competition from alliterative processing players like ARM, Apple, and AMD, as long as it doesn’t lose sight of the roles standards played in its success, it might just hold on a few years longer.

Ironically, Intel’s standards-driving winning streak, now more than three decades old, might have all started the day it decided to walk out on a standards body.

Design Tool Think Tank Required

When I was in the EDA industry as a technologist, there were three main parts to my role. The first was to tell customers about new technologies being developed and tool extensions that would be appearing in the next release. These were features they might find beneficial both in the projects they were undertaking today, and even more so, would apply to future projects. Second, I would try and find out what new issues they were finding, or where the tools were not delivering the capabilities they required. This would feed into tool development planning. And finally, I would take those features selected by the marketing team for implementation and try to work out how best to implement them if it wasn’t obvious to the development teams.

By far the most difficult task out of the three was getting new requirements from customers. Most engineers have their heads down, concentrating on getting their latest chip out. When you ask them about new features, the only thing they offer are their current pain points. These usually involve incremental features or bugs, where the workaround is disliked, or insufficient performance.

Thirty years ago, when I first started doing that role, there were dedicated methodology groups within the larger companies whose job it was to develop flows and methodologies for future projects. This would appear to be the ideal people to ask, but in many cases they were so disconnected from the development team that what they asked for would never actually be used by the development team. These groups were idealists who wanted to instill revolutionary changes, whereas the development teams wanted evolutionary tools. The furthest many of those developments went was pilot projects that never became mainstream.

It seems as if the industry needs a better path to get requirements into the EDA companies. This used to be defined by the ITRS, which would look forward and project the new capabilities that would be required and the timeframes for them. That no longer exists. Today, standards are being driven by semiconductor companies. This is a change from the past, where we used to see the EDA companies driving the developments done within groups like Accellera. When I look at their recent undertakings, most of them are driven by end users.

Getting a standards group started today happens fairly late in the process. It implies an immediate need, but does not really allow time for solutions to be developed ahead of time. It appears that a think tank is required where the industry can discuss issues and problems for which new tool development is required. That can then be built into the EDA roadmaps so that the technology becomes available when it is needed.

One such area is power analysis. I have been writing stories about how important power and energy is becoming and may indeed soon become the limiter for many of the most complex designs. Some of the questions I always ask are:

  • What tools are being developed for doing power analysis of software?
  • How can you calculate the energy consumed for a given function?
  • How can users optimize a design for power or energy?

I rarely get straight answers to any of these questions. Instead, I’m often given vague ideas about how a user could do this in a manual fashion given the tools currently available.

I was beginning to think I was barking up the wrong tree and perhaps these were not legitimate concerns. My sanity was restored by a comment on one of my recent power related stories. Allan Cantle, OCP HPC Sub-Project Leader at Open Compute Project Foundation, wrote: “While it’s great to see articles like this highlight the need for us all to focus on energy centric computing, the sad news is that our tools don’t report energy in any obvious way to show the stupid architectural mistakes we often make from an energy consumption perspective. We are solving all the problems from a bottoms-up perspective by bringing things closer together. While that does bring tremendous energy efficiency benefits, it also creates massively increasing energy density. There is so much low-hanging fruit from a top-down system architecture approach that the industry is missing because we need to think outside the box and across our silos.”

Cantle went on to say: “A trivial improvement in tools that report energy consumption as a first-class metric will make it far easier for us to understand and rectify the mistakes we make as we build new energy-centric, domain-specific computers for each application. Alternatively, the silicon gods that rule our industry would be wise to take a step backward and think about the problem from a systems level perspective.”

I couldn’t agree more, and I find it frustrating that no EDA company seems to be listening. I am sure part of the problem is that the large customers are working on their own internal solutions, and they feel it will provide them with a competitive advantage. Until it becomes clear that all of their competitors have similar solutions, and that they no longer get an advantage from it, then they will look to transfer those solutions to the EDA companies so they do not have to maintain them. The EDA companies will then start to fight to make the solution they have acquired the standard. It all takes a long time.

In partial defense of the EDA companies, they are facing so many new issues these days that they are spread very thin dealing with new nodes, 2.5D, 3D, shift left, multi-physics, AI algorithms – to name just a few. They already spend more on R&D than most technology companies as a percentage of revenue.

Perhaps Accellera could start to include discussion forums in events like DVCon. This would allow for an open discussion about the problems they need to have solved. Perhaps they could start to produce the EDA equivalent of the old ITRS roadmap. It sure would save a lot of time and energy (pun intended).

The post Design Tool Think Tank Required appeared first on Semiconductor Engineering.

Commercial Chiplet Ecosystem May Be A Decade Away

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

Experts at the Table: Semiconductor Engineering sat down to talk about the challenges of establishing a commercial chiplet ecosystem with Frank Schirrmeister, vice president solutions and business development at Arteris; Mayank Bhatnagar, product marketing director in the Silicon Solutions Group at Cadence; Paul Karazuba, vice president of marketing at Expedera; Stephen Slater, EDA product management/integrating manager at Keysight; Kevin Rinebold, account technology manager for advanced packaging solutions at Siemens EDA; and Mick Posner, vice president of product management for high-performance computing IP solutions at Synopsys. What follows are excerpts of that discussion.

L-R: Arteris’ Schirrmeister, Cadence’s Bhatnagar, Expedera’s Karazuba, Keysight’s Slater, Siemens EDA’s Rinebold, and Synopsys’ Posner.

SE: There’s a lot of buzz and activity around every aspect of chiplets today. What is your impression of where the commercial chiplet ecosystem stands today?

Schirrmeister: There’s a lot of interest today in an open chiplet ecosystem, but we are probably still quite a bit away from true openness. The proprietary versions of chiplets are alive and kicking out there. We see them in designs. We vendors are all supporting those to make it reality, like the UCIe proponents, but it will take some time to get to a fully open ecosystem. It’s probably at least three to five years before we get to a PCI Express type exchange environment.

Bhatnagar: The commercial chiplet ecosystem is at a very early stage. Many companies are providing chiplets, are designing them, and they’re shipping products — but they’re still single-vendor products, where the same company is designing all the pieces. I hope that with the advancements the UCIe standard is making, and with more standardization, we eventually can get to a marketplace-like environment for chiplets. We are not there.

Karazuba: The commercialization of homogeneous chiplets is pretty well understood by groups like AMD. But for the commercialization of heterogeneous chiplets, which is chiplets from multiple suppliers, there are still a lot of questions out there about that.

Slater: We participate in a lot of the board discussions, and attend industry events like TSMC’s OIP, and there’s a lot of excitement out there at the moment. I see a lot of even midsize and small customers starting to think about their development plans for what chiplet should be. I do think those that are going to be successful first will be those that are within a singular foundry ecosystem like TSMC’s. Today if you’re selecting your IP, you’ve got a variety of ways to be able to pick and choose which IP, see what’s been taped out before, how successful it’s been so you have a way to manage your risk and your costs as you’re putting things together. What we’ll see in the future will be that now you have a choice. Are you purchasing IP, or are you purchasing chiplets? Crucially, it’s all coming from the same foundry and put together in the same manner. The technical considerations of things like UCIe standard packaging versus advanced packaging, and the analysis tool sets for high-speed simulation, as well as for things like thermal, are going to just become that much more important.

Rinebold: I’ve been doing this about 30 years, so I can date back to some of the very earliest days of multi-chip modules and such. When we talk about the ecosystem, there are plenty of examples out there today where we see HBM and logic getting combined at the interposer level. This works if you believe HBM is a chiplet, and that’s a whole other argument. Some would argue that HBM falls into that category. The idea of a true LEGO, snap-together mix and match of chiplets continues to be aspirational for the mainstream market, but there are some business impediments that need to get addressed. Again, there are exceptions in some of the single-vendor solutions, where it’s more or less homogeneous integration, or an entirely vertically integrated type of environment where single vendors are integrating their own chiplets into some pretty impressive packages.

Posner: Aspirational is the word we use for an open ecosystem. I’m going to be a little bit more of a downer by saying I believe it’s 5 to 10 years out. Is it possible? Absolutely. But the biggest issue we see at the moment is a huge knowledge gap in what that really means. And as technology companies become more educated on really what that means, we’ll find that there will be some acceleration in adoption. But within what we call ‘captive’ — within a single company or a micro-ecosystem — we’re seeing multi-die systems pick up.

SE: Is it possible to define the pieces we have today from a technology point of view, to make a commercial chiplet ecosystem a reality?

Rinebold: What’s encouraging is the development of standards. There’s some adoption. We’ve already mentioned UCIe for some of the die-to-die protocols. Organizations like JEDEC announced the extension of their JEP30 PartModel format into the chiplet ecosystem to incorporate chiplet-style data. Think about this as an electronic data sheet. A lot of this work has been incorporated into the CDX working group under Open Compute. That’s encouraging. There were some comments a little bit earlier about having an open marketplace. I would agree we’re probably 3 to 10 years away from that coming to fruition. The underlying framework and infrastructure is there, but a lot of the licensing and distribution issues have to get resolved before you see any type of broad adoption.

Posner: The infrastructure is available. The EDA tools to create, to package, to analyze, to simulate, to manufacture — those tools are all there. The intellectual property that sits around it, either UCIe or some of the more traditional die-to-die interfaces, all of that’s there. What’s not established are full methodology and flows that lead to interoperability. Everything within captive is possible, but a broader ecosystem, a marketplace, is going to require silicon interoperability, simulation, packaging, all of that. That’s the area that we believe is missing — and still building.

Schirrmeister: Do we know what’s required? We probably can define that reasonably well. If the vision is an open ecosystem with IP on chiplets that you can just plug together like LEGO blocks, then the IP industry informs us of what’s required, and then there are some gaps on top of them. I hear people from the hard-coded IP world talking about the equivalent of PDKs for chiplets, but today’s IP ecosystem and the IP deliverables are informing us it doesn’t work like LEGO blocks yet. We are improving every year. But this whole, ‘I take my whiteboard and then everything just magically functions together’ is not what we have today. We need to think really hard about what the additional challenges are when you disaggregate that into chiplets and protocols. Then you get big systemic issues to deal with, like how do you deal with coherency across chiplets? It was challenging enough to get it done on a chip. Now you potentially have to deal with other partnerships you don’t even own. This is not a captive environment in an open ecosystem. That makes it very challenging, and it creates job security for at least 5 to 10 years.

Bhatnagar: On the technical side, what’s going well is adoption. We can see big companies like Intel, and then of course, IP providers like us and Synopsys. Everybody’s working toward standardizing chiplet integration, and that is working very well. EDA tools are also coming up to support that. But we are still very far from a marketplace because there are many issues that are not sorted out, like licensing and a few other things that need a bit more time.

Slater: The standards bodies and networking groups have excited a lot of people, and we’re getting a broad set of customers that are coming along. And one point I was thinking, is this only for very high-end compute? From the companies that I see presenting in those types of forums, it’s even companies working in automotive or aerospace/defense, planning out their future for the next 10 years or more. In the automotive case, it was a company that was thinking about creating chiplets for internal consumption — so maybe reorganizing how they look at creating many different variations or evolutions of their products, trying to do it as more modular chiplet types of blocks. ‘If we take the microprocessor part of it, would we sell that as a chiplet externally for other customers to integrate together into a bigger design?’ For me, the aha moment was seeing how broad the application would be. I do think that the standards work has been moving very fast, and that’s worked really well. For instance, at Keysight EDA, we just released a chiplet PHY designer. It’s a simulation for the high-speed digital link for UCIe, and that only comes about by having a standard that’s published, so an EDA company can take a look at it and realize what they need to do with it. The EDA tools are ready to handle these kinds of things. And maybe then, on to the last point is, in order to share the IP, in order to ensure that it’s available, database and process management is going to become all the more important. You need to keep track of which chip is made on which process, and be able to make it available inside the company to other potential users of that.

SE: What’s in place today from a business perspective, and what still needs to be worked out?

Karazuba: From a business perspective, speaking strictly of heterogeneous chiplets, I don’t think there’s anything really in place. Let me qualify that by asking, ‘Who is responsible for warranty? Who is responsible for testing? Who is responsible for faults? Who is responsible for supply chain?’ With homogeneous chiplets or monolithic silicon, that’s understood because that’s the way that this industry has been doing business since its inception. But when you talk about chiplets that are coming from multiple suppliers, with multiple IPs — and perhaps different interfaces, made in multiple fabs, then constructed by a third party, put together by a third party, tested by a fourth party, and then shipped — what happens when something goes wrong? Who do you point the finger at? Who do you go to and talk to? If a particular chiplet isn’t functioning as intended, it’s not necessarily that chiplet that’s bad. It may be the interface on another chiplet, or on a hub, whatever it might be. We’re going to get there, but right now that’s not understood. It’s not understood who is going to be responsible for things such as that. Is it the multi-chip module manufacturer, or is it the person buying it? I fear a return to the Wintel issue, where the chipmaker points to the OS maker, which points at the hardware maker, which points at the chipmaker. Understanding of the commercial side is is a real barrier to chiplets being adopted. Granted, the technical is much more difficult than the commercial, but I have no doubt the engineers will get there quicker than the business people.

Rinebold: I completely agree. What are the repercussions, warranty-related issues, things like that? I’d also go one step further. If you look at some of the larger silicon foundries right now, there is some concern about taking third-party wafers into their facilities to integrate in some type of heterogeneous, chiplet-type package. There are a lot of business and logistical issues that have to get addressed first. The technical stuff will happen quickly. It’s just a lot of these licensing- and distribution-type issues that need to get resolved. The other thing I want to back up to involves customers in the defense/industrial space. The trust and traceability and the province tracking of IP is going to be key for them, because they have so much expectation of multi-die or chiplet-type packaging as an alternative to monolithic scaling. Just look at all the government programs out there right now, with RESHAPE [Reshore Ecosystem for Secure Heterogeneous Advanced Packaging Electronics] and NGMM [Next-Generation Microelectronics Manufacturing] and such. They’re all in on this chiplet perspective, but they’re going to require a lot of security measures to understand who has touched the IP, where it comes from, how to you verify that.

Posner: Micro-ecosystems are forming because of all these challenges. If you naively think you can just go pick a die off the shelf and put it into your device, how do you warranty that? Who owns it? These micro-ecosystems are building up to fundamentally sort that out. So within a couple of different companies, be it automotive or high-performance compute, they’ll come to terms that are acceptable across all of them. And it’s these micro-ecosystems that are really going to end up driving open chiplets, and I think it’s going to be an organic type of growth. Chiplets are available for a specific application today, but we have this vision that someone else could use it, and we see that with the multiple modes being built into the dies. One mode is, ‘I’m connecting to myself. It’s a very tight, low-latency link.’ But I have this vision in the future that I’m going to need to have an interface or protocol that is more open and uses standard available stacks, and which can be bought off the shelf and integrated. That’s one side of the logistics. I want to mention two more things. It is possible to do interoperability across nodes. We demonstrated our TSMC N3 UCIe with Intel’s in-house UCIe, all put together on an Intel process. This was two separate companies working together, showing the first physical interoperability, so it’s possible. But going forward that’s still just a small part of the overall effort. In the IP space we’ve lived with an IP model of, ‘Build once, sell many.’ With the chiplet marketplace, unless there is a revenue stream from that chiplet, it will break that model. Companies think, ‘I only have to buy the IP once, and then I’m selling my silicon.’ But the infrastructure, the resources that are required to build all of this does not go away. There has to be money at the end of that tunnel for all of these different companies to be investing.

Schirrmeister: Mick is 100% right, but we may have a definition issue here with what we really mean by an ‘open’ chiplet ecosystem. I have two distinct conversations when I talk to partners and customers. On the one hand, you have board designers who are doing more and more integration, and they look at you with a wrinkled forehead and say, ‘We’ve been doing this for years. What are you talking about?’ It may not have been 3D-IC in the classic sense of all items, but they say, ‘Yeah, there are issues with warranties, and the user figures it out.’ The board people arrive from one side of the equation at chiplets because that’s the next evolution of integration. You need to be very efficient. That’s not what we call an open ecosystem of chiplets here. The idea is that you have this marketplace to mix things up, and you have the economies of scale by selling the same chiplet to multiple people. That’s really what the chip designers are thinking about, and some of them think even further because if you do it all in true 3D-IC fashion, then you actually have to co-design those chiplets in a way, and that’s a whole other dimension that needs to be sorted out. To pick a little bit on the big companies that have board and chip design groups in house, you see this even within the messaging of these companies. You have people who come from the board side, and for them it’s not a solved problem. It always has been challenging, but they’re going to take it to the next level. The chip guys are looking at this from a perspective of one interface, like PCI Express, now being UCIe. And then I think about this because the networks on chip need to become super NoCs across chiplets, which poses its own challenges. And that all needs to work together. But those are really chiplets designed for the purpose of being in a chiplet ecosystem. And to that end, Mick’s estimation of longer than five years is probably correct because those purpose-built chiplets, for the purpose of being in an open ecosystem, have all these challenges the board guys have already been dealing with for quite some time. They’re now ‘just getting smaller’ in the amount of integration they do.

Slater: When you put all these chiplets together and start to do that integration, in what order do you start placing the components down? You don’t want to throw away one very expensive chiplet because there was an issue with one of the smaller cheaper ones. So, there are now a lot of thoughts about how to go about doing almost like unit tests on individual chiplets first, but then you want to do some form of system test as you go along. That’s definitely something we need to think about. On the business front,  who is going to be most interested in purchasing a chiplet-style solution. It comes down to whether you have a yield problem. If your chips are getting to the size where you have yield concerns, then definitely it makes sense to think about using chiplets and breaking it up into smaller pieces. Not everything scales, so why move to the lowest process node when you could purchase something at a different process node that has less risk and costs less to manufacture, and then put it all together. The ones that have been successful so far — the big companies like Intel, AMD — were already driven to that edge. The chips got to a size that couldn’t fit on the reticle. We think about how many companies fit into that category, and that will factor into whether or not the cost and risk is worth it for them.

Bhatnagar: From a business perspective, what is really important is the standardization. Inside of the chiplets is fine, but how it impacts other chiplets around it is important. We would like to be able to make something and sell many copies of it. But if there is no standardization, then either we are taking a gamble by going for one thing and assuming everybody moves to it, or we make multiple versions of the same thing and that adds extra costs. To really justify a business case for any chiplet or, or any sort of IP with the chiplet, the standardization is key for the electrical interconnect, packaging, and all other aspects of a system.

Fig. 1:  A chiplet design. Source: Cadence. 

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Proprietary Vs. Commercial Chiplets
Who wins, who loses, and where are the big challenges for multi-vendor heterogeneous integration.

The post Commercial Chiplet Ecosystem May Be A Decade Away appeared first on Semiconductor Engineering.

Accellera Preps New Standard For Clock-Domain Crossing

Part of the hierarchical development flow is about to get a lot simpler, thanks to a new standard being created by Accellera. What is less clear is how long will it take before users see any benefit.

At the register transfer level (RTL), when a data signal passes between two flip flops, it initially is assumed that clocks are perfect. After clock-tree synthesis and place-and-route are performed, there can be considerable timing skew between the clock edges arriving those adjacent flops. That makes timing sign-off difficult, but at least the clocks are still synchronous.

But if the clocks come from different sources, are at different frequencies, or a design boundary exists between the flip flops — which would happen with the integration of IP blocks — it’s impossible to guarantee that no clock edges will arrive when the data is unstable. That can cause the output to become unknown for a period of time. This phenomenon, known as metastability, cannot be eliminated, and the verification of those boundaries is known as clock-domain crossing (CDC) analysis.

Special care is required on those boundaries. “You have to compensate for metastability by ensuring that the CDC crossings follow a specific set of logic design principles,” says Prakash Narain, president and CEO of Real Intent. “The general process in use today follows a hierarchical approach and requires that the clock-domain crossing internal to an IP is protected and safe. At the interface of the IP, where the system connects with the IP, two different teams share the problem. An IP provider may recommend an integration methodology, which often is captured in an abstraction model. That abstraction model enables the integration boundary to be verified while the internals of it will not be checked for CDC. That has already been verified.”

In the past, those abstract models differentiated the CDC solutions from veracious vendors. That’s no longer the case. Every IP and tool vendor has different formats, making it costly for everyone. “I don’t know that there’s really anything new or differentiating coming down the pipe for hierarchical modeling,” says Kevin Campbell, technical product manager at Siemens Digital Industries Software. “The creation of the standard will basically deliver much faster results with no loss of quality. I don’t know how much more you can differentiate in that space other than just with performance increases.”

While this has been a problem for the whole industry for quite some time, Intel decided it was time for a solution. The company pushed Accellera to take up the issue, and helped facilitate the creation of the standard by chairing the committee. “I’m going to describe three methods of building a product,” says Iredamola “Dammy” Olopade, chair of the Accellera working group, and a principal engineer at Intel. “Method number one is where you build everything in a monolithic fashion. You own every line of code, you know the architecture, you use the tool of your choice. That is a thing of the past. The second method uses some IP. It leverages reuse and enables the quick turnaround of new SoCs. There used to be a time when all IPs came from the same source, and those were integrating into a product. You could agree upon the tools. We are quickly moving to a world where I need to source IPs wherever I can get them. They don’t use the same tools as I do. In that world, common standards are critical to integrating quickly.”

In some cases, there is a hierarchy of IP. “Clock-domain crossings are a central part of our business,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “A network-on-chip (NoC) can be considered as ‘CDC central’ because most blocks connected to the NoC have different clocks. Also, our SoC integration tools see all of the blocks to be integrated, and those touch various clock domains and therefore need to deal with the CDC code that is inserted.”

This whole thing can become very messy. “While every solution supports hierarchical modeling, every tool has its own model solution and its own model representation,” says Siemens’ Campbell. “Vendors, or users, are stuck with a CDC solution, because the models were created within a certain solution. There’s no real transportability between any of the hierarchical modeling solutions unless they want to go regenerate models for another solution.”

That creates a lot of extra work. “Today, when dealing with customer CDC issues, we have to consider the customer’s specific environment, and for CDC, a potential mix of in-house flows and commercial tools from various vendors,” says Arteris’ Schirrmeister. “The compatibility matrix becomes very complex, very fast. If adopted, the new Accellera CDC standard bears the potential to make it easier for IP vendors, like us, to ensure compatibility and reduce the effort required to validate IP across multiple customer toolsets. The intent, as specified in the requirements is that ‘every IP provider can run its tool of choice to verify and produce collateral and generate the standard format for SoCs that use a different tool.'”

Everyone benefits. “IP providers will not need to provide extra documentation of clock domains for the SoC integrator to use in their CDC analysis,” says Ahmed Nasr, digital design manager at Mixel. “The standard CDC attributes generated by the EDA tool will be self-contained.”

The use model is relatively simple. “An IP developer signs off on CDC and then exports the abstract model,” says Real Intent’s Narain. “It is likely they will write this out in both the Accellera format and the native format to provide backward compatibility. At the next level of hierarchy, you read in the abstract model instead of reading in the full view of the design. They have various views of the IP, including the CDC view of the IP, which today is on the basis of whatever tool they use for CDC sign-off.”

The potential is significant. “If done right and adopted, the industry may arrive at a common language to describe CDC aspects that can streamline the validation process across various tools and environments used by different users,” says Schirrmeister. “As a result, companies will be able to integrate and validate IP more efficiently than before, accelerating development cycles and reducing the complexity associated with SoC integration.”

The standard
Intel’s Olopade describes the approach that was taken during the creation of the standard. “You take the most complex situations you are likely to find, you box them, and you co-design them in order to reduce the risk of bugs,” he said. “The boundaries you create are supposed to be simple boundaries. We took that concept, and we brought it into our definition to say the following: ‘We will look at all kinds of crossings, we will figure out the simple common uses, and we will cover that first.’ That is expected to cover 95% to 98% of the community. We are not trying to handle 700 different exceptions. It is common. It is simple. It is what guarantees production quality, not just from a CDC standpoint, but just from a divide-and-conquer standpoint.”

That was the starting point. “Then we added elements to our design document that says, ‘This is how we will evaluate complexity, and this is how we’ll determine what we cover first,'” he says. “We broke things down into three steps. Step one is clock-domain crossing. Everyone suffers from this problem. Step two is reset-domain crossing (RDC). As low power is getting into more designs, there are a lot more reset domains, and there is risk between these reset domains. Some companies care, but many companies don’t because they are not in a power-aware environment. It became a secondary consideration. Beyond the basic CDC in phase one, and RDC in phase two, all other interesting, small usage complexities will be handled in phase three as extensions to the standard. We are not going to get bogged down supporting everything under the sun.”

Within the standards group there are two sub-groups — a mapping team and a format team. Common standards, such as AMBA, UCIe, and PCIe have been looked at to make sure that these are fully covered by the standard. That means that the concepts should be useful for future markets.

“The concepts contained in the standard are extensible to hardened chiplets,” says Mixel’s Nasr. “By providing an accurate standard CDC view for the chiplet, it will enable integration with other chiplets.”

Some of those issues have yet to be fully explored. “The standard’s current documentation primarily focuses on clock-domain crossing within an SoC itself,” says Schirrmeister. “Its direct applicability to the area of chiplets would depend on further developments. The interfaces between fully hardened IP blocks on chiplets would communicate through standard interfaces like UCIe, BoW, or XSR, so the synchronization issues between chiplets on substrates would appear to be elevated to the protocol levels.”

Reset-domain crossings have yet to appear in the standard. “The genesis of CDC is asynchronous clocks,” says Narain. “But the genesis for reset-domain crossing is asynchronous resets. While the destination is due to the clock, the source of the problem is somewhere else. And as a result, the nature of the problem, the methodology that people use to manage that problem, are very different. The kind of information that you need to retain, and the kind of information that you can throw away, is different for every problem. Hence, abstractions are actually very customized for the application.”

Does the standard cover enough ground? That is part of the purpose of the review period that was used to collect information. “I can see some room for future improvement — for example, making some attributes mandatory like logic, associated_clocks, clock_period for clock ports,” says Nasr. “Another proposed improvement is adding reconvergence information, to be able to detect reconverging outputs of parallel synchronizers.”

The impact of all of this, if realized, is enormous. “If you truly run a collaborative, inclusive, development cycle, two things will happen,” says Olopade. “One, you are going to be able to find multiple ways to solve each problem. You need to understand the pros and cons against the real problems you are trying to solve and agree on the best way we should do it together. For each of those, we record the options, the pros and cons, and the reason one was selected. In a public review, those that couldn’t be part of that discussion get to weigh in. We weigh it against what they are suggesting versus why did we choose it. In the cases where it is part of what we addressed, and we justified it, we just respond, and we do not make a change. If you’re truly inclusive, you do allow that feedback to cause you to change your mind. We received feedback on about three items that we had debated, where the feedback challenged the decisions and got us to rehash things.”

The big challenge
Still, the creation of a standard is just the first step. Unless a standard is fully adopted, its value becomes diminished. “It’s a commendable objective and a worthy endeavor,” says Schirrmeister. “It will make interoperability easier and eventually allow us, and the whole industry, to reduce the compatibility matrix we maintain to deal with vendor tools individually. It all will depend on adoption by the vendors, though.”

It is off to a good start. “As with any standard, good intentions sometimes get severed by reality,” says Campbell. “There has been significant collaboration and agreements on how the standard is being pushed forward. We did not see self-submarining, or some parties playing nice just to see what’s going on but not really supporting it. This does seem like good collaboration and good decision making across the board.”

Implementation is another hurdle. “Will it actually provide the benefit that it is supposed to provide?” asks Narain. “That will depend upon how completely and how quickly EDA tool vendors provide support for the standard. From our perception, the engineering challenge for implementing this is not that large. When this is standardized, we will provide support for it as soon as we can.”

Even then, adoption isn’t a slam dunk. “There are short- and long-term problems,” warns Campbell. “IP vendors already have to support multiple formats, but now you have to add Accellera on top of that. There’s going to be some pain both for the IP vendors and for EDA vendors. We are going to have to be backward-compatible and some programs go on for decades. There’s a chance that some of these models will be around for a very long time. That’s the short-term pain. But the biggest hurdle to overcome for a third-party IP vendor, and EDA vendor, is quality assurance. The whole point of a hierarchical development methodology is faster CDC closure with no loss in quality. The QA load here is going to be big, because no customer is going to want to take the risk if they’ve got a solution that is already working well.”

Some of those issues and fears are expected to be addressed at the upcoming DVCon conference. “We will be providing a tutorial on CDC,” says Olopade. “The first 30 minutes covers the basics of CDC for those who haven’t been doing this for the last 10 years. The next hour will talk about the Accellera solution. It will concentrate on those topics which were hotly debated, and we need to help people understand, or carry people along with what we recommend. Then it may become more acceptable and more adoptive.”

Related Reading
Design And Verification Methodologies Breaking Down
As chips become more complex, existing tools and methodologies are stretched to the breaking point.

The post Accellera Preps New Standard For Clock-Domain Crossing appeared first on Semiconductor Engineering.

AI Is Being Built on Dated, Flawed Motion-Capture Data



Diversity of thought in industrial design is crucial: If no one thinks to design a technology for multiple body types, people can get hurt. The invention of seat belts is an oft-cited example of this phenomenon, as they were designed based on crash dummies that had traditionally male proportions, reflecting the bodies of the team members working on them.

The same phenomenon is now at work in the field of motion-capture technology. Throughout history, scientists have endeavored to understand how the human body moves. But how do we define the human body? Decades ago many studies assessed “healthy male” subjects; others used surprising models like dismembered cadavers. Even now, some modern studies used in the design of fall-detection technology rely on methods like hiring stunt actors who pretend to fall.

Over time, a variety of flawed assumptions have become codified into standards for motion-capture data that’s being used to design some AI-based technologies. These flaws mean that AI-based applications may not be as safe for people who don’t fit a preconceived “typical” body type, according to new work recently published as a preprint and set to be presented at the Conference on Human Factors in Computing Systems in May.

“We dug into these so-called gold standards being used for all kinds of studies and designs, and many of them had errors or were focused on a very particular type of body,” says Abigail Jacobs, coauthor of the study and an assistant professor at the University of Michigan’s School of Information and the Center for the Study of Complex Systems. “We want engineers to be aware of how these social aspects become coded into the technical—hidden in mathematical models that seem objective or infrastructural.”

It’s an important moment for AI-based systems, Jacobs says, as we may still have time to catch and avoid potentially dangerous assumptions from being codified into applications informed by AI.

Motion-capture systems create representations of bodies by collecting data from sensors placed on the subjects, logging how these bodies move through space. These schematics become part of the tools that researchers use, such as open-source libraries of movement data and measurement systems that are meant to provide baseline standards for how human bodies move. Developers are increasingly using these baselines to build all manner of AI-based applications: fall-detection algorithms for smartwatches and other wearables, self-driving vehicles that need to detect pedestrians, computer-generated imagery for movies and video games, manufacturing equipment that interacts safely with human workers, and more.

“Many researchers don’t have access to advanced motion-capture labs to collect data, so we’re increasingly relying on benchmarks and standards to build new tech,” Jacobs says. “But when these benchmarks don’t include representations of all bodies, especially those people who are likely to be involved in real-world use cases—like elderly people who may fall—these standards can be quite flawed.”

She hopes we can learn from past mistakes, such as cameras that didn’t accurately capture all skin tones and seat belts and airbags that didn’t protect people of all shapes and sizes in car crashes.

The Cadaver in the Machine

Jacobs and her collaborators from Cornell University, Intel, and the University of Virginia performed a systematic literature review of 278 motion-capture-related studies. In most cases, they concluded, motion-capture systems captured the motion of “those who are male, white, ‘able-bodied,’ and of unremarkable weight.”

And sometimes these white male bodies were dead. In reviewing works dating back to the 1930s and running through three historical eras of motion-capture science, the researchers studied projects that were influential in how scientists of the time understood the movement of body segments. A seminal 1955 study funded by the Air Force, for example, used overwhelmingly white, male, and slender or athletic bodies to create the optimal cockpit based on pilots’ range of motion. That study also gathered data from eight dismembered cadavers.

A full 20 years later, a study prepared for the National Highway Traffic Safety Administration used similar methods: Six dismembered male cadavers were used to inform the design of impact-protection systems in vehicles.

In most of the 278 studies reviewed, motion-capture systems captured the motion of “those who are male, white, ‘able-bodied,’ and of unremarkable weight.”

Although those studies are many decades old, these assumptions became baked in over time. Jacobs and her colleagues found many examples of these outdated inferences being passed down to later studies and ultimately still influencing modern motion-capture studies.

“If you look at technical documents of a modern system in production, they’ll explain the ‘traditional baseline standards’ they’re using,” Jacobs says. “By digging through that, you quickly start hopping through time: OK, that’s based on this prior study, which is based on this one, which is based on this one, and eventually we’re back to the Air Force study designing cockpits with frozen cadavers.”

The components that underpin technological best practices are “man-made—intentional emphasis on man, rather than human—often preserving biases and inaccuracies from the past,” says Kasia Chmielinski, project lead of the Data Nutrition Project and a fellow at Stanford University’s Digital Civil Society Lab. “Thus historical errors often inform the ‘neutral’ basis of our present-day technological systems. This can lead to software and hardware that does not work equally for all populations, experiences, or purposes.”

These problems may hinder engineers who want to make things right, Chmielinski says. “Since many of these issues are baked into the foundational elements of the system, teams innovating today may not have quick recourse to address bias or error, even if they want to,” they say. “If you’re building an application that uses third-party sensors, and the sensors themselves have a bias in what they detect or do not detect, what is the appropriate recourse?”

Jacobs says that engineers must interrogate their sources of “ground truth” and confirm that the gold standards they measure against are, in fact, gold. Technicians must consider these social evaluations to be part of their jobs in order to design technologies for all.

“If you go in saying, ‘I know that human assumptions get built in and are often hidden or obscured,’ that will inform how you choose what’s in your dataset and how you report it in your work,” Jacobs says. “It’s sociotechnical, and technologists need that lens to be able to say: My system does what I say it does, and it doesn’t create undue harm.”

❌