FreshRSS

Normální zobrazení

Jsou dostupné nové články, klikněte pro obnovení stránky.
PředevčíremHlavní kanál
  • ✇IEEE Spectrum
  • How Large Language Models Are Changing My JobGlenn Zorpette
    Generative artificial intelligence, and large language models in particular, are starting to change how countless technical and creative professionals do their jobs. Programmers, for example, are getting code segments by prompting large language models. And graphic arts software packages such as Adobe Illustrator already have tools built in that let designers conjure illustrations, images, or patterns by describing them.But such conveniences barely hint at the massive, sweeping changes to employ
     

How Large Language Models Are Changing My Job

6. Červen 2024 v 15:59


Generative artificial intelligence, and large language models in particular, are starting to change how countless technical and creative professionals do their jobs. Programmers, for example, are getting code segments by prompting large language models. And graphic arts software packages such as Adobe Illustrator already have tools built in that let designers conjure illustrations, images, or patterns by describing them.

But such conveniences barely hint at the massive, sweeping changes to employment predicted by some analysts. And already, in ways large and small, striking and subtle, the tech world’s notables are grappling with changes, both real and envisioned, wrought by the onset of generative AI. To get a better idea of how some of them view the future of generative AI, IEEE Spectrum asked three luminaries—an academic leader, a regulator, and a semiconductor industry executive—about how generative AI has begun affecting their work. The three, Andrea Goldsmith, Juraj Čorba, and Samuel Naffziger, agreed to speak with Spectrum at the 2024 IEEE VIC Summit & Honors Ceremony Gala, held in May in Boston.

Click to read more thoughts from:

  1. Andrea Goldsmith, dean of engineering at Princeton University.
  2. Juraj Čorba, senior expert on digital regulation and governance, Slovak Ministry of Investments, Regional Development
  3. Samuel Naffziger, senior vice president and a corporate fellow at Advanced Micro Devices

Andrea Goldsmith

Andrea Goldsmith is dean of engineering at Princeton University.

There must be tremendous pressure now to throw a lot of resources into large language models. How do you deal with that pressure? How do you navigate this transition to this new phase of AI?

A woman with brown shoulder length hair smiles for a portrait in a teal jacket in an outside scene Andrea J. Goldsmith

Andrea Goldsmith: Universities generally are going to be very challenged, especially universities that don’t have the resources of a place like Princeton or MIT or Stanford or the other Ivy League schools. In order to do research on large language models, you need brilliant people, which all universities have. But you also need compute power and you need data. And the compute power is expensive, and the data generally sits in these large companies, not within universities.

So I think universities need to be more creative. We at Princeton have invested a lot of money in the computational resources for our researchers to be able to do—well, not large language models, because you can’t afford it. To do a large language model… look at OpenAI or Google or Meta. They’re spending hundreds of millions of dollars on compute power, if not more. Universities can’t do that.

But we can be more nimble and creative. What can we do with language models, maybe not large language models but with smaller language models, to advance the state of the art in different domains? Maybe it’s vertical domains of using, for example, large language models for better prognosis of disease, or for prediction of cellular channel changes, or in materials science to decide what’s the best path to pursue a particular new material that you want to innovate on. So universities need to figure out how to take the resources that we have to innovate using AI technology.

We also need to think about new models. And the government can also play a role here. The [U.S.] government has this new initiative, NAIRR, or National Artificial Intelligence Research Resource, where they’re going to put up compute power and data and experts for educators to use—researchers and educators.

That could be a game-changer because it’s not just each university investing their own resources or faculty having to write grants, which are never going to pay for the compute power they need. It’s the government pulling together resources and making them available to academic researchers. So it’s an exciting time, where we need to think differently about research—meaning universities need to think differently. Companies need to think differently about how to bring in academic researchers, how to open up their compute resources and their data for us to innovate on.

As a dean, you are in a unique position to see which technical areas are really hot, attracting a lot of funding and attention. But how much ability do you have to steer a department and its researchers into specific areas? Of course, I’m thinking about large language models and generative AI. Is deciding on a new area of emphasis or a new initiative a collaborative process?

Goldsmith: Absolutely. I think any academic leader who thinks that their role is to steer their faculty in a particular direction does not have the right perspective on leadership. I describe academic leadership as really about the success of the faculty and students that you’re leading. And when I did my strategic planning for Princeton Engineering in the fall of 2020, everything was shut down. It was the middle of COVID, but I’m an optimist. So I said, “Okay, this isn’t how I expected to start as dean of engineering at Princeton.” But the opportunity to lead engineering in a great liberal arts university that has aspirations to increase the impact of engineering hasn’t changed. So I met with every single faculty member in the School of Engineering, all 150 of them, one-on-one over Zoom.

And the question I asked was, “What do you aspire to? What should we collectively aspire to?” And I took those 150 responses, and I asked all the leaders and the departments and the centers and the institutes, because there already were some initiatives in robotics and bioengineering and in smart cities. And I said, “I want all of you to come up with your own strategic plans. What do you aspire to in these areas? And then let’s get together and create a strategic plan for the School of Engineering.” So that’s what we did. And everything that we’ve accomplished in the last four years that I’ve been dean came out of those discussions, and what it was the faculty and the faculty leaders in the school aspired to.

So we launched a bioengineering institute last summer. We just launched Princeton Robotics. We’ve launched some things that weren’t in the strategic plan that bubbled up. We launched a center on blockchain technology and its societal implications. We have a quantum initiative. We have an AI initiative using this powerful tool of AI for engineering innovation, not just around large language models, but it’s a tool—how do we use it to advance innovation and engineering? All of these things came from the faculty because, to be a successful academic leader, you have to realize that everything comes from the faculty and the students. You have to harness their enthusiasm, their aspirations, their vision to create a collective vision.

Juraj Čorba

Juraj Čorba is senior expert on digital regulation and governance, Slovak Ministry of Investments, Regional Development, and Information, and Chair of the Working Party on Governance of AI at the Organization for Economic Cooperation and Development.

What are the most important organizations and governing bodies when it comes to policy and governance on artificial intelligence in Europe?

Portrait of a clean-shaven man with brown hair wearing a blue button down shirt. Juraj Čorba

Juraj Čorba: Well, there are many. And it also creates a bit of a confusion around the globe—who are the actors in Europe? So it’s always good to clarify. First of all we have the European Union, which is a supranational organization composed of many member states, including my own Slovakia. And it was the European Union that proposed adoption of a horizontal legislation for AI in 2021. It was the initiative of the European Commission, the E.U. institution, which has a legislative initiative in the E.U. And the E.U. AI Act is now finally being adopted. It was already adopted by the European Parliament.

So this started, you said 2021. That’s before ChatGPT and the whole large language model phenomenon really took hold.

Čorba: That was the case. Well, the expert community already knew that something was being cooked in the labs. But, yes, the whole agenda of large models, including large language models, came up only later on, after 2021. So the European Union tried to reflect that. Basically, the initial proposal to regulate AI was based on a blueprint of so-called product safety, which somehow presupposes a certain intended purpose. In other words, the checks and assessments of products are based more or less on the logic of the mass production of the 20th century, on an industrial scale, right? Like when you have products that you can somehow define easily and all of them have a clearly intended purpose. Whereas with these large models, a new paradigm was arguably opened, where they have a general purpose.

So the whole proposal was then rewritten in negotiations between the Council of Ministers, which is one of the legislative bodies, and the European Parliament. And so what we have today is a combination of this old product-safety approach and some novel aspects of regulation specifically designed for what we call general-purpose artificial intelligence systems or models. So that’s the E.U.

By product safety, you mean, if AI-based software is controlling a machine, you need to have physical safety.

Čorba: Exactly. That’s one of the aspects. So that touches upon the tangible products such as vehicles, toys, medical devices, robotic arms, et cetera. So yes. But from the very beginning, the proposal contained a regulation of what the European Commission called stand-alone systems—in other words, software systems that do not necessarily command physical objects. So it was already there from the very beginning, but all of it was based on the assumption that all software has its easily identifiable intended purpose—which is not the case for general-purpose AI.

Also, large language models and generative AI in general brings in this whole other dimension, of propaganda, false information, deepfakes, and so on, which is different from traditional notions of safety in real-time software.

Čorba: Well, this is exactly the aspect that is handled by another European organization, different from the E.U., and that is the Council of Europe. It’s an international organization established after the Second World War for the protection of human rights, for protection of the rule of law, and protection of democracy. So that’s where the Europeans, but also many other states and countries, started to negotiate a first international treaty on AI. For example, the United States have participated in the negotiations, and also Canada, Japan, Australia, and many other countries. And then these particular aspects, which are related to the protection of integrity of elections, rule-of-law principles, protection of fundamental rights or human rights under international law—all these aspects have been dealt with in the context of these negotiations on the first international treaty, which is to be now adopted by the Committee of Ministers of the Council of Europe on the 16th and 17th of May. So, pretty soon. And then the first international treaty on AI will be submitted for ratifications.

So prompted largely by the activity in large language models, AI regulation and governance now is a hot topic in the United States, in Europe, and in Asia. But of the three regions, I get the sense that Europe is proceeding most aggressively on this topic of regulating and governing artificial intelligence. Do you agree that Europe is taking a more proactive stance in general than the United States and Asia?

Čorba: I’m not so sure. If you look at the Chinese approach and the way they regulate what we call generative AI, it would appear to me that they also take it very seriously. They take a different approach from the regulatory point of view. But it seems to me that, for instance, China is taking a very focused and careful approach. For the United States, I wouldn’t say that the United States is not taking a careful approach because last year you saw many of the executive orders, or even this year, some of the executive orders issued by President Biden. Of course, this was not a legislative measure, this was a presidential order. But it seems to me that the United States is also trying to address the issue very actively. The United States has also initiated the first resolution of the General Assembly at the U.N. on AI, which was passed just recently. So I wouldn’t say that the E.U. is more aggressive in comparison with Asia or North America, but maybe I would say that the E.U. is the most comprehensive. It looks horizontally across different agendas and it uses binding legislation as a tool, which is not always the case around the world. Many countries simply feel that it’s too early to legislate in a binding way, so they opt for soft measures or guidance, collaboration with private companies, et cetera. Those are the differences that I see.

Do you think you perceive a difference in focus among the three regions? Are there certain aspects that are being more aggressively pursued in the United States than in Europe or vice versa?

Čorba: Certainly the E.U. is very focused on the protection of human rights, the full catalog of human rights, but also, of course, on safety and human health. These are the core goals or values to be protected under the E.U. legislation. As for the United States and for China, I would say that the primary focus in those countries—but this is only my personal impression—is on national and economic security.

Samuel Naffziger

Samuel Naffziger is senior vice president and a corporate fellow at Advanced Micro Devices, where he is responsible for technology strategy and product architectures. Naffziger was instrumental in AMD’s embrace and development of chiplets, which are semiconductor dies that are packaged together into high-performance modules.

To what extent is large language model training starting to influence what you and your colleagues do at AMD?

Portrait of a brown haired man in a dark blue shirt. Samuel Naffziger

Samuel Naffziger: Well, there are a couple levels of that. LLMs are impacting the way a lot of us live and work. And we certainly are deploying that very broadly internally for productivity enhancements, for using LLMs to provide starting points for code—simple verbal requests, such as “Give me a Python script to parse this dataset.” And you get a really nice starting point for that code. Saves a ton of time. Writing verification test benches, helping with the physical design layout optimizations. So there’s a lot of productivity aspects.

The other aspect to LLMs is, of course, we are actively involved in designing GPUs [graphics processing units] for LLM training and for LLM inference. And so that’s driving a tremendous amount of workload analysis on the requirements, hardware requirements, and hardware-software codesign, to explore.

So that brings us to your current flagship, the Instinct MI300X, which is actually billed as an AI accelerator. How did the particular demands influence that design? I don’t know when that design started, but the ChatGPT era started about two years ago or so. To what extent did you read the writing on the wall?

Naffziger: So we were just into the MI300—in 2019, we were starting the development. A long time ago. And at that time, our revenue stream from the Zen [an AMD architecture used in a family of processors] renaissance had really just started coming in. So the company was starting to get healthier, but we didn’t have a lot of extra revenue to spend on R&D at the time. So we had to be very prudent with our resources. And we had strategic engagements with the [U.S.] Department of Energy for supercomputer deployments. That was the genesis for our MI line—we were developing it for the supercomputing market. Now, there was a recognition that munching through FP64 COBOL code, or Fortran, isn’t the future, right? [laughs] This machine-learning [ML] thing is really getting some legs.

So we put some of the lower-precision math formats in, like Brain Floating Point 16 at the time, that were going to be important for inference. And the DOE knew that machine learning was going to be an important dimension of supercomputers, not just legacy code. So that’s the way, but we were focused on HPC [high-performance computing]. We had the foresight to understand that ML had real potential. Although certainly no one predicted, I think, the explosion we’ve seen today.

So that’s how it came about. And, just another piece of it: We leveraged our modular chiplet expertise to architect the 300 to support a number of variants from the same silicon components. So the variant targeted to the supercomputer market had CPUs integrated in as chiplets, directly on the silicon module. And then it had six of the GPU chiplets we call XCDs around them. So we had three CPU chiplets and six GPU chiplets. And that provided an amazingly efficient, highly integrated, CPU-plus-GPU design we call MI300A. It’s very compelling for the El Capitan supercomputer that’s being brought up as we speak.

But we also recognize that for the maximum computation for these AI workloads, the CPUs weren’t that beneficial. We wanted more GPUs. For these workloads, it’s all about the math and matrix multiplies. So we were able to just swap out those three CPU chiplets for a couple more XCD GPUs. And so we got eight XCDs in the module, and that’s what we call the MI300X. So we kind of got lucky having the right product at the right time, but there was also a lot of skill involved in that we saw the writing on the wall for where these workloads were going and we provisioned the design to support it.

Earlier you mentioned 3D chiplets. What do you feel is the next natural step in that evolution?

Naffziger: AI has created this bottomless thirst for more compute [power]. And so we are always going to be wanting to cram as many transistors as possible into a module. And the reason that’s beneficial is, these systems deliver AI performance at scale with thousands, tens of thousands, or more, compute devices. They all have to be tightly connected together, with very high bandwidths, and all of that bandwidth requires power, requires very expensive infrastructure. So if a certain level of performance is required—a certain number of petaflops, or exaflops—the strongest lever on the cost and the power consumption is the number of GPUs required to achieve a zettaflop, for instance. And if the GPU is a lot more capable, then all of that system infrastructure collapses down—if you only need half as many GPUs, everything else goes down by half. So there’s a strong economic motivation to achieve very high levels of integration and performance at the device level. And the only way to do that is with chiplets and with 3D stacking. So we’ve already embarked down that path. A lot of tough engineering problems to solve to get there, but that’s going to continue.

And so what’s going to happen? Well, obviously we can add layers, right? We can pack more in. The thermal challenges that come along with that are going to be fun engineering problems that our industry is good at solving.

  • ✇Semiconductor Engineering
  • SRAM Security Concerns GrowKaren Heyman
    SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down. This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it
     

SRAM Security Concerns Grow

9. Květen 2024 v 09:08

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with network-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felton, director of Princeton University’s Center for Information Technology Policy, J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan, and colleagues. The breakthrough in their research was based on the growing realization in the engineering research community that data does not vanish from memory the moment a device is turned off, which until then was a common assumption. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”

Unfortunately, despite nearly 20 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. And because cold boot takes advantage of slowing down memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST), described a way to undo the slowdown with an ultra-fast data sanitization method that worked within 5 ns, using back bias to control the device parameters of CMOS.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, have inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”

These precautions also apply when working with DRAM. In all situations, Seshardi said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as second layer of protection against known and novel attacks, so an organization’s assets will still be protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the systems case or hardware preventing an attacker’s access,” said Jim Montgomery, market development director, semiconductor at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that regardless of trying to dump the memory, or physically removing the memory, the encryption keys will remain secure.”

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria based upon SEMI E187 and E188 standards to assist DM’s and OEM’s to implement secure procedures for systems security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem, in the post-quantum era, is that encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption [HE], which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM.  However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, HE solutions that are designed for cryptographers can be impractical. A few HE solutions require algorithmic and cryptographic expertise that inhibit adoption by those who lack these skills.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted widespread use as soon as the next few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM. The reason is that it’s far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, decisions about how much security to include in a design is an economic one. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM”, Technical report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. Han, SJ., Han, JK., Yun, GJ. et al. Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack. Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • SRAM Security Concerns GrowKaren Heyman
    SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down. This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it
     

SRAM Security Concerns Grow

9. Květen 2024 v 09:08

SRAM security concerns are intensifying as a combination of new and existing techniques allow hackers to tap into data for longer periods of time after a device is powered down.

This is particularly alarming as the leading edge of design shifts from planar SoCs to heterogeneous systems in package, such as those used in AI or edge processing, where chiplets frequently have their own memory hierarchy. Until now, most cybersecurity concerns involving volatile memory have focused on DRAM, because it is often external and easier to attack. SRAM, in contrast, does not contain a component as obviously vulnerable as a heat-sensitive capacitor, and in the past it has been harder to pinpoint. But as SoCs are disaggregated and more features are added into devices, SRAM is becoming a much bigger security concern.

The attack scheme is well understood. Known as cold boot, it was first identified in 2008, and is essentially a variant of a side-channel attack. In a cold boot approach, an attacker dumps data from internal SRAM to an external device, and then restarts the system from the external device with some code modification. “Cold boot is primarily targeted at SRAM, with the two primary defenses being isolation and in-memory encryption,” said Vijay Seshadri, distinguished engineer at Cycuity.

Compared with network-based attacks, such as DRAM’s rowhammer, cold boot is relatively simple. It relies on physical proximity and a can of compressed air.

The vulnerability was first described by Edward Felton, director of Princeton University’s Center for Information Technology Policy, J. Alex Halderman, currently director of the Center for Computer Security & Society at the University of Michigan, and colleagues. The breakthrough in their research was based on the growing realization in the engineering research community that data does not vanish from memory the moment a device is turned off, which until then was a common assumption. Instead, data in both DRAM and SRAM has a brief “remanence.”[1]

Using a cold boot approach, data can be retrieved, especially if an attacker sprays the chip with compressed air, cooling it enough to slow the degradation of the data. As the researchers described their approach, “We obtained surface temperatures of approximately −50°C with a simple cooling technique — discharging inverted cans of ‘canned air’ duster spray directly onto the chips. At these temperatures, we typically found that fewer than 1% of bits decayed even after 10 minutes without power.”

Unfortunately, despite nearly 20 years of security research since the publication of the Halderman paper, the authors’ warning still holds true. “Though we discuss several strategies for mitigating these risks, we know of no simple remedy that would eliminate them.”

However unrealistic, there is one simple and obvious remedy to cold boot — never leave a device unattended. But given human behavior, it’s safer to assume that every device is vulnerable, from smart watches to servers, as well as automotive chips used for increasingly autonomous driving.

While the original research exclusively examined DRAM, within the last six years cold boot has proven to be one of the most serious vulnerabilities for SRAM. In 2018, researchers at Germany’s Technische Universität Darmstadt published a paper describing a cold boot attack method that is highly resistant to memory erasure techniques, and which can be used to manipulate the cryptographic keys produced by the SRAM physical unclonable function (PUF).

As with so many security issues, it’s been a cat-and-mouse game between remedies and counter-attacks. And because cold boot takes advantage of slowing down memory degradation, in 2022 Yang-Kyu Choi and colleagues at the Korea Advanced Institute of Science and Technology (KAIST), described a way to undo the slowdown with an ultra-fast data sanitization method that worked within 5 ns, using back bias to control the device parameters of CMOS.

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Fig. 1: Asymmetric forward back-biasing scheme for permanent erasing. (a) All the data are reset to 1. (b) All the data are reset to 0. Whether all the data where reset to 1 or 0 is determined by the asymmetric forward back-biasing scheme. Source: KAIST/Creative Commons [2]

Their paper, as well as others, have inspired new approaches to combating cold boot attacks.

“To mitigate the risk of unauthorized access from unknown devices, main devices, or servers, check the authenticated code and unique identity of each accessing device,” said Jongsin Yun, memory technologist at Siemens EDA. “SRAM PUF is one of the ways to securely identify each device. SRAM is made of two inverters cross-coupled to each other. Although each inverter is designed to be the same device, normally one part of the inverter has a somewhat stronger NMOS than the other due to inherent random dopant fluctuation. During the initial power-on process, SRAM data will be either data 1 or 0, depending on which side has a stronger device. In other words, the initial data state of the SRAM array at the power on is decided by this unique random process variation and most of the bits maintain this property for life. One can use this unique pattern as a fingerprint of a device. The SRAM PUF data is reconstructed with other coded data to form a cryptographic key. SRAM PUF is a great way to anchor its secure data into hardware. Hackers may use a DFT circuit to access the memory. To avoid insecurely reading the SRAM information through DFT, the security-critical design makes DFT force delete the data as an initial process of TEST mode.”

However, there can be instances where data may be required to be kept in a non-volatile memory (NVM). “Data is considered insecure if the NVM is located outside of the device,” said Yun. “Therefore, secured data needs to be stored within the device with write protection. One-time programmable (OTP) memory or fuses are good storage options to prevent malicious attackers from tampering with the modified information. OTP memory and fuses are used to store cryptographic keys, authentication information, and other critical settings for operation within the device. It is useful for anti-rollback, which prevents hackers from exploiting old vulnerabilities that have been fixed in newer versions.”

Chiplet vulnerabilities
Chiplets also could present another vector for attack, due to their complexity and interconnections. “A chiplet has memory, so it’s going to be attacked,” said Cycuity’s Seshadri. “Chiplets, in general, are going to exacerbate the problem, rather than keeping it status quo, because you’re going to have one chiplet talking to another. Could an attack on one chiplet have a side effect on another? There need to be standards to address this. In fact, they’re coming into play already. A chiplet provider has to say, ‘Here’s what I’ve done for security. Here’s what needs to be done when interfacing with another chiplet.”

Yun notes there is a further physical vulnerability for those working with chiplets and SiPs. “When multiple chiplets are connected to form a SiP, we have to trust data coming from an external chip, which creates further complications. Verification of the chiplet’s authenticity becomes very important for SiPs, as there is a risk of malicious counterfeit chiplets being connected to the package for hacking purposes. Detection of such counterfeit chiplets is imperative.”

These precautions also apply when working with DRAM. In all situations, Seshardi said, thinking about security has to go beyond device-level protection. “The onus of protecting DRAM is not just on the DRAM designer or the memory designer,” he said. “It has to be secured by design principles when you are developing. In addition, you have to look at this holistically and do it at a system level. You must consider all the other things that communicate with DRAM or that are placed near DRAM. You must look at a holistic solution, all the way from software down to things like the memory controller and then finally, the DRAM itself.”

Encryption as a backup
Data itself always must be encrypted as second layer of protection against known and novel attacks, so an organization’s assets will still be protected even if someone breaks in via cold boot or another method.

“The first and primary method of preventing a cold boot attack is limiting physical access to the systems, or physically modifying the systems case or hardware preventing an attacker’s access,” said Jim Montgomery, market development director, semiconductor at TXOne Networks. “The most effective programmatic defense against an attack is to ensure encryption of memory using either a hardware- or software-based approach. Utilizing memory encryption will ensure that regardless of trying to dump the memory, or physically removing the memory, the encryption keys will remain secure.”

Montgomery also points out that TXOne is working with the Semiconductor Manufacturing Cybersecurity Consortium (SMCC) to develop common criteria based upon SEMI E187 and E188 standards to assist DM’s and OEM’s to implement secure procedures for systems security and integrity, including controlling the physical environment.

What kind and how much encryption will depend on use cases, said Jun Kawaguchi, global marketing executive for Winbond. “Encryption strength for a traffic signal controller is going to be different from encryption for nuclear plants or medical devices, critical applications where you need much higher levels,” he said. “There are different strengths and costs to it.”

Another problem, in the post-quantum era, is that encryption itself may be vulnerable. To defend against those possibilities, researchers are developing post-quantum encryption schemes. One way to stay a step ahead is homomorphic encryption [HE], which will find a role in data sharing, since computations can be performed on encrypted data without first having to decrypt it.

Homomorphic encryption could be in widespread use as soon as the next few years, according to Ronen Levy, senior manager for IBM’s Cloud Security & Privacy Technologies Department, and Omri Soceanu, AI Security Group manager at IBM.  However, there are still challenges to be overcome.

“There are three main inhibitors for widespread adoption of homomorphic encryption — performance, consumability, and standardization,” according to Levy. “The main inhibitor, by far, is performance. Homomorphic encryption comes with some latency and storage overheads. FHE hardware acceleration will be critical to solving these issues, as well as algorithmic and cryptographic solutions, but without the necessary expertise it can be quite challenging.”

An additional issue is that most consumers of HE technology, such as data scientists and application developers, do not possess deep cryptographic skills, HE solutions that are designed for cryptographers can be impractical. A few HE solutions require algorithmic and cryptographic expertise that inhibit adoption by those who lack these skills.

Finally, there is a lack of standardization. “Homomorphic encryption is in the process of being standardized,” said Soceanu. “But until it is fully standardized, large organizations may be hesitant to adopt a cryptographic solution that has not been approved by standardization bodies.”

Once these issues are resolved, they predicted widespread use as soon as the next few years. “Performance is already practical for a variety of use cases, and as hardware solutions for homomorphic encryption become a reality, more use cases would become practical,” said Levy. “Consumability is addressed by creating more solutions, making it easier and hopefully as frictionless as possible to move analytics to homomorphic encryption. Additionally, standardization efforts are already in progress.”

A new attack and an old problem
Unfortunately, security never will be as simple as making users more aware of their surroundings. Otherwise, cold boot could be completely eliminated as a threat. Instead, it’s essential to keep up with conference talks and the published literature, as graduate students keep probing SRAM for vulnerabilities, hopefully one step ahead of genuine attackers.

For example, SRAM-related cold boot attacks originally targeted discrete SRAM. The reason is that it’s far more complicated to attack on-chip SRAM, which is isolated from external probing and has minimal intrinsic capacitance. However, in 2022, Jubayer Mahmod, then a graduate student at Virginia Tech and his advisor, associate professor Matthew Hicks, demonstrated what they dubbed “Volt Boot,” a new method that could penetrate on-chip SRAM. According to their paper, “Volt Boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low-temperature or intrinsic data retention time…Unlike other forms of SRAM data retention attacks, Volt Boot retrieves data with 100% accuracy — without any complex post-processing.”

Conclusion
While scientists and engineers continue to identify vulnerabilities and develop security solutions, decisions about how much security to include in a design is an economic one. Cost vs. risk is a complex formula that depends on the end application, the impact of a breach, and the likelihood that an attack will occur.

“It’s like insurance,” said Kawaguchi. “Security engineers and people like us who are trying to promote security solutions get frustrated because, similar to insurance pitches, people respond with skepticism. ‘Why would I need it? That problem has never happened before.’ Engineers have a hard time convincing their managers to spend that extra dollar on the costs because of this ‘it-never-happened-before’ attitude. In the end, there are compromises. Yet ultimately, it’s going to cost manufacturers a lot of money when suddenly there’s a deluge of demands to fix this situation right away.”

References

  1. S. Skorobogatov, “Low temperature data remanence in static RAM”, Technical report UCAM-CL-TR-536, University of Cambridge Computer Laboratory, June 2002.
  2. Han, SJ., Han, JK., Yun, GJ. et al. Ultra-fast data sanitization of SRAM by back-biasing to resist a cold boot attack. Sci Rep 12, 35 (2022). https://doi.org/10.1038/s41598-021-03994-2

The post SRAM Security Concerns Grow appeared first on Semiconductor Engineering.

❌
❌