FreshRSS

Normální zobrazení

Jsou dostupné nové články, klikněte pro obnovení stránky.
PředevčíremHlavní kanál
  • ✇Semiconductor Engineering
  • Ensure Reliability In Automotive ICs By Reducing Thermal EffectsLee Wang
    In the relentless pursuit of performance and miniaturization, the semiconductor industry has increasingly turned to 3D integrated circuits (3D-ICs) as a cutting-edge solution. Stacking dies in a 3D assembly offers numerous benefits, including enhanced performance, reduced power consumption, and more efficient use of space. However, this advanced technology also introduces significant thermal dissipation challenges that can impact the electrical behavior, reliability, performance, and lifespan of
     

Ensure Reliability In Automotive ICs By Reducing Thermal Effects

Od: Lee Wang
5. Srpen 2024 v 09:01

In the relentless pursuit of performance and miniaturization, the semiconductor industry has increasingly turned to 3D integrated circuits (3D-ICs) as a cutting-edge solution. Stacking dies in a 3D assembly offers numerous benefits, including enhanced performance, reduced power consumption, and more efficient use of space. However, this advanced technology also introduces significant thermal dissipation challenges that can impact the electrical behavior, reliability, performance, and lifespan of the chips (figure 1). For automotive applications, where safety and reliability are paramount, managing these thermal effects is of utmost importance.

Fig. 1: Illustration of a 3D-IC with heat dissipation.

3D-ICs have become particularly attractive for safety-critical devices like automotive sensors. Advanced driver-assistance systems (ADAS) and autonomous vehicles (AVs) rely on these compact, high-performance chips to process vast amounts of sensor data in real time. Effective thermal management in these devices is a top priority to ensure that they function reliably under various operating conditions.

The thermal challenges of 3D-ICs in automotive applications

The stacked configuration of 3D-ICs inherently leads to complex thermal dynamics. In traditional 2D designs, heat dissipation occurs across a single plane, making it relatively straightforward to manage. However, in 3D-ICs, multiple active layers generate heat, creating significant thermal gradients and hotspots. These thermal issues can adversely affect device performance and reliability, which is particularly critical in automotive applications where components must operate reliably under extreme temperatures and harsh conditions.

These thermal effects in automotive 3D-ICs can impact the electrical behavior of the circuits, causing timing errors, increased leakage currents, and potential device failure. Therefore, accurate and comprehensive thermal analysis throughout the design flow is essential to ensure the reliability and performance of automotive ICs.

The importance of early and continuous thermal analysis

Traditionally, thermal analysis has been performed at the package and system levels, often as a separate process from IC design. However, with the advent of 3D-ICs, this approach is no longer sufficient.

To address the thermal challenges of 3D-ICs for automotive applications, it is crucial to incorporate die-level thermal analysis early in the design process and continue it throughout the design flow (figure 2). Early-stage thermal analysis can help identify potential hotspots and thermal bottlenecks before they become critical issues, enabling designers to make informed decisions about chiplet placement, power distribution, and cooling strategies. These early decisions reduce the risks of thermal-induced failures, improving the reliability of 3D automotive ICs.

Fig. 2: Die-level detailed thermal analysis using accurate package and boundary conditions should be fully integrated into the ASIC design flow to allow for fast thermal exploration.

Early package design, floorplanning and thermal feasibility analysis

During the initial package design and floorplanning stage, designers can use high-level power estimates and simplified models to perform thermal feasibility studies. These early analyses help identify configurations that are likely to cause thermal problems, allowing designers to rule out problematic designs before investing significant time and resources in detailed implementation.

Fig. 3: Thermal analysis as part of the package design, floorplanning and implementation flows.

For example, thermal analysis can reveal issues such as overlapping heat sources in stacked dies or insufficient cooling paths. By identifying these problems early, designers can explore alternative floorplans and adjust power distribution to mitigate thermal risks. This proactive approach reduces the likelihood of encountering critical thermal issues late in the design process, thereby shortening the overall design cycle.

Iterative thermal analysis throughout design refinement

As the design progresses and more detailed information becomes available, thermal analysis should be performed iteratively to refine the thermal model and verify that the design remains within acceptable thermal limits. At each stage of design refinement, additional details such as power maps, layout geometries and their material properties can be incorporated into the thermal model to improve accuracy.

This iterative approach lets designers continuously monitor and address thermal issues, ensuring that the design evolves in a thermally aware manner. By integrating thermal analysis with other design verification tasks, such as timing and power analysis, designers can achieve a holistic view of the design’s performance and reliability.

A robust thermal analysis tool should support various stages of the design process, providing value from initial concept to final signoff:

  1. Early design planning: At the conceptual stage, designers can apply high-level power estimates to explore the thermal impact of different design options. This includes decisions related to 3D partitioning, die assembly, block and TSV floorplan, interface layer design, and package selection. By identifying potential thermal issues early, designers can make informed decisions that avoid costly redesigns later.
  2. Detailed design and implementation: As designs become more detailed, thermal analysis should be used to verify that the design stays within its thermal budget. This involves analyzing the maturing package and die layout representations to account for their impact on thermally sensitive electrical circuits. Fine-grained power maps are crucial at this stage to capture hotspot effects accurately.
  3. Design signoff: Before finalizing the design, it is essential to perform comprehensive thermal verification. This ensures that the design meets all thermal constraints and reliability requirements. Automated constraints checking and detailed reporting can expedite this process, providing designers with clear insights into any remaining thermal issues.
  4. Connection to package-system analysis: Models from IC-level thermal analysis can be used in thermal analysis of the package and system. The integration lets designers build a streamlined flow through the entire development process of a 3D electronic product.

Tools and techniques for accurate thermal analysis

To effectively manage thermal challenges in automotive ICs, designers need advanced tools and techniques that can provide accurate and fast thermal analysis throughout the design flow. Modern thermal analysis tools are equipped with capabilities to handle the complexity of 3D-IC designs, from early feasibility studies to final signoff.

High-fidelity thermal models

Accurate thermal analysis requires high-fidelity thermal models that capture the intricate details of the 3D-IC assembly. These models should account for non-uniform material properties, fine-grained power distributions, and the thermal impact of through-silicon vias (TSVs) and other 3D features. Advanced tools can generate detailed thermal models based on the actual design geometries, providing a realistic representation of heat flow and temperature distribution.

For instance, tools like Calibre 3DThermal embeds an optimized custom 3D solver from Simcenter Flotherm to perform precise thermal analysis down to the nanometer scale. By leveraging detailed layer information and accurate boundary conditions, these tools can produce reliable thermal models that reflect the true thermal behavior of the design.

Automation and results viewing

Automation is a key feature of modern thermal analysis tools, enabling designers to perform complex analyses without requiring deep expertise in thermal engineering. An effective thermal analysis tool must offer advanced automation to facilitate use by non-experts. Key automation features include:

  1. Optimized gridding: Automatically applying finer grids in critical areas of the model to ensure high resolution where needed, while using coarser grids elsewhere for efficiency.
  2. Time step automation: In transient analysis, smaller time steps can be automatically generated during power transitions to capture key impacts accurately.
  3. Equivalent thermal properties: Automatically reducing model complexity while maintaining accuracy by applying different bin sizes for critical (hotspot) vs non-critical regions when generating equivalent thermal properties.
  4. Power map compression: Using adaptive bin sizes to compress very large power maps to improve tool performance.
  1. Automated reporting: Generating summary reports that highlight key results for easy review and decision-making (figure 4).

Fig. 4: Ways to view thermal analysis results.

Automated thermal analysis tools can also integrate seamlessly with other design verification and implementation tools, providing a unified environment for managing thermal, electrical, and mechanical constraints. This integration ensures that thermal considerations are consistently addressed throughout the design flow, from initial feasibility analysis to final tape-out and even connecting with package-level analysis tools.

Real-world application

The practical benefits of integrated thermal analysis solutions are evident in real-world applications. For instance, a leading research organization, CEA, utilized an advanced thermal analysis tool from Siemens EDA to study the thermal performance of their 3DNoC demonstrator. The high-fidelity thermal model they developed showed a worst-case difference of just 3.75% and an average difference within 2% between simulation and measured data, demonstrating the accuracy and reliability of the tool (figure 5).

Fig. 5: Correlation of simulation versus measured results.

The path forward for automotive 3D-IC thermal management

As the automotive industry continues to embrace advanced technologies, the importance of accurate thermal analysis throughout the design flow of 3D-ICs cannot be overstated. By incorporating thermal analysis early in the design process and iteratively refining thermal models, designers can mitigate thermal risks, reduce design time, and enhance chip reliability.

Advanced thermal analysis tools that integrate seamlessly with the broader design environment are essential for achieving these goals. These tools enable designers to perform high-fidelity thermal analysis, automate complex tasks, and ensure that thermal considerations are addressed consistently from package design, through implementation to signoff.

By embracing these practices, designers can unlock the full potential of 3D-IC technology, delivering innovative, high-performance devices that meet the demands of today’s increasingly complex automotive applications.

For more information about die-level 3D-IC thermal analysis, read Conquer 3DIC thermal impacts with Calibre 3DThermal.

The post Ensure Reliability In Automotive ICs By Reducing Thermal Effects appeared first on Semiconductor Engineering.

A Practical Approach To Inline Memory Encryption And Confidential Computing For Enhanced Data Security

1. Srpen 2024 v 09:12

In today’s technology-driven landscape in which reducing TCO is top of mind, robust data protection is not merely an option but a necessity. As data, both personal and business-specific, is continuously exchanged, stored, and moved across various platforms and devices, the demand for a secure means of data aggregation and trust enhancement is escalating. Traditional data protection strategies of protecting data at rest and data in motion need to be complemented with protection of data in use. This is where the role of inline memory encryption (IME) becomes critical. It acts as a shield for data in use, thus underpinning confidential computing and ensuring data remains encrypted even when in use. This blog post will guide you through a practical approach to inline memory encryption and confidential computing for enhanced data security.

Understanding inline memory encryption and confidential computing

Inline memory encryption offers a smart answer to data security concerns, encrypting data for storage and decoding only during computation. This is the core concept of confidential computing. As modern applications on personal devices increasingly leverage cloud systems and services, data privacy and security become critical. Confidential computing and zero trust are suggested as solutions, providing data in use protection through hardware-based trusted execution environments. This strategy reduces trust reliance within any compute environment and shrinks hackers’ attack surface.

The security model of confidential computing demands data in use protection, in addition to traditional data at rest and data in motion protection. This usually pertains to data stored in off-chip memory such as DDR memory, regardless of it being volatile or non-volatile memory. Memory encryption suggests data should be encrypted before storage into memory, in either inline or look aside form. The performance demands of modern-day memories for encryption to align with the memory path have given rise to the term inline memory encryption.

There are multiple off-chip memory technologies, and modern NVMS and DDR memory performance requirements make inline encryption the most sensible solution. The XTS algorithm using AES or SM4 ciphers and the GCM algorithm are the commonly used cryptographic algorithms for memory encryption. While the XTS algorithm encrypts data solely for confidentiality, the GCM algorithm provides data encryption and data authentication, but requires extra memory space for metadata storage.

Inline memory encryption is utilized in many systems. When considering inline cipher performance for DDR, the memory performance required for different technologies is considered. For instance, LPDDR5 typically necessitates a data path bandwidth of 25 gigabytes per second. An AES operation involves 10 to 14 rounds of encryption rounds, implying that these 14 rounds must operate at the memory’s required bandwidth. This is achievable with correct pipelining in the crypto engine. Other considerations include minimizing read path latency, support for narrow bursts, memory specific features such as the number of outstanding transactions, data-hazard protection between READ and WRITE paths, and so on. Furthermore, side channel attack protections and data path integrity, a critical factor for robustness in advanced technology nodes, are additional concerns to be taken care without prohibitive PPA overhead.

Ensuring data security in AI and computational storage

The importance of data security is not limited to traditional computing areas but also extends to AI inference and training, which heavily rely on user data. Given the privacy issues and regulatory demands tied to user data, it’s essential to guarantee the encryption of this data, preventing unauthorized access. This necessitates the application of a trusted execution environment and data encryption whenever it’s transported outside this environment. With the advent of new algorithms that call for data sharing and model refinement among multiple parties, it’s crucial to maintain data privacy by implementing appropriate encryption algorithms.

Equally important is the rapidly evolving field of computational storage. The advent of new applications and increasing demands are pushing the boundaries of conventional storage architecture. Solutions such as flexible and composable storage, software-defined storage, and memory duplication and compression algorithms are being introduced to tackle these challenges. Yet, these solutions introduce security vulnerabilities as storage devices operate on raw disk data. To counter this, storage accelerators must be equipped with encryption and decryption capabilities and should manage operations at the storage nodes.

As our computing landscape continues to evolve, we need to address the escalating demand for robust data protection. Inline memory encryption emerges as a key solution, offering data in use protection for confidential computing, securing both personal and business data.

Rambus Inline Memory Encryption IP provide scalable, versatile, and high-performance inline memory encryption solutions that cater to a wide range of application requirements. The ICE-IP-338 is a FIPS certified inline cipher engine supporting AES and SM4, as well as XTS GCM modes of operation. Building on the ICE-IP-338 IP, the ICE-IP-339 provides essential AXI4 operation support, simplifying system integration for XTS operation, delivering confidentiality protection. The IME-IP-340 IP extends basic AXI4 support to narrow data access granularity, as well as AES GCM operations, delivering confidentiality and authentication. Finally, the most recent offering, the Rambus IME-IP-341 guarantees memory encryption with AES-XTS, while supporting the Arm v9 architecture specifications.

For more information, check out my recent IME webinar now available to watch on-demand.

The post A Practical Approach To Inline Memory Encryption And Confidential Computing For Enhanced Data Security appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Keeping Up With New ADAS And IVI SoC TrendsHezi Saar
    In the automotive industry, AI-enabled automotive devices and systems are dramatically transforming the way SoCs are designed, making high-quality and reliable die-to-die and chip-to-chip connectivity non-negotiable. This article explains how interface IP for die-to-die connectivity, display, and storage can support new developments in automotive SoCs for the most advanced innovations such as centralized zonal architecture and integrated ADAS and IVI applications. AI-integrated ADAS SoCs The aut
     

Keeping Up With New ADAS And IVI SoC Trends

Od: Hezi Saar
1. Srpen 2024 v 09:10

In the automotive industry, AI-enabled automotive devices and systems are dramatically transforming the way SoCs are designed, making high-quality and reliable die-to-die and chip-to-chip connectivity non-negotiable. This article explains how interface IP for die-to-die connectivity, display, and storage can support new developments in automotive SoCs for the most advanced innovations such as centralized zonal architecture and integrated ADAS and IVI applications.

AI-integrated ADAS SoCs

The automotive industry is adopting a new electronic/electric (EE) architecture where a centralized compute module executes multiple applications such as ADAS and in-vehicle infotainment (IVI). With the advent of EVs and more advanced features in the car, the new centralized zonal architecture will help minimize complexity, maximize scalability, and facilitate faster decision-making time. This new architecture is demanding a new set of SoCs on advanced process technologies with very high performance. More traditional monolithic SoCs for single functions like ADAS are giving way to multi-die designs where various dies are connected in a single package and placed in a system to perform a function in the car. While such multi-die designs are gaining adoption, semiconductor companies must remain cost-conscious as these ADAS SoCs will be manufactured at high volumes for a myriad of safety levels. One example is the automated driving central compute system. The system can include modules for the sensor interface, safety management, memory control and interfaces, and dies for CPU, GPU, and AI accelerator, which are then connected via a die-to-die interface such as the Universal Chiplet Interconnect Express (UCIe). Figure 1 illustrates how semiconductor companies can develop SoCs for such systems using multi-die designs. For a base ADAS or IVI SoC, the requirement might just be the CPU die for a level 2 functional safety. A GPU die can be added to the base CPU die for a base ADAS or premium IVI function at a level 2+ driving automation. To allow more compute power for AI workloads, an NPU die can be added to the base CPU or the base CPU and GPU dies for level 3/3+ functional safety. None of these scalable scenarios are possible without a solution for die-to-die connectivity.

Fig. 1: A simplified view of automotive systems using multi-die designs.

The adoption of UCIe for automotive SoCs

The industry has come together to define, develop, and deploy the UCIe standard, a universal interconnect at the package-level. In a recent news release, the UCIe Consortium announced “updates to the standard with additional enhancements for automotive usages – such as predictive failure analysis and health monitoring – and enabling lower-cost packaging implementations.” Figure 2 shows three use cases for UCIe. The first use case is for low-latency and coherency where two Network on a Chip (NoC) are connected via UCIe. This use case is mainly for applications requiring ADAS computing power. The second automotive use case is when memory and IO are split into two separate dies and are then connected to the compute die via CXL and UCIe streaming protocols. The third automotive use case is very similar to what is seen in HPC applications where a companion AI accelerator die is connected to the main CPU die via UCIe.

Fig. 2: Examples of common and new use cases for UCIe in automotive applications.

To enable such automotive use cases, UCIe offers several advantages, all of which are supported by the Synopsys UCIe IP:

  • Latency optimized architecture: Flit-Aware Die-to-Die Interface (FDI) or Raw Die-to-Die Interface (RDI) operate with local 2GHz system clock. Transmitter and receiver FIFOs accommodate phase mismatch between clock domains. There is no clock domain crossing (CDC) between the PHY and Adapter layers for minimum latency. The reference clock has the same frequency for the two dies.
  • Power-optimized architecture: The transmitter provides the CMOS driver without source termination. IT offers programmable drive strength without a Feed-Forward Equalizer (FFE). The receiver provides a continuous-time linear equalizer (CTLE) without VGA and decision feedback equalizer (DFE), clock forwarding without Clock and Data Recovery (CDR), and optional receiver termination.
  • Reliability and test: Signal integrity monitors track the performance of the interconnect through the chip’s lifecycle. This can monitor inaccessible paths in the multi-die package, test and repair the PHY, and execute real time reporting for preventative maintenance.

Synopsys UCIe IP is integrated with Synopsys 3DIC Compiler, a unified exploration-to-signoff platform. The combination eases package design and provides a complete set of IP deliverables, automated UCIe routing for better quality of results, and reference interposer design for faster integration.

Fig. 3: Synopsys 3DIC Compiler.

New automotive SoC design trends for IVI applications

OEMs are attracting consumers by providing the utmost in cockpit experience with high-resolution, 4K, pillar-to-pillar displays. Multi-Stream Transport (MTR) enables a daisy-chained display topology using a single port, which consists of a single GPU, one DP TX controller, and PHY, to display images on multiple screens in the car. This revision clarifies the components involved and maintains the original meaning. This daisy-chained set up simplifies the display wiring in the car. Figure 4 illustrates how connectivity in the SoC can enable multi-display environments in the car. Row 1: Multiple image sources from the application processor are fed into the daisy-chained display set up via the DisplayPort (DP) MTR interface. Row 2: Multiple image sources from the application processor are fed to the daisy-chained display set up but also to the left or right mirrors, all via the DP MTR interface. Row 3: The same set up in row 2 can be executed via the MIPI DSI or embedded DP MTR interfaces, depending on display size and power requirements.

An alternate use case is USB/DP. A single USB port can be used for silicon lifecycle management, sentry mode, test, debug, and firmware download. USB can be used to avoid the need for very large numbers of test pings, speed up test by exceeding GPIO test pin data rates, repeat manufacturing test in-system and in-field, access PVT monitors, and debug.

Fig. 4: Examples of display connectivity in software-defined vehicles.

ISO/SAE 21434 automotive cybersecurity

ISO/SAE 21434 Automotive Cybersecurity is being adopted by industry leaders as mandated by the UNECE R155 regulation. Starting in July 2024, automotive OEMs must comply with the UNECE R155 automotive cybersecurity regulation for all new vehicles in Europe, Japan, and Korea.

Automotive suppliers must develop processes that meet the automotive cybersecurity requirements of ISO/SAE 21434, addressing the cybersecurity perspective in the engineering of electrical and electronic (E/E) systems. Adopting this methodology involves embracing a cybersecurity culture which includes developing security plans, setting security goals, conducting engineering reviews and implementing mitigation strategies.

The industry is expected to move towards enabling cybersecurity risk-managed products to mitigate the risks associated with advancement in connectivity for software-defined vehicles. As a result, automotive IP needs to be ready to support these advancements.

Synopsys ARC HS4xFS Processor IP has achieved ISO/SAE 21434 cybersecurity certification by SGS-TṺV Saar, meeting stringent automotive regulatory requirements designed to protect connected vehicles from malicious cyberattacks. In addition, Synopsys has achieved certification of its IP development process to the ISO/SAE 21434 standard to help ensure its IP products are developed with a security-first mindset through every phase of the product development lifecycle.

Conclusion

The transformation to software-defined vehicles marks a significant shift in the automotive industry, bringing together highly integrated systems and AI to create safer and more efficient vehicles while addressing sophisticated user needs and vendor serviceability. New trends in the automotive industry are presenting opportunities for innovations in ADAS and IVI SoC designs. Centralized zonal architecture, multi-die design, daisy-chained displays, and integration of ADAS/IVI functions in a single SoC are among some of the key trends that the automotive industry is tracking. Synopsys is at the forefront of automotive SoC innovations with a portfolio of silicon-proven automotive IP for the highest levels of functional safety, security, quality, and reliability. The IP portfolio is developed and assessed specifically for ISO 26262 random hardware faults and ASIL D systematic. To minimize cybersecurity risks, Synopsys is developing IP products as per the ISO/SAE 21434 standard to provide automotive SoC developers a safe, reliable, and future proof solution.

The post Keeping Up With New ADAS And IVI SoC Trends appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • LPDDR Memory Is Key For On-Device AI PerformanceNidish Kamath
    Low-Power Double Data Rate (LPDDR) emerged as a specialized high performance, low power memory for mobile phones. Since its first release in 2006, each new generation of LPDDR has delivered the bandwidth and capacity needed for major shifts in the mobile user experience. Once again, LPDDR is at the forefront of another key shift as the next wave of generative AI applications will be built into our mobile phones and laptops. AI on endpoints is all about efficient inference. The process of employi
     

LPDDR Memory Is Key For On-Device AI Performance

24. Červen 2024 v 09:02

Low-Power Double Data Rate (LPDDR) emerged as a specialized high performance, low power memory for mobile phones. Since its first release in 2006, each new generation of LPDDR has delivered the bandwidth and capacity needed for major shifts in the mobile user experience. Once again, LPDDR is at the forefront of another key shift as the next wave of generative AI applications will be built into our mobile phones and laptops.

AI on endpoints is all about efficient inference. The process of employing trained AI models to make predictions or decisions requires specialized memory technologies with greater performance that are tailored to the unique demands of endpoint devices. Memory for AI inference on endpoints requires getting the right balance between bandwidth, capacity, power and compactness of form factor.

LPDDR evolved from DDR memory technology as a power-efficient alternative; LPDDR5, and the optional extension LPDDR5X, are the most recent updates to the standard. LPDDR5X is focused on improving performance, power, and flexibility; it offers data rates up to 8.533 Gbps, significantly boosting speed and performance. Compared to DDR5 memory, LPDDR5/5X limits the data bus width to 32 bits, while increasing the data rate. The switch to a quarter-speed clock, as compared to a half-speed clock in LPDDR4, along with a new feature – Dynamic Voltage Frequency Scaling – keeps the higher data rate LPDDR5 operation within the same thermal budget as LPDDR4-based devices.

Given the space considerations of mobiles, combined with greater memory needs for advanced applications, LPDDR5X can support capacities of up to 64GB by using multiple DRAM dies in a multi-die package. Consider the example of a 7B LLaMa 2 model: the model consumes 3.5GB of memory capacity if based on INT4. A LPDDR5X package of x64, with two LPDDR5X devices per package, provides an aggregate bandwidth of 68 GB/s and, therefore, a LLaMa 2 model can run inference at 19 tokens per second.

As demand for more memory performance grows, we see LPDDR5 evolve in the market with the major vendors announcing additional extensions to LPDDR5 known as LPDDR5T, with the T standing for turbo. LPDDR5T boosts performance to 9.6 Gbps enabling an aggregate bandwidth of 76.8 GB/s in a x64 package of multiple LPDDR5T stacks. Therefore, the above example of a 7B LLaMa 2 model can run inference at 21 tokens per second.

With its low power consumption and high bandwidth capabilities, LPDDR5 is a great choice of memory not just for cutting-edge mobile devices, but also for AI inference on endpoints where power efficiency and compact form factor are crucial considerations. Rambus offers a new LPDDR5T/5X/5 Controller IP that is fully optimized for use in applications requiring high memory throughput and low latency. The Rambus LPDDR5T/5X/5 Controller enables cutting-edge LPDDR5T memory devices and supports all third-party LPDDR5 PHYs. It maximizes bus bandwidth and minimizes latency via look-ahead command processing, bank management and auto-precharge. The controller can be delivered with additional cores such as the In-line ECC or Memory Analyzer cores to improve in-field reliability, availability and serviceability (RAS).

The post LPDDR Memory Is Key For On-Device AI Performance appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Addressing Quantum Computing Threats With SRAM PUFsRoel Maes
    You’ve probably been hearing a lot lately about the quantum-computing threat to cryptography. If so, you probably also have a lot of questions about what this “quantum threat” is and how it will impact your cryptographic solutions. Let’s take a look at some of the most common questions about quantum computing and its impact on cryptography. What is a quantum computer? A quantum computer is not a very fast general-purpose supercomputer, nor can it magically operate in a massively parallel manner.
     

Addressing Quantum Computing Threats With SRAM PUFs

Od: Roel Maes
6. Červen 2024 v 09:07

You’ve probably been hearing a lot lately about the quantum-computing threat to cryptography. If so, you probably also have a lot of questions about what this “quantum threat” is and how it will impact your cryptographic solutions. Let’s take a look at some of the most common questions about quantum computing and its impact on cryptography.

What is a quantum computer?

A quantum computer is not a very fast general-purpose supercomputer, nor can it magically operate in a massively parallel manner. Instead, it efficiently executes unique quantum algorithms. These algorithms can in theory perform certain very specific computations much more efficiently than any traditional computer could.

However, the development of a meaningful quantum computer, i.e., one that can in practice outperform a modern traditional computer, is exceptionally difficult. Quantum computing technology has been in development since the 1980s, with gradually improving operational quantum computers since the 2010s. However, even extrapolating the current state of the art into the future, and assuming an exponential improvement equivalent to Moore’s law for traditional computers, experts estimate that it will still take at least 15 to 20 years for a meaningful quantum computer to become a reality. 1, 2

What is the quantum threat to cryptography?

In the 1990s, it was discovered that some quantum algorithms can impact the security of certain traditional cryptographic techniques. Two quantum algorithms have raised concern:

  • Shor’s algorithm, invented in 1994 by Peter Shor, is an efficient quantum algorithm for factoring large integers, and for solving a few related number-theoretical problems. Currently, there are no known efficient-factoring algorithms for traditional computers, a fact that provides the basis of security for several classic public-key cryptographic techniques.
  • Grover’s algorithm, invented in 1996 by Lov Grover, is a quantum algorithm that can search for the inverse of a generic function quadratically faster than a traditional computer can. In cryptographic terms, searching for inverses is equivalent to a brute-force attack (e.g., on an unknown secret key value). The difficulty of such attacks forms the basis of security for most symmetric cryptography primitives.

These quantum algorithms, if they can be executed on a meaningful quantum computer, will impact the security of current cryptographic techniques.

What is the impact on public-key cryptography solutions?

By far the most important and most widely used public-key primitives today are based on RSA, discrete-logarithm, or elliptic curve cryptography. When meaningful quantum computers become operational, all of these can be efficiently solved by Shor’s algorithm. This will make virtually all public-key cryptography in current use insecure.

For the affected public-key encryption and key exchange primitives, this threat is already real today. An attacker capturing and storing encrypted messages exchanged now (or in the past), could decrypt them in the future when meaningful quantum computers are operational. So, highly sensitive and/or long-term secrets communicated up to today are already at risk.

If you use the affected signing primitives in short-term commitments of less than 15 years, the problem is less urgent. However, if meaningful quantum computers become available, the value of any signature will be voided from that point. So, you shouldn’t use the affected primitives for signing long-term commitments that still need to be verifiable in 15-20 years or more. This is already an issue for some use cases, e.g., for the security of secure boot and update solutions of embedded systems with a long lifetime.

Over the last decade, the cryptographic community has designed new public-key primitives that are based on mathematical problems that cannot be solved by Shor’s algorithm (or any other known efficient algorithm, quantum or otherwise). These algorithms are generally referred to as postquantum cryptography. NIST’s announcement on a selection of these algorithms for standardization1, after years of public scrutiny, is the latest culmination of that field-wide exercise. For protecting the firmware of embedded systems in the short term, the NSA recommends the use of existing post-quantum secure hash-based signature schemes12.

What is the impact on my symmetric cryptography solutions?

The security level of a well-designed symmetric key primitive is equivalent to the effort needed for brute-forcing the secret key. On a traditional computer, the effort of brute-forcing a secret key is directly exponential in the key’s length. When a meaningful quantum computer can be used, Grover’s algorithm can speed up the brute-force attack quadratically. The needed effort remains exponential, though only in half of the key’s length. So, Grover’s algorithm could be said to reduce the security of any given-length algorithm by 50%.

However, there are some important things to keep in mind:

  • Grover’s algorithm is an optimal brute-force strategy (quantum or otherwise),4so the quadratic speed-up is the worst-case security impact.
  • There are strong indications that it is not possible to meaningfully parallelize the execution of Grover’s algorithm.2,5,6,7In a traditional brute-force attack, doubling the number of computers used will cut the computation time in half. Such a scaling is not possible for Grover’s algorithm on a quantum computer, which makes its use in a brute-force attack very impractical.
  • Before Grover’s algorithm can be used to perform real-world brute-force attacks on 128-bit keys, the performance of quantum computers must improve tremendously. Very modern traditional supercomputers can barely perform computations with a complexity exponential in 128/2=64 bits in a practically feasible time (several months). Based on their current state and rate of progress, it will be much, much more than 20 years before quantum computers could be at that same level 6.

The practical impact of quantum computers on symmetric cryptography is, for the moment, very limited. Worst-case, the security strength of currently used primitives is reduced by 50% (of their key length), but due to the limitations of Grover’s algorithm, that is an overly pessimistic assumption for the near future. Doubling the length of symmetric keys to withstand quantum brute-force attacks is a very broad blanket measure that will certainly solve the problem, but is too conservative. Today, there are no mandated requirement for quantum-hardening symmetric-key cryptography, and 128-bit security strength primitives like AES-128 or SHA-256 are considered safe to use now. For the long-term, moving from 128-bit to 256-bit security strength algorithms is guaranteed to solve any foreseeable issues. 12

Is there an impact on information-theoretical security?

Information-theoretically secure methods (also called unconditional or perfect security) are algorithmic techniques for which security claims are mathematically proven. Some important information-theoretically secure constructions and primitives include the Vernam cipher, Shamir’s secret sharing, quantum key distribution8 (not to be confused with post-quantum cryptography), entropy sources and physical unclonable functions (PUFs), and fuzzy commitment schemes9.

Because an information-theoretical proof demonstrates that an adversary does not have sufficient information to break the security claim, regardless of its computing power – quantum or otherwise – information-theoretically secure constructions are not impacted by the quantum threat.

PUFs: An antidote for post-quantum security uncertainty

SRAM PUFs

The core technology underpinning all Synopsys products is an SRAM PUF. Like other PUFs, an SRAM PUF generates device-unique responses that stem from unpredictable variations originating in the production process of silicon chips. The operation of an SRAM PUF is based on a conventional SRAM circuit readily available in virtually all digital chips.

Based on years of continuous measurements and analysis, Synopsys has developed stochastic models that describe the behavior of its SRAM PUFs very accurately10. Using these models, we can determine tight bounds on the unpredictability of SRAM PUFs. These unpredictability bounds are expressed in terms of entropy, and are fundamental in nature, and cannot be overcome by any amount of computation, quantum or otherwise.

Synopsys PUF IP

Synopsys PUF IP is a security solution based on SRAM PUF technology. The central component of Synopsys PUF IP is a fuzzy commitment scheme9 that protects a root key with an SRAM PUF response and produces public helper data. It is information-theoretically proven that the helper data discloses zero information on the root key, so the fact that the helper data is public has no impact on the root key’s security.

Fig. 1: High-level architecture of Synopsys PUF IP.

This no-leakage proof – kept intact over years of field deployment on hundreds of millions of devices – relies on the PUF employed by the system to be an entropy source, as expressed by its stochastic model. Synopsys PUF IP uses its entropy source to initialize its root key for the very first time, which is subsequently protected by the fuzzy commitment scheme.

In addition to the fuzzy commitment scheme and the entropy source, Synopsys PUF IP also implements cryptographic operations based on certified standard-compliant constructions making use of standard symmetric crypto primitives, particularly AES and SHA-25611. These operations include:

  • a key derivation function (KDF) that uses the root key protected by the fuzzy commitment scheme as a key derivation key.
  • a deterministic random bit generator (DRBG) that is initially seeded by a high-entropy seed coming from the entropy source.
  • key wrapping functionality, essentially a form of authenticated encryption, for the protection of externally provided application keys using a key-wrapping key derived from the root key protected by the fuzzy commitment scheme.

Conclusion

The security architecture of Synopsys PUF IP is based on information-theoretically secure components for the generation and protection of a root key, and on established symmetric cryptography for other cryptographic functions. Information-theoretically secure constructions are impervious to quantum attacks. The impact of the quantum threat on symmetric cryptography is very limited and does not require any remediation now or in the foreseeable future. Importantly, Synopsys PUF IP does not deploy any quantum-vulnerable public-key cryptographic primitives.

All variants of Synopsys PUF IP are quantum-secure and in accordance with recommended post-quantum guidelines. The use of the 256-bit security strength variant of Synopsys PUF IP will offer strong quantum resistance, even in a distant future, but also the 128-bit variant is considered perfectly safe to use now and in the foreseeable time to come.

References

  1. Report on Post-Quantum Cryptography”, NIST Information Technology Laboratory, NISTIR 8105, April 2016,
  2. 2021 Quantum Threat Timeline Report”, Global Risk Institute (GRI), M. Mosca and M. Piani, January, 2022,
  3. PQC Standardization Process: Announcing Four Candidates to be Standardized, Plus Fourth Round Candidates”, NIST Information Technology Laboratory, July 5, 2022,
  4. “Grover’s quantum searching algorithm is optimal”, C. Zalka, Phys. Rev. A 60, 2746, October 1, 1999, https://journals.aps.org/pra/abstract/10.1103/PhysRevA.60.2746
  5. Reassessing Grover’s Algorithm”, S. Fluhrer, IACR ePrint 2017/811,
  6. NIST’s pleasant post-quantum surprise”, Bas Westerbaan, CloudFlare, July 8, 2022,
  7. Post-Quantum Cryptography – FAQs: To protect against the threat of quantum computers, should we double the key length for AES now? (added 11/18/18)”, NIST Information Technology Laboratory,
  8. Quantum cryptography: Public key distribution and coin tossing”, C. H. Bennett and G. Brassard, Proceedings of the IEEE International Conference on Computers, Systems and Signal Processing, December, 1984,
  9. A fuzzy commitment scheme”, A. Juels and M. Wattenberg, Proceedings of the 6th ACM conference on Computer and Communications Security, November, 1999,
  10. An Accurate Probabilistic Reliability Model for Silicon PUFs”, R. Maes, Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems, 2013,
  11. NIST Information Technology Laboratory, Cryptographic Algorithm Validation Program CAVP, validation #A2516, https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/details?validation=35127
  12. “Announcing the Commercial National Security Algorithm Suite 2.0”, National Security Agency, Cybersecurity Advisory https://media.defense.gov/2022/Sep/07/2003071834/-1/-1/0/CSA_CNSA_2.0_ALGORITHMS_.PDF

The post Addressing Quantum Computing Threats With SRAM PUFs appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • The Importance Of Memory Encryption For Protecting Data In UseEmma-Jane Crozier
    As systems-on-chips (SoCs) become increasingly complex, security functions must grow accordingly to protect the semiconductor devices themselves and the sensitive information residing on or passing through them. While a Root of Trust security solution built into the SoCs can protect the chip and data resident therein (data at rest), many other threats exist which target interception, theft or tampering with the valuable information in off-chip memory (data in use). Many isolation technologies ex
     

The Importance Of Memory Encryption For Protecting Data In Use

6. Červen 2024 v 09:06

As systems-on-chips (SoCs) become increasingly complex, security functions must grow accordingly to protect the semiconductor devices themselves and the sensitive information residing on or passing through them. While a Root of Trust security solution built into the SoCs can protect the chip and data resident therein (data at rest), many other threats exist which target interception, theft or tampering with the valuable information in off-chip memory (data in use).

Many isolation technologies exist for memory protection, however, with the discovery of the Meltdown and Spectre vulnerabilities in 2018, and attacks like row hammer targeting DRAM, security architects realize there are practical threats that can bypass these isolation technologies.

One of the techniques to prevent data being accessed across different guests/domains/zones/realms is memory encryption. With memory encryption in place, even if any of the isolation techniques have been compromised, the data being accessed is still protected by cryptography. To ensure the confidentiality of data, each user has their own protected key. Memory encryption can also prevent physical attacks like hardware bus probing on the DRAM bus interface. It can also prevent tampering with control plane information like the MPU/MMU control bits in DRAM and prevent the unauthorized movement of protected data within the DRAM.

Memory encryption technology must ensure confidentiality of the data. If a “lightweight” algorithm is used, there are no guarantees the data will be protected from mathematic cryptanalysts given that the amount of data used in memory encryption is typically huge. Well known, proven algorithms are either the NIST approved AES or OSCAA approved SM4 algorithms.

The recommended key length is also an important aspect defining the security strength. AES offers 128, 192 or 256-bit security, and SM4 offers 128-bit security. Advanced memory encryption technologies also involve integrity and protocol level anti-replay techniques for high-end use-cases. Proven hash algorithms like SHA-2, SHA-3, SM3 or (AES-)GHASH can be used for integrity protection purposes.

Once one or more of the cipher algorithms are selected, the choice of secure modes of operation must be made. Block Cipher algorithms need to be used in certain specific modes to encrypt bulk data larger than a single block of just 128 bits.

XTS mode, which stands for “XEX (Xor-Encrypt-Xor) with tweak and CTS (Cipher Text Stealing)” mode has been widely adopted for disk encryption. CTS is a clever technique which ensures the number of bytes in the encrypted payload is the same as the number of bytes in the plaintext payload. This is particularly important in storage to ensure the encrypted payload can fit in the same location as would the unencrypted version.

XTS/XEX uses two keys, one key for block encryption, and another key to process a “tweak.” The tweak ensures every block of memory is encrypted differently. Any changes in the plaintext result in a complete change of the ciphertext, preventing an attacker from obtaining any information about the plaintext.

While memory encryption is a critical aspect of security, there are many challenges to designing and implementing a secure memory encryption solution. Rambus is a leading provider of both memory and security technologies and understands the challenges from both the memory and security viewpoints. Rambus provides state-of-the-art Inline Memory Encryption (IME) IP that enables chip designers to build high-performance, secure, and scalable memory encryption solutions.

Additional information:

The post The Importance Of Memory Encryption For Protecting Data In Use appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Reset Domain Crossing VerificationSiemens EDA
    By Reetika and Sulabh Kumar Khare, Siemens EDA DI SW To meet low-power and high-performance requirements, system on chip (SoC) designs are equipped with several asynchronous and soft reset signals. These reset signals help to safeguard software and hardware functional safety as they can be asserted to speedily recover the system onboard to an initial state and clear any pending errors or events. By definition, a reset domain crossing (RDC) occurs when a path’s transmitting flop has an asynchrono
     

Reset Domain Crossing Verification

13. Květen 2024 v 09:01

By Reetika and Sulabh Kumar Khare, Siemens EDA DI SW

To meet low-power and high-performance requirements, system on chip (SoC) designs are equipped with several asynchronous and soft reset signals. These reset signals help to safeguard software and hardware functional safety as they can be asserted to speedily recover the system onboard to an initial state and clear any pending errors or events.

By definition, a reset domain crossing (RDC) occurs when a path’s transmitting flop has an asynchronous reset, and the receiving flop either has a different asynchronous reset than the transmitting flop or has no reset. The multitude of asynchronous reset sources found in today’s complex automotive designs means there are a large number of RDC paths, which can lead to systematic faults and hence cause data-corruption, glitches, metastability, or functional failures — along with other issues.

This issue is not covered by standard, static verification methods, such as clock domain crossing (CDC) analysis. Therefore, a proper reset domain crossing verification methodology is required to prevent errors in the reset design during the RTL verification stage.

A soft reset is an internally generated reset (register/latch/black-box output is used as a reset) that allows the design engineer to reset a specific portion of the design (specific module/subsystem) without affecting the entire system. Design engineers frequently use a soft reset mechanism to reset/restart the device without fully powering it off, as this helps to conserve power by selectively resetting specific electronic components while keeping others in an operational state. A soft reset typically involves manipulating specific registers or signals to trigger the reset process. Applying soft resets is a common technique used to quickly recover from a problem or test a specific area of the design. This can save time during simulation and verification by allowing the designer to isolate and debug specific issues without having to restart the entire simulation. Figure 1 shows a simple soft reset and its RTL to demonstrate that SoftReg is a soft reset for flop Reg.

Fig. 1: SoftReg is a soft reset for register Reg.

This article presents a systematic methodology to identify RDCs, with different soft resets, that are unsafe, even though the asynchronous reset domain is the same on the transmitter and receiver ends. Also, with enough debug aids, we will identify the safe RDCs (safe from metastability only if it meets the static timing analysis), with different asynchronous reset domains, that help to avoid silicon failures and minimize false crossing results. As a part of static analysis, this systematic methodology enables designers to intelligently identify critical reset domain bugs associated with soft resets.

A methodology to identify critical reset domain bugs

With highly complex reset architectures in automotive designs, there arises the need for a proper verification method to detect RDC issues. It is essential to detect unsafe RDCs systematically and apply appropriate synchronization techniques to tackle the issues that may arise due to delays in reset paths caused by soft resets. Thus designers can ensure proper operation of their designs and avoid the associated risks. By handling RDCs effectively, designers can mitigate potential issues and enhance the overall robustness and performance of a design. This systematic flow involves several steps to assist in RDC verification closure using standard RDC verification tools (see figure 2).

Fig. 2: Flowchart of methodology for RDC verification.

Specification of clock and reset signals

Signals that are intended to generate a clock and reset pulse should be specified by the user as clock or reset signals, respectively, during the set-up step in RDC verification. By specifying signals as clocks or resets (according to their expected behavior), designers can perform design rule checking and other verification checks to ensure compliance with clock and reset related guidelines and standards as well as best practices. This helps identify potential design issues and improve the overall quality of the design by reducing noise in the results.

Clock detection

Ideally, design engineers should define the clock signals and then the verification tool should trace these clocks down to the leaf clocks. Unfortunately, with complex designs, this is not possible as the design might have black boxes that originate clocks, or it may have some combinational logic in the clock signals that do not cover all the clocks specified by the user. All the un-specified clocks need to be identified and mapped to the user-specified primary clocks. An exhaustive detection of clocks is required in RDC verification, as potential metastability may occur if resets are used in different clock domains than the sequential element itself, leading to critical bugs.

Reset detection

Ideally, design engineers should define the reset signals, but again, due to the complexity of automotive and other modern designs, it is not possible to specify all the reset signals. Therefore a specialized verification tool is required for detection of resets. All the localized, black-box, gated, and primary resets need to be identified, and based on their usage in the RTL, they should be classified as synchronous, asynchronous, or dual type and then mapped to the user-specified primary resets.

Soft reset detection

The soft resets — i.e., the internally generated resets by flops and latches — need to be systematically detected as they can cause critical metastability issues when used in different clock domains, and they require static timing analysis when used in the same clock domain. Detecting soft resets helps identify potential metastability problems and allows designers to apply proper techniques for resolving these issues.

Reset tree analysis

Analysis of reset trees helps designers identify issues early in the design process, before RDC analysis. It helps to highlight some important errors in the reset design that are not commonly caught by lint tools. These include:

  • Dual synchronicity reset signals, i.e., the reset signal with a sample synchronous reset flop and a sample asynchronous reset flop
  • An asynchronous set/reset signal used as a data signal can result in incorrect data sampling because the reset state cannot be controlled

Reset domain crossing analysis

This step involves analyzing a design to determine the logic across various reset domains and identify potential RDCs. The analysis should also identify common reset sequences of asynchronous and soft reset sources at the transmitter and receiver registers of the crossings to avoid detection of false crossings that might appear as potential issues due to complex combinations of reset sources. False crossings are where a transmitter register and receiver register are asserted simultaneously due to dependencies among the reset assertion sequences, and as a result, any metastability that might occur on the receiver end is mitigated.

Analyze and fix RDC issues

The concluding step is to analyze the results of the verification steps to verify if data paths crossing reset domains are safe from metastability. For the RDCs identified as unsafe — which may occur either due to different asynchronous reset domains at the transmitter and receiver ends or due to the soft reset being used in a different clock domain than the sequential element itself — design engineers can develop solutions to eliminate or mitigate metastability by restructuring the design, modifying reset synchronization logic, or adjusting the reset ordering. Traditionally safe RDCs — i.e., crossings where a soft reset is used in the same clock domain as the sequential element itself — need to be verified using static timing analysis.

Figure 3 presents our proposed flow for identifying and eliminating metastability issues due to soft resets. After implementing the RDC solutions, re-verify the design to ensure that the reset domain crossing issues have been effectively addressed.

Fig. 3: Flowchart for proposed methodology to tackle metastability issues due to soft resets.

This methodology was used on a design with 374,546 register bits, 8 latch bits, and 45 RAMs. The Questa RDC verification tool using this new methodology identified around 131 reset domains, which consisted of 19 asynchronous domains defined by the user, as well as 81 asynchronous reset domains inferred by the tool.

The first run analyzed data paths crossing asynchronous reset domains without any soft reset analysis. It reported nearly 40,000 RDC crossings (as shown in table 1).

Reset domain crossings without soft reset analysis Severity Number of crossings
Reset domain crossing from a reset to a reset Violation 28408
Reset domain crossing from a reset to non-reset Violation 11235

Table 1: RDC analysis without soft resets.

In the second run, we did soft reset analysis and detected 34 soft resets, which resulted in additional violations for RDC paths with transmitter soft reset sources in different clock domains. These were critical violations that were missed in the initial run. Also, some RDC violations were converted to cautions (RDC paths with a transmitter soft reset in the same clock domain) as these paths would be safe from metastability as long as they meet the setup time window (as shown in table 2).

Reset domain crossings with soft reset analysis Severity Number of crossings
Reset domain crossing from a reset to a reset Violation 26957
Reset domain crossing from a reset to non-reset Violation 10523
Reset domain crossing with tx reset source in different clock Violation 880
Reset domain crossing from a reset to Rx with same clock Caution 2412

Table 2: RDC analysis with soft resets.

To gain a deeper understanding of RDC, metastability, and soft reset analysis in the context of this new methodology, please download the full paper Techniques to identify reset metastability issues due to soft resets.

The post Reset Domain Crossing Verification appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Securing AI In The Data CenterBart Stevens
    AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and infer
     

Securing AI In The Data Center

9. Květen 2024 v 09:07

AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and inference deployment.

High-value datasets containing sensitive information, such as personal health records or financial transactions, must be shielded from unauthorized access. Robust encryption mechanisms, such as Advanced Encryption Standard (AES), coupled with secure key management practices, form the foundation of data confidentiality in the data center. The encryption key used must be unique and used in a secure environment. Encryption and decryption operations of data are constantly occurring and must be performed to prevent key leakage. Should a compromise arise, it should be possible to renew the key securely and re-encrypt data with the new key.

The encryption key used must also be securely stored in a location that unauthorized processes or individuals cannot access. The keys used must be protected from attempts to read them from the device or attempts to steal them using side-channel techniques such as SCA (Side-Channel Attacks) or FIA (Fault Injection Attacks). The multitenancy aspect of modern data centers calls for robust SCA protection of key data.

Hardware-level security plays a pivotal role in safeguarding AI within the data center, offering built-in protections against a wide range of threats. Trusted Platform Modules (TPMs), secure enclaves, and Hardware Security Modules (HSMs) provide secure storage and processing environments for sensitive data and cryptographic keys, shielding them from unauthorized access or tampering. By leveraging hardware-based security features, organizations can enhance the resilience of their AI infrastructure and mitigate the risk of attacks targeting software vulnerabilities.

Ideally, secure cryptographic processing is handled by a Root of Trust core. The AI service provider manages the Root of Trust firmware, but it can also load secure applications that customers can write to implement their own cryptographic key management and storage applications. The Root of Trust can be integrated in the host CPU that orchestrates the AI operations, decrypting the AI model and its specific parameters before those are fed to AI or network accelerators (GPUs or NPUs). It can also be directly integrated with the GPUs and NPUs to perform encryption/decryption at that level. These GPUs and NPUs may also select to store AI workloads and inference models in encrypted form in their local memory banks and decrypt the data on the fly when access is required. Dedicated on-the-fly, low latency in-line memory decryption engines based on the AES-XTS algorithm can keep up with the memory bandwidth, ensuring that the process is not slowed down.

AI training workloads are often distributed among dozens of devices connected via PCIe or high-speed networking technology such as 800G Ethernet. An efficient confidentiality and integrity protocol such as MACsec using the AES-GCM algorithm can protect the data in motion over high-speed Ethernet links. AES-GCM engines integrated with the server SoC and the PCIe acceleration boards ensure that traffic is authenticated and optionally encrypted.

Rambus offers a broad portfolio of security IP covering the key security elements needed to protect AI in the data center. Rambus Root of Trust IP cores ensure a secure boot protocol that protects the integrity of its firmware. This can be combined with Rambus inline memory encryption engines, as well as dedicated solutions for MACsec up to 800G.

Resources

The post Securing AI In The Data Center appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Overcoming Chiplet Integration Challenges With AdaptabilityJayson Bethurem
    Chiplets are exploding in popularity due to key benefits such as lower cost, lower power, higher performance and greater flexibility to meet specific market requirements. More importantly, chiplets can reduce time-to-market, thus decreasing time-to-revenue! Heterogeneous and modular SoC design can accelerate innovation and adaptation for many companies. What’s not to like about chiplets? Well, as chiplets come to fruition, we are starting to realize many of the complications of chiplet designs.
     

Overcoming Chiplet Integration Challenges With Adaptability

9. Květen 2024 v 09:06

Chiplets are exploding in popularity due to key benefits such as lower cost, lower power, higher performance and greater flexibility to meet specific market requirements. More importantly, chiplets can reduce time-to-market, thus decreasing time-to-revenue! Heterogeneous and modular SoC design can accelerate innovation and adaptation for many companies. What’s not to like about chiplets? Well, as chiplets come to fruition, we are starting to realize many of the complications of chiplet designs.

Fig. 1: Expanded chiplet system view.

The interface challenge

The primary concept of chiplets is integration of ICs (Integrated Circuits) from multiple companies. However, many of these did not fully consider interoperation with other ICs. That’s partially based on a lack of interconnect standards around chiplets. Moreover, ICs have their own computational and bandwidth requirements. This is further complicated as competing interfaces standards vie for adoption as shown in table 1.

Standard d Throughput Density Max Delay
Advanced Interface Bus (Intel, AIB) 2 Gbps 504 Gbps/mm 5 ns
Bandwidth Engine 10.3 Gbps N/A 2.4 ns
BoW (Bunch of Wires) 16 Gbps 1280 Gbps/mm 5 ns
HBM3 (JEDEC) 4.8 Gbps N/A N/A
Infinity Fabric (AMD) 10.6 Gbps N/A 9 ns
Lipincon (TSMC) 2.8 Gbps 536 Gbps/mm 14ns
Multi-Die I/O (Intel) 5.4 Gbps 1600 Gbps/mm N/A
XSR/USR (Rambus) 112 Gbps N/A N/A
UCIe 32 Gbps 1350 Gbps/mm 2 ns

Table 1: Chiplet interconnect options.

Most chiplet interconnects are dominated by UCIe (Universal Chiplet Interconnect) and the unimaginatively named BoW (Bunch of Wires). UCIe introduced the 1.0 spec, and as with any first edition of a specification, it is inevitable updates will follow. UCIe 1.1 fixes several holes and gaps in 1.0. It addresses gray areas, missing definitions, ECNs and more. And it is very likely not the last update, as UCIe’s vision is to grow up the stack – adding additional protocol layers on top of the system layers.

Because of newness and expected evolution of UCIe and BoW protocols, designing them in is risky. Additionally, there will always be a place for multiple die-to-die interfaces, beyond UCIe. Specific use cases and designs will inherently be matched to different metrics leading for many designs to fall back to proprietary interfaces.

As you can see, there are many choices, and many of these have tradeoffs. Integration into a chiplet with a variety of these protocols would greatly benefit from adaptability via data/protocol adaptation that can easily be enabled with embedded programmable logic, or eFPGA. A lightweight protocol shim implemented in eFPGA IP can not only reformat data but also buffer data to maximize internal processing. Finally, consider that data between ICs in a chiplet can be globally asynchronous – another easy task resolved with eFPGA IP with FIFO synchronizers.

The security challenge

Beyond the interfaces, security is another emerging challenge. A few factors of chiplets must be cautiously considered:

  • Varying ICs from unknown and possibly unreputable manufacturers
  • IC can contain internal IP from additional third-party sources
  • Each IC may receive and introduce external data into the system

Naturally, this begs for attestation and provenance to ensure vendor confidence. As such, root of trust generally starts with the supply chain and auditing all vendors. However, it only takes one failed component, the least secure component, to jeopardize the entire system.

Root of trust suddenly becomes an issue and uncovers another issue. Which IC, or ICs, in the chiplet manage root of trust? As we’ve seen time and time again, security threats evolve at an alarming rate. But chiplets have an opportunity here. Again, embedded FPGAs have the flexible nature to adapt, thus thwarting these evolving security threats. eFPGA IP can also physically disable unused interfaces – minimizing surface attack vectors.

Adaptable cryptography cores can perform a variety of tasks with high performance in eFPGA IP. These tasks include authentication/digital signing, key generation, encapsulation/decapsulation, random number generation and much more. Further, post-quantum security cores that run very efficiently on eFPGA are becoming available. Figure 2 shows a ML Kyber Encapsulation Module from Xiphera that fits into only four Flex Logix EFLX tiles, efficiently packed at 98% utilization with a throughput of over 2 Gbps.

Fig. 2: ML-KEM IP core from Xiphera implemented on Flex Logix EFLX eFPGA IP.

Managing all data communication within a chiplet seems daunting; however, it is feasible. Designers have the choice of implementing eFPGA on every IC in the chiplet for adaptable data signage. Or standalone on the interposer, where system designers can define a secure enclave in which all data is authenticated and encrypted by an independent IC with eFPGA. eFPGA can also process streaming data at a very high rate. And in most cases can keep up with line rate, as seen with programmable data planes in SmartNICs.

eFPGA can add another critical security benefit. Every instance of eFPGA in the chiplet offers the ability to obfuscate critical algorithms, cryptography and protocols. This enables manufacturers to protect design secrets by not only programming these features in a controlled environment, but also adapting these as threats evolve.

The validation problem

Again, the absence of fully defined industry standards presents integration challenges. Conventional methods of qualification, testing, and validation become increasingly more complex. Yet this becomes another opportunity for eFPGA IP. It can be configured as an in-system diagnostic tool that provides testing, debugging and observability. Not only during IC bring up, but also during run time – eliminating finger pointing between independent companies.

The reconfigurability solution

While we’ve discussed a few different chiplet issues and solutions with adaptable eFPGA, it is important to realize that a singular instance of this IP can perform all these functions in a chiplet, as eFPGA IP is completely reconfigurable. It can be time-sliced and uniquely configured differently during specific operational phases of the chiplet. As mentioned in the examples above, during IC bring up it can provide insightful debug visibility into the system. During boot, it can enable secure boot and attested firmware updates to all ICs in the chiplet. During run time, it can perform cryptographic functions as well independently manage a secure enclave environment. eFPGA is also perfect for any other software acceleration your applications need, as its heavily parallel and pipelined nature is perfect for complex signal processing tasks. Lastly, during an RMA process it can also investigate and determine system failures. This is just a short list of the features eFPGA IP can enable in a chiplet.

Customizable for the perfect solution

Flex Logix EFLX IP delivers excellent PPA (Power, Performance and Area) and is available on the most advanced nodes, including Intel 18A and TSMC 7nm and 5nm. Furthermore, Flex Logix eFPGA IP is scalable – enabling you to choose the best balance of programmable logic, embedded memory and signal processing resources.

Fig. 3: Scalable Flex Logix eFPGA IP.

Want to learn more about Flex Logix IP? Contact us at info@flex-logix.com or visit our website https://flex-logix.com.

The post Overcoming Chiplet Integration Challenges With Adaptability appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Enhancing HMI Security: How To Protect ICS Environments From Cyber ThreatsJim Montgomery
    HMIs (Human Machine Interfaces) can be broadly defined as just about anything that allows humans to interface with their machines, and so are found throughout the technical world. In OT environments, operators use various HMIs to interact with industrial control systems in order to direct and monitor the operational systems. And wherever humans and machines intersect, security problems can ensue. Protecting HMI in cybersecurity plans, particularly in OT/ICS environments, can be a challenge, as H
     

Enhancing HMI Security: How To Protect ICS Environments From Cyber Threats

9. Květen 2024 v 09:05

HMIs (Human Machine Interfaces) can be broadly defined as just about anything that allows humans to interface with their machines, and so are found throughout the technical world. In OT environments, operators use various HMIs to interact with industrial control systems in order to direct and monitor the operational systems. And wherever humans and machines intersect, security problems can ensue.

Protecting HMI in cybersecurity plans, particularly in OT/ICS environments, can be a challenge, as HMIs offer a variety of vulnerabilities that threat actors can exploit to achieve any number of goals, from extortion to sabotage.

Consider the sort of OT environments HMIs are found in, including water and power utilities, manufacturing facilities, chemical production, oil and gas infrastructure, smart buildings, hospitals, and more. The HMIs in these environments offer bad actors a range of attack vectors through which they can enter and begin to wreak havoc, either financial, physical, or both.

What’s the relationship between HMI and SCADA?

SCADA (supervisory control and data acquisition) systems are used to acquire and analyze data and control industrial systems. Because of the role SCADA plays in these settings — generally overseeing the control of hugely complex, expensive, and dangerous-if-misused industrial equipment, processes, and facilities — they are extremely attractive to threat actors.

Unfortunately, the HMIs that operators use to interface with these systems may contain a number of vulnerabilities that are among the most highly exploitable and frequently breached vectors for attacks against SCADA systems.

Once an attacker gains access, they can seize from operators the ability to control the system. They can cause machinery to malfunction and suffer irreparable damage; they can taint products, steal information, and extort ransom. Even beyond ransom demands, the cost of production stoppages, lost sales, equipment replacement, and reputational damage can swallow some companies and create shortages in the market. Attacks can also cause equipment to perform in ways that threaten human life and safety.

Three types of HMIs in ICS that are vulnerable to attack

HMI security has to account for a range of “vulnerability options” available for exploitation by bad actors, such as keyboards, touch screens, and tablets, as well as more sophisticated interface points. Among the more frequently attacked are the Graphical User Interface and mobile and remote access.

Graphical User Interface

Attackers can use the Graphical User Interface or GUI to gain complete access to the system and manipulate it at will. They can often gain access by exploiting misconfigured access controls or bugs and other vulnerabilities that exist in a lot of software, including GUI software. If the system is web- or network-connected, their work is easier, especially if introducing malware is a goal. Once in, they can also move laterally, exploring or compromising interconnected systems and widening the attack.

Mobile and remote access

Even before COVID-19, mobile and remote access techniques were already being incorporated into managing a growing number of OT networks. When the pandemic hit hard, remote access often became a necessity. As the crisis faded, however, mobile and remote access became even more entrenched.

Remote access points are especially vulnerable. For one, remote access software can contain its own security vulnerabilities, like unpatched flaws and bugs or misconfigurations. Attackers may find openings in VPNs (virtual private networks) or RDP (remote desktop protocol) and use these holes to slip past security measures and carry out their mission.

Access controls

Attackers can compromise access control mechanisms to acquire the same permissions and privileges as authorized users, and once they gain access, they can do pretty much anything they want regarding system operations and data access. Access can be gained in many of the usual ways, such as an outdated VPN or stolen or purchased credentials. (Stolen or other credentials are readily available through online markets.)

The initial attack may just be a toe in the network while reconnaissance for holes in the access control system is conducted. Weak passwords, unnecessary access rights, and the usual misconfigurations and software vulnerabilities are all an attacker needs. As further walls are breached, attackers can then escalate their level of privilege to do whatever a legitimate user can do.

Understanding attack techniques in ICS HMI cybersecurity

Code injection

When attackers insert or inject malicious code into a software program or system, that’s code injection, and it can give the attacker access to core system functions. The resulting mayhem can include manipulation of control software, leading to shutdowns, equipment damage, and dangerous, even life-threatening situations if system changes result in hazardous chemical releases, changed formulas, explosions, or the misbehavior of large, heavy machinery. Code injections can corrupt, delete, or steal data and may result in compliance failure and fines in certain situations.

Malware virus infection

Malware can enter a network through various access points in addition to HMIs, even ones no one would ever expect, such as manufacturer-provided software updates or factory-fresh physical assets added to the production environment. A technician connecting a laptop or an employee plugging in a flash drive without knowing it’s infected will work just as well. As the walls between IT and OT thin, that attack surface widens as well. Once in the network, the attacker can escalate privileges, look around a bit, and see what’s worth doing or stealing. When enough has been learned, the attacker executes the malicious code, which can include ransomware or spyware. As in other attacks, operations can be interfered with, sometimes dangerously so.

Data tampering

Data tampering simply means that data is altered without authorization, including data used to operate, control, and monitor industrial systems. Attackers gain access through vulnerabilities in the system software or HMI devices or through passageways between IT and OT. Once in, they can explore the system to give themselves even greater access to more sensitive areas, where they can steal valuable and confidential system data, interrupt operations, compromise equipment, and damage the company’s business interests and competitive advantage.

Memory corruption

Memory corruption can happen in any computer network and may not represent anything nefarious. Yet memory corruption has also been used as an attack technique that can be deployed against OT networks and is thus potentially extremely damaging since data controls machinery, processes, formulas, and other essential functions. Attackers find software vulnerabilities in HMI or other access points through which the memory of an application or system can be reached and corrupted. This can lead to crashes, data leakage, denial of services (DoS), and even attacker takeovers of ICS and SCADA systems.

Spear phishing

Spear phishing attacks are generally launched against IT networks, which can then be used to open a corridor to the OT network. Spear phishing is basically a more targeted version of phishing attacks, in which an attacker will impersonate a legitimate, trusted source via email or web page, for example. In 2014, attackers targeted a German steel mill with an email suspected of carrying malicious code. They then used access to the business network to get to the SCADA/ICS network, where they modified the PLCs (programmable logic controllers) and took over the furnace’s operations. The physical damage they inflicted forced the plant to shut down.

DoS and DDoS attacks

Denial of Service (DoS) and Distributed Denial of Service (DDoS) work by overwhelming HMI points with excessive traffic or requests so they are unable to handle authorized control and monitoring functions. In 2016, some particularly vicious malware dubbed Industroyer (also Crashoveride) was deployed in an attack against Ukraine’s power grid and blacked out a substantial section of Kyiv. Industroyer was developed specifically to attack ICS and SCADA systems. The multipronged attack began by exploiting vulnerabilities in digital substation relays. A timer regulating the attack executed a distributed denial-of-service (DDoS) attack on every protection relay on the network that used any of four specific communication protocols. Simultaneously, it deleted all MicroSCADA-related files from the workstations’ hard drives. As the relays stopped functioning, lights went out across the city.

Exploiting remote access

The growing use of remote access to HMI systems during and after COVID-19 has provided threat actors with a wealth of newly available attack vectors. Less-than-airtight remote access security protocols make them very enticing for ICS-specific malware. HAVEX malware, for example, uses a remote access trojan (RAT) downloaded from OT vendor websites. The RAT can then scan for devices on the ports commonly used OT assets, collect information, and send it back to the attacker’s command and control server. A long-term attack used just such a method to gain remote access to energy networks in the U.S. and internationally, during which data thieves collected and “exfiltrated” (stole) enterprise and ICS-related data.

Credential theft

Obtaining unauthorized credentials is not all that difficult these days, with a robust online marketplace making it easier than ever. Phishing and spear phishing, malware, weak passwords, and vulnerabilities or misconfigurations that grant access to places where unencrypted credentials are all sources. With credentials in hand, attackers can move past security, including MFA (multifactor authentication), conduct reconnaissance, and give themselves whatever level of privilege they need to complete whatever their mission is. Or they simply persist and observe, learning all they can before finally acting against the ICS or SCADA system.

Zero-day attacks

Zero-day attacks got their name because they’re generally carried out against a previously existing yet unknown vulnerability; the vendor has zero days to fix it because the attack is already underway. Vulnerabilities that are completely unknown to either the software developer or the cybersecurity community exist throughout the software world, including in OT networks and their HMIs. Unsuspected and thus unpatched, they give fast-moving threat actors the opportunity to carry out a zero-day attack without resistance. The 2010 Stuxnet attack against Iran’s nuclear program used zero-day vulnerabilities in Windows to access the network and spread, eventually destroying the centrifuges. One thousand machines sustained physical damage.

Best practices for enhancing HMI security

Network segmentation for isolation

Network segmentation should be a core defense in securing industrial networks. Segmentation creates an environment that’s naturally resistant to intruders. Many of the attack techniques described above give attackers the ability to move laterally through the network. Segmenting the network prevents this lateral movement, limiting the attack radius and potential for damage. As OT networks become more connected to the world and the line between IT and OT continues to blur, network segmentation can segregate HMI systems from other parts of the network and the outside world. It can also segment defined zones within the OT network from each other so attacks can be contained.

Software and firmware updates

Software and firmware updates are recommended in all cybersecurity situations, but installing patches and updates in OT networks is easier said than done. OT networks prioritize continuous operations. There are compatibility issues, unpatchable legacy systems, and other roadblocks. The solution is virtual patching. Virtual patching is achieved by identifying all vulnerabilities within an OT network and applying a security mechanism such as a physical IPS (intrusion prevention system) or firewall. Rules are created, traffic is inspected and filtered, and attacks can be blocked and investigated.

Employee training on cybersecurity awareness

The more employees know about network operations, vulnerabilities, and cyberattack methods, the more they can do to help protect the network. Since few organizations have the internal staff to provide the necessary training, third-party training partners can be a viable solution. In any event, all employees should be trained in a company’s written policies, the general threat landscape, security best practices, how to handle physical assets like flash drives or laptops, how to recognize an attack, and what the company’s response protocol is. Specific training should be provided for employees who work remotely.

The evolving HMI security threat landscape

Concrete predictions about future threats and responses are hard to make, but the HMI security threat landscape will most likely evolve much the same way the entire security landscape will, with one major addition.

Air-gapped environments are going away

For a long time, many OT networks were air-gapped off from the world, physically and digitally isolated from the risks of contamination. Data and malware transfer alike required physical media, but inconvenience was safety. As OT networks continue to merge with the connected world, that kind of protection is going away. Remote work is becoming more prevalent, and the very connected IoT (Internet of Things) is now all over the automated factory floor. If wireless access points are left hanging from equipment, no one gives it a thought, except threat actors looking for a way in. (This is where basic employee training might help.)

Threat actors are innovators

Threat actors are becoming increasingly sophisticated. They devote much more time and thought to innovative ways to penetrate HMI and other OT network points than the people who operate them. AI and machine learning techniques are further empowering bad actors.

The statistics bear this out, especially as IT and OT networks continue to converge. In a study on 2023 OT/ICS cybersecurity activities, 76% of organizations were moving toward converged networks, and 97% reported IT security incidents also affected OT environments. Nearly half (47%) of businesses reported OT/ICS ransomware attacks, and 76% had significant concerns about state-sponsored actors.

On the positive side, however, pressure from regulators, insurance companies, and boards of directors is pushing organizations to think and act on cybersecurity for HMI points and throughout the network far more aggressively than many currently do. According to the study, 68% of organizations were increasing their budgets, 38% had dedicated OT security teams, and 77% had achieved a level-3 maturity in OT/ICS security.

Complete OT security

Cybersecurity in industrial environments presents challenges far different than those in IT networks. TXOne specializes in OT cybersecurity, with OT-native solutions designed for the equipment, environment, and day-to-day realities of industrial settings.

The post Enhancing HMI Security: How To Protect ICS Environments From Cyber Threats appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Earning Digital TrustNathalie Bijnens
    The internet of things (IoT) has been growing at a fast pace. In 2023, there were already double the number of internet-connected devices – 16 billion – than people on the planet. However, many of these devices are not properly secured. The high volume of insecure devices being deployed is presenting hackers with more opportunities than ever before. Governments around the world are realizing that additional security standards for IoT devices are needed to address the growing and important role o
     

Earning Digital Trust

9. Květen 2024 v 09:04

The internet of things (IoT) has been growing at a fast pace. In 2023, there were already double the number of internet-connected devices – 16 billion – than people on the planet. However, many of these devices are not properly secured. The high volume of insecure devices being deployed is presenting hackers with more opportunities than ever before. Governments around the world are realizing that additional security standards for IoT devices are needed to address the growing and important role of the billions of connected devices we rely on every day. The EU Cyber Resilience Act and the IoT Cybersecurity Improvement Act in the United States are driving improved security practices as well as an increased sense of urgency.

Digital trust is critical for the continued success of the IoT. This means that security, privacy, and reliability are becoming top concerns. IoT devices are always connected and can be deployed in any environment, which means that they can be attacked via the internet as well as physically in the field. Whether it is a remote attacker getting access to a baby monitor or camera inside your house, or someone physically tampering with sensors that are part of a critical infrastructure, IoT devices need to have proper security in place.

This is even more salient when one considers that each IoT device is part of a multi-party supply chain and is used in systems that contain many other devices. All these devices need to be trusted and communicate in a secure way to maintain the privacy of their data. It is critical to ensure that there are no backdoors left open by any link in the supply chain, or when devices are updated in the field. Any weak link exposes more than just the device in question to security breaches; it exposes its entire system – and the IoT itself – to attacks.

A foundation of trust starts in the hardware

To secure the IoT, each piece of silicon in the supply chain needs to be trusted. The best way to achieve this is by using a hardware-based root of trust (RoT) for every device. An RoT is typically defined as “the set of implicitly trusted functions that the rest of the system or device can use to ensure security.” The core of an RoT consists of an identity and cryptographic keys rooted in the hardware of a device. This establishes a unique, immutable, and unclonable identity to authenticate a device in the IoT network. It establishes the anchor point for the chain of trust, and powers critical system security use cases over the entire lifecycle of a device.

Protecting every device on the IoT with a hardware-based RoT can appear to be an unreachable goal. There are so many types of systems and devices and so many different semiconductor and device manufacturers, each with their own complex supply chain. Many of these chips and devices are high-volume/low-cost and therefore have strict constraints on additional manufacturing or supply chain costs for security. The PSA Certified 2023 Security Report indicates that 72% of tech decision makers are interested in the development of an industry-led set of guidelines to make reaching the goal of a secure IoT more attainable.

Security frameworks and certifications speed-up the process and build confidence

One important industry-led effort in standardizing IoT security that has been widely adopted is PSA Certified. PSA stands for Platform Security Architecture and PSA Certified is a global partnership addressing security challenges and uniting the technology ecosystem under a common security baseline, providing an easy-to consume and comprehensive methodology for the lab-validated assurance of device security. PSA Certified has been adopted by the full supply chain from silicon providers, software vendors, original equipment manufacturers (OEMs), IP providers, governments, content service providers (CSPs), insurance vendors and other third-party schemes. PSA Certified was the winner of the IoT Global Awards “Ecosystem of the year” in 2021.

PSA Certified lab-based evaluations (PSA Certified Level 2 and above) have a choice of evaluation methodologies, including the rigorous SESIP-based methodology (Security Evaluation Standard for IoT Platforms from GlobalPlatform), an optimized security evaluation methodology, designed for connected devices. PSA Certified recognizes that a myriad of different regulations and certification frameworks create an added layer of complexity for the silicon providers, OEMs, software vendors, developers, and service providers tasked with demonstrating the security capability of their products. The goal of the program is to provide a flexible and efficient security evaluation method needed to address the unique complexities and challenges of the evolving digital ecosystem and to drive consistency across device certification schemes to bring greater trust.

The PSA Certified framework recognizes the importance of a hardware RoT for every connected device. It currently provides incremental levels of certified assurance, ranging from a baseline Level 1 (application of best-practice security principles) to a more advanced Level 3 (validated protection against substantial hardware and software attacks).

PSA Certified RoT component

Among the certifications available, PSA Certified offers a PSA Certified RoT Component certification program, which targets separate RoT IP components, such as physical unclonable functions (PUFs), which use unclonable properties of silicon to create a robust trust (or security) anchor. As shown in figure 1, the PSA-RoT Certification includes three levels of security testing. These component-level certifications from PSA Certified validate specific security functional requirements (SFRs) provided by an RoT component and enable their reuse in a fast-track evaluation of a system integration using this component.

Fig. 1: PSA Certified establishes a chain of trust that begins with a PSA-RoT.

A proven RoT IP solution, now PSA Certified

Synopsys PUF IP is a secure key generation and storage solution that enables device manufacturers and designers to secure their products with internally generated unclonable identities and device-unique cryptographic keys. It uses the inherently random start-up values of SRAM as a physical unclonable function (PUF), which generates the entropy required for a strong hardware root of trust.

This root key created by Synopsys PUF IP is never stored, but rather recreated from the PUF upon each use, so there is never a key to be discovered by attackers. The root key is the basis for key management capabilities that enable each member of the supply chain to create its own secret keys, bound to the specific device, to protect their IP/communications without revealing these keys to any other member of the supply chain.

Synopsys PUF IP offers robust PUF-based physical security, with the following properties:

  • No secrets/keys at rest​ (no secrets stored in any memory)
    • prevents any attack on an unpowered device​
    • keys are only present when used​, limiting the window of opportunity for attacks
  • Hardware entropy source/root of trust​
    • no dependence on third parties​ (no key injection from outside)
    • no dependence on security of external components or other internal modules​
    • no dependence on software-based security​
  • Technology-independent, fully digital standard-logic CMOS IP
    • all fabs and technology nodes
    • small footprint
    • re-use in new platforms/deployments
  • Built-in error resilience​ due to advanced error-correction

The Synopsys PUF technology has been field-proven over more than a decade of deployment on over 750 million chips. And now, the Synopsys PUF has achieved the milestone of becoming the world’s first IP solution to be awarded “PSA Certified Level 3 RoT Component.” This certifies that the IP includes substantial protection against both software and hardware attacks (including side-channel and fault injection attacks) and is qualified as a trusted component in a system that requires PSA Level 3 certification.

Fault detection and other countermeasures

In addition to its PUF-related protection against physical attacks, all Synopsys PUF IP products have several built-in physical countermeasures. These include both systemic security features (such as data format validation, data authentication, key use restrictions, built in self-tests (BIST), and heath checks) as well as more specific countermeasures (such as data masking and dummy cycles) that protect against specific attacks.

The PSA Certified Synopsys PUF IP goes even one step further. It validates all inputs through integrity checks and error detection. It continuously asserts that everything runs as intended, flags any observed faults, and ensures security. Additionally, the PSA Certified Synopsys PUF IP provides hardware and software handholds to the user which assist in checking that all data is correctly transferred into and out of the PUF IP. The Synopsys PUF IP driver also supports fault detection and reporting.

Advantages of PUFs over traditional key injection and storage methods

For end-product developers, PUF IP has many advantages over traditional approaches for key management. These traditional approaches typically require key injection (provisioning secret keys into a device) and some form of non-volatile memory (NVM), such as embedded Flash memory or one-time programmable storage (OTP), where the programmed key is stored and where it needs to be protected from being extracted, overwritten, or changed. Unlike these traditional key injection solutions, Synopsys PUF IP does not require sensitive key handling by third parties, since PUF-based keys are created within the device itself. In addition, Synopsys PUF IP offers more flexibility than traditional solutions, as a virtually unlimited number of PUF-based keys can be created. And keys protected by the PUF can be added at any time in the lifecycle rather than only during manufacturing.

In terms of key storage, Synopsys PUF IP offers higher protection against physical attacks than storing keys in some form of NVM. PUF-based root keys are not stored on the device, but they are reconstructed upon each use, so there is nothing for attackers to find on the chip. Instead of storing keys in NVM, Synopsys PUF IP stores only (non-sensitive) helper data and encrypted keys in NVM on- or off-chip. The traditional approach of storing keys on the device in NVM is more vulnerable to physical attacks.

Finally, Synopsys PUF IP provides more portability. Since the Synopsys PUF IP is based on standard SRAM memory cells, it offers a process- and fab agnostic solution for key storage that scales to the most advanced technology nodes.

Conclusion

The large and steady increase in devices connected to the IoT also increases the need for digital trust and privacy. This requires flexible and efficient IoT security solutions that are standardized to streamline implementation and certification across the multiple players involved in the creation and deployment of IoT devices. The PSA Certified framework offers an easy-to-consume and comprehensive methodology for the lab-validated assurance of device security.

Synopsys PUF IP, which has been deployed in over 750 million chips, is the first-ever IP solution to be awarded “PSA Certified Level 3 RoT Component.” This certifies that the IP includes substantial protection against hardware and software attacks. Synopsys PUF IP offers IoT device makers a robust PUF-based security anchor with trusted industry-standard certification and offers the perfect balance between strong security, high flexibility, and low cost.

The post Earning Digital Trust appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Optimize Power For RF/μW Hybrid And Digital Phased ArraysKimia Azad
    Field-programmable gate arrays (FPGAs) are a critical component of both digital and hybrid phased array technology. Powering FPGAs for aerospace and defense applications comes with its own set of challenges, especially because these applications require higher reliability than many industrial or consumer technologies. This blog post will provide a brief history on beamforming and beam-steering technologies for defense applications, a guide on powering defense and space FPGAs, and a reflection on
     

Optimize Power For RF/μW Hybrid And Digital Phased Arrays

9. Květen 2024 v 09:03

Field-programmable gate arrays (FPGAs) are a critical component of both digital and hybrid phased array technology. Powering FPGAs for aerospace and defense applications comes with its own set of challenges, especially because these applications require higher reliability than many industrial or consumer technologies.

This blog post will provide a brief history on beamforming and beam-steering technologies for defense applications, a guide on powering defense and space FPGAs, and a reflection on the future of A&D communications systems.

Background

Active electronically scanned array (AESA) and phased array systems are more affordable and available than ever, namely in full digital and hybrid configuration. These systems can cover a wide RF/uW frequency spectrum and are suitable for use in RADAR and other military communications systems. AESA and phased array systems also boast advanced capabilities in beamforming and beam-steering technologies.

The emergence of 5G communications and commercial space data communications, coupled with advancements in semiconductor efficiency, has enabled companies to deploy innovative phased-array designs. However, roadblocks in efficiently powering these systems given watts consumed and dissipated, plus size and geometry constraints, can complicate development for designers.

Recent developments

Despite digital phased array technologies providing improved performance across a large part of the RF/uW frequency spectrum, their deployment is hindered by various factors. Cost barriers, power consumption, thermal constraints, latency concerns, and efficiency losses in the amplification and gain stages are some key hurdles.

Hybrid phased array is an ideal solution for meeting system level requirements without compromising on cost or power losses. This technology allows for less power consumption and reduced thermal concerns, paving the way for cost-saving solutions. The “digitizers” of hybrid phased array, such as FPGAs, are fewer and farther away from the antenna.

Powering FPGAs

FPGAs are valuable in their ability to perform extremely fast calculations in support of signal isolations, Fast Fourier Transformers (FFTs), I/Q data extractions, and in functions that form and steer radio-frequency beams. However, a key challenge in working with FPGAs lies in the necessity for consistent, sequenced power delivery.

Conclusion and future considerations

Complexities and challenges associated with deploying digital assets should not overlook industry advancements in powering FPGAs within phased array systems. With a focused approach, thermal management challenges can be mitigated, and performance optimized. Existing power solutions provide a strong and mature foundation for the on-going development and deployment of next-generation phased array systems within military and space applications. For a full guide on FPGA power considerations in aerospace and defense applications, take a look at our feature in Military Embedded Systems magazine.

The post Optimize Power For RF/μW Hybrid And Digital Phased Arrays appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • UCIe And Automotive Electronics: Pioneering The Chiplet RevolutionVinod Khera
    The automotive industry stands at the brink of a profound transformation fueled by the relentless march of technological innovation. Gone are the days of the traditional, one-size-fits-all system-on-chip (SoC) design framework. Today, we are witnessing a paradigm shift towards a more modular approach that utilizes diverse chiplets, each optimized for specific functionalities. This evolution promises to enhance the automotive system’s flexibility and efficiency and revolutionize how vehicles are
     

UCIe And Automotive Electronics: Pioneering The Chiplet Revolution

9. Květen 2024 v 09:02

The automotive industry stands at the brink of a profound transformation fueled by the relentless march of technological innovation. Gone are the days of the traditional, one-size-fits-all system-on-chip (SoC) design framework. Today, we are witnessing a paradigm shift towards a more modular approach that utilizes diverse chiplets, each optimized for specific functionalities. This evolution promises to enhance the automotive system’s flexibility and efficiency and revolutionize how vehicles are designed, built, and operated.

At the heart of this transformation lies the Universal Chiplet Interconnect Express (UCIe), a groundbreaking standard introduced in March 2022. UCIe is designed to drastically simplify the integration process across different chiplets from various manufacturers by standardizing die-to-die connections. This initiative caters to a critical need within the industry for a modular and scalable semiconductor architecture, thus setting the stage for unparalleled innovation in automotive electronics.

Understanding the core benefit of UCIe

UCIe isn’t merely about facilitating smoother communication between chiplets; it’s a visionary standard that ensures interoperability, reduces design complexity and costs, and, crucially, supports the seamless incorporation of chiplets into cohesive packages. Developed through the collaborative efforts of the UCIe consortium, this standard enables the use of chiplets from different vendors, thereby fostering a highly customizable and scalable solution. This mainly benefits the automotive sector, where high performance, reliability, and efficiency demand is paramount.

The pivotal role of UCIe in automotive electronics

Implementing UCIe within automotive electronics unlocks a plethora of advantages. Foremost among these is the ability to design more compact, powerful, and energy-efficient electronic systems. UCIe offers savings in energy per bit compared to other serial or parallel interfaces.

Given the increasing complexity and functionality of modern vehicles — which can be likened to data centers on wheels — the modular design principle of UCIe is invaluable. It facilitates the addition of new functionalities and ensures that automotive electronics can seamlessly adapt to and incorporate emerging technologies and standards.

Secondly, the adoption of UCIe marks a significant stride toward sustainable electronic design within the automotive industry. By promoting the reuse and integration of chiplets across different platforms and vehicle models, UCIe significantly mitigates electronic waste and streamlines the lifecycle management of electronic components. This benefits manufacturers regarding cost and efficiency and aligns with broader industry trends focused on sustainability and environmental stewardship.

Offerings and solutions

Cadence offers a range of Intellectual Properties (IPs), Electronic Design Automation (EDA) tools, and 3D Integrated Circuits (ICs) tailored for chiplets.

Since the beginning of UCIe, Cadence has engaged with over 100 customers and enables automotive solutions with UCIe and its working groups. The implementation features for automotive protect the mainband, and Cadence also adds protection for the sideband, particularly for the automotive industry. UCIe is being developed across multiple foundries, process nodes, and standard and advanced packaging enablement types.

Cadence has a history of chiplet experience with an interface IP portfolio, including our chiplet die-to-die interconnects, including the UCIe IP, and proprietary 40G UltraLink D2D PHY that enables SoC providers to deliver more customized solutions that offer higher performance and yields while also shortening development cycles and reducing costs through greater IP reuse. Cadence’s UCIe PHY solutions facilitate high-speed communication and interoperability among chiplets. Additionally, the Cadence Simulation VIP for UCIe ensures that systems using UCIe are reliable and performant, addressing one of the industry’s key concerns.

The Cadence UCIe PHY is a high-bandwidth, low-power, and low-latency die-to-die solution that enables multi-die system in package integration for high-performance compute, AI/ML, 5G, automotive and networking applications. The UCIe physical layer includes link initialization, training, power management states, lane mapping, lane reversal, and scrambling. The UCIe controller includes the die-to-die adapter layer and the protocol layer. The adapter layer ensures reliable transfer through link state management, protocol, and flit formats parameter negotiation. The UCIe architecture supports multiple standard protocols such as PCIe, CXL and streaming raw mode.

For interface IP, we usually create various test chips using different process technologies and measure the silicon in the lab to prove that our controller and PHY IPs fully comply with the standard. Hence, we designed a test chip with seven chiplets connected via UCIe over different interconnect distances. To learn more, click here.

The Cadence Verification IP (VIP) for Universal Chiplet Interconnect Express (UCIe) is designed for easy integration in test benches at the IP, system-on-chip (SoC), and system level. The VIP for UCIe runs on all simulators and supports SystemVerilog and the widely adopted Universal Verification Methodology (UVM). This enables verification teams to reduce the time spent on environment development and redirect it to cover a larger verification space, accelerate verification closure, and ensure end-product quality. With a layered architecture and powerful callback mechanism, verification engineers can verify UCIe features at each functional layer (PHY, D2D, Protocol) and create highly targeted designs while using the latest design methodologies for random testing to cover a larger verification space. The VIP for UCIe can be used as a standalone stack or layered with PCIe VIP.

Challenges and future prospects

The transition to chiplet-based designs and the widespread adoption of the UCIe standard are not without their challenges. Critical concerns include functional safety, reliability, quality, security, thermal management, and mechanical stress. Hence, ensuring dependable high-speed interconnects between chiplets necessitates ongoing research and development. Successful implementation also hinges on industry-wide collaboration and establishing a robust chiplet ecosystem that supports the UCIe standard.

UCIe stands as a pioneering innovation poised to redefine automotive electronics. By offering a scalable and flexible framework, UCIe promises to enhance in-vehicle electronic systems’ performance, efficiency, and versatility and marks a significant milestone in the automotive industry’s evolution. As we look to the future, the impact of UCIe in the automotive sector is poised to be profound, turning vehicles into dynamic platforms that continuously adapt to technological advancements and user needs. The realization of modular, customizable designs underscored by efficiency, performance, and innovation heralds the dawn of a new era in semiconductor development, one where the possibilities are as boundless as the technological horizon.

The post UCIe And Automotive Electronics: Pioneering The Chiplet Revolution appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Securing AI In The Data CenterBart Stevens
    AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and infer
     

Securing AI In The Data Center

9. Květen 2024 v 09:07

AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and inference deployment.

High-value datasets containing sensitive information, such as personal health records or financial transactions, must be shielded from unauthorized access. Robust encryption mechanisms, such as Advanced Encryption Standard (AES), coupled with secure key management practices, form the foundation of data confidentiality in the data center. The encryption key used must be unique and used in a secure environment. Encryption and decryption operations of data are constantly occurring and must be performed to prevent key leakage. Should a compromise arise, it should be possible to renew the key securely and re-encrypt data with the new key.

The encryption key used must also be securely stored in a location that unauthorized processes or individuals cannot access. The keys used must be protected from attempts to read them from the device or attempts to steal them using side-channel techniques such as SCA (Side-Channel Attacks) or FIA (Fault Injection Attacks). The multitenancy aspect of modern data centers calls for robust SCA protection of key data.

Hardware-level security plays a pivotal role in safeguarding AI within the data center, offering built-in protections against a wide range of threats. Trusted Platform Modules (TPMs), secure enclaves, and Hardware Security Modules (HSMs) provide secure storage and processing environments for sensitive data and cryptographic keys, shielding them from unauthorized access or tampering. By leveraging hardware-based security features, organizations can enhance the resilience of their AI infrastructure and mitigate the risk of attacks targeting software vulnerabilities.

Ideally, secure cryptographic processing is handled by a Root of Trust core. The AI service provider manages the Root of Trust firmware, but it can also load secure applications that customers can write to implement their own cryptographic key management and storage applications. The Root of Trust can be integrated in the host CPU that orchestrates the AI operations, decrypting the AI model and its specific parameters before those are fed to AI or network accelerators (GPUs or NPUs). It can also be directly integrated with the GPUs and NPUs to perform encryption/decryption at that level. These GPUs and NPUs may also select to store AI workloads and inference models in encrypted form in their local memory banks and decrypt the data on the fly when access is required. Dedicated on-the-fly, low latency in-line memory decryption engines based on the AES-XTS algorithm can keep up with the memory bandwidth, ensuring that the process is not slowed down.

AI training workloads are often distributed among dozens of devices connected via PCIe or high-speed networking technology such as 800G Ethernet. An efficient confidentiality and integrity protocol such as MACsec using the AES-GCM algorithm can protect the data in motion over high-speed Ethernet links. AES-GCM engines integrated with the server SoC and the PCIe acceleration boards ensure that traffic is authenticated and optionally encrypted.

Rambus offers a broad portfolio of security IP covering the key security elements needed to protect AI in the data center. Rambus Root of Trust IP cores ensure a secure boot protocol that protects the integrity of its firmware. This can be combined with Rambus inline memory encryption engines, as well as dedicated solutions for MACsec up to 800G.

Resources

The post Securing AI In The Data Center appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Overcoming Chiplet Integration Challenges With AdaptabilityJayson Bethurem
    Chiplets are exploding in popularity due to key benefits such as lower cost, lower power, higher performance and greater flexibility to meet specific market requirements. More importantly, chiplets can reduce time-to-market, thus decreasing time-to-revenue! Heterogeneous and modular SoC design can accelerate innovation and adaptation for many companies. What’s not to like about chiplets? Well, as chiplets come to fruition, we are starting to realize many of the complications of chiplet designs.
     

Overcoming Chiplet Integration Challenges With Adaptability

9. Květen 2024 v 09:06

Chiplets are exploding in popularity due to key benefits such as lower cost, lower power, higher performance and greater flexibility to meet specific market requirements. More importantly, chiplets can reduce time-to-market, thus decreasing time-to-revenue! Heterogeneous and modular SoC design can accelerate innovation and adaptation for many companies. What’s not to like about chiplets? Well, as chiplets come to fruition, we are starting to realize many of the complications of chiplet designs.

Fig. 1: Expanded chiplet system view.

The interface challenge

The primary concept of chiplets is integration of ICs (Integrated Circuits) from multiple companies. However, many of these did not fully consider interoperation with other ICs. That’s partially based on a lack of interconnect standards around chiplets. Moreover, ICs have their own computational and bandwidth requirements. This is further complicated as competing interfaces standards vie for adoption as shown in table 1.

Standard d Throughput Density Max Delay
Advanced Interface Bus (Intel, AIB) 2 Gbps 504 Gbps/mm 5 ns
Bandwidth Engine 10.3 Gbps N/A 2.4 ns
BoW (Bunch of Wires) 16 Gbps 1280 Gbps/mm 5 ns
HBM3 (JEDEC) 4.8 Gbps N/A N/A
Infinity Fabric (AMD) 10.6 Gbps N/A 9 ns
Lipincon (TSMC) 2.8 Gbps 536 Gbps/mm 14ns
Multi-Die I/O (Intel) 5.4 Gbps 1600 Gbps/mm N/A
XSR/USR (Rambus) 112 Gbps N/A N/A
UCIe 32 Gbps 1350 Gbps/mm 2 ns

Table 1: Chiplet interconnect options.

Most chiplet interconnects are dominated by UCIe (Universal Chiplet Interconnect) and the unimaginatively named BoW (Bunch of Wires). UCIe introduced the 1.0 spec, and as with any first edition of a specification, it is inevitable updates will follow. UCIe 1.1 fixes several holes and gaps in 1.0. It addresses gray areas, missing definitions, ECNs and more. And it is very likely not the last update, as UCIe’s vision is to grow up the stack – adding additional protocol layers on top of the system layers.

Because of newness and expected evolution of UCIe and BoW protocols, designing them in is risky. Additionally, there will always be a place for multiple die-to-die interfaces, beyond UCIe. Specific use cases and designs will inherently be matched to different metrics leading for many designs to fall back to proprietary interfaces.

As you can see, there are many choices, and many of these have tradeoffs. Integration into a chiplet with a variety of these protocols would greatly benefit from adaptability via data/protocol adaptation that can easily be enabled with embedded programmable logic, or eFPGA. A lightweight protocol shim implemented in eFPGA IP can not only reformat data but also buffer data to maximize internal processing. Finally, consider that data between ICs in a chiplet can be globally asynchronous – another easy task resolved with eFPGA IP with FIFO synchronizers.

The security challenge

Beyond the interfaces, security is another emerging challenge. A few factors of chiplets must be cautiously considered:

  • Varying ICs from unknown and possibly unreputable manufacturers
  • IC can contain internal IP from additional third-party sources
  • Each IC may receive and introduce external data into the system

Naturally, this begs for attestation and provenance to ensure vendor confidence. As such, root of trust generally starts with the supply chain and auditing all vendors. However, it only takes one failed component, the least secure component, to jeopardize the entire system.

Root of trust suddenly becomes an issue and uncovers another issue. Which IC, or ICs, in the chiplet manage root of trust? As we’ve seen time and time again, security threats evolve at an alarming rate. But chiplets have an opportunity here. Again, embedded FPGAs have the flexible nature to adapt, thus thwarting these evolving security threats. eFPGA IP can also physically disable unused interfaces – minimizing surface attack vectors.

Adaptable cryptography cores can perform a variety of tasks with high performance in eFPGA IP. These tasks include authentication/digital signing, key generation, encapsulation/decapsulation, random number generation and much more. Further, post-quantum security cores that run very efficiently on eFPGA are becoming available. Figure 2 shows a ML Kyber Encapsulation Module from Xiphera that fits into only four Flex Logix EFLX tiles, efficiently packed at 98% utilization with a throughput of over 2 Gbps.

Fig. 2: ML-KEM IP core from Xiphera implemented on Flex Logix EFLX eFPGA IP.

Managing all data communication within a chiplet seems daunting; however, it is feasible. Designers have the choice of implementing eFPGA on every IC in the chiplet for adaptable data signage. Or standalone on the interposer, where system designers can define a secure enclave in which all data is authenticated and encrypted by an independent IC with eFPGA. eFPGA can also process streaming data at a very high rate. And in most cases can keep up with line rate, as seen with programmable data planes in SmartNICs.

eFPGA can add another critical security benefit. Every instance of eFPGA in the chiplet offers the ability to obfuscate critical algorithms, cryptography and protocols. This enables manufacturers to protect design secrets by not only programming these features in a controlled environment, but also adapting these as threats evolve.

The validation problem

Again, the absence of fully defined industry standards presents integration challenges. Conventional methods of qualification, testing, and validation become increasingly more complex. Yet this becomes another opportunity for eFPGA IP. It can be configured as an in-system diagnostic tool that provides testing, debugging and observability. Not only during IC bring up, but also during run time – eliminating finger pointing between independent companies.

The reconfigurability solution

While we’ve discussed a few different chiplet issues and solutions with adaptable eFPGA, it is important to realize that a singular instance of this IP can perform all these functions in a chiplet, as eFPGA IP is completely reconfigurable. It can be time-sliced and uniquely configured differently during specific operational phases of the chiplet. As mentioned in the examples above, during IC bring up it can provide insightful debug visibility into the system. During boot, it can enable secure boot and attested firmware updates to all ICs in the chiplet. During run time, it can perform cryptographic functions as well independently manage a secure enclave environment. eFPGA is also perfect for any other software acceleration your applications need, as its heavily parallel and pipelined nature is perfect for complex signal processing tasks. Lastly, during an RMA process it can also investigate and determine system failures. This is just a short list of the features eFPGA IP can enable in a chiplet.

Customizable for the perfect solution

Flex Logix EFLX IP delivers excellent PPA (Power, Performance and Area) and is available on the most advanced nodes, including Intel 18A and TSMC 7nm and 5nm. Furthermore, Flex Logix eFPGA IP is scalable – enabling you to choose the best balance of programmable logic, embedded memory and signal processing resources.

Fig. 3: Scalable Flex Logix eFPGA IP.

Want to learn more about Flex Logix IP? Contact us at info@flex-logix.com or visit our website https://flex-logix.com.

The post Overcoming Chiplet Integration Challenges With Adaptability appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Enhancing HMI Security: How To Protect ICS Environments From Cyber ThreatsJim Montgomery
    HMIs (Human Machine Interfaces) can be broadly defined as just about anything that allows humans to interface with their machines, and so are found throughout the technical world. In OT environments, operators use various HMIs to interact with industrial control systems in order to direct and monitor the operational systems. And wherever humans and machines intersect, security problems can ensue. Protecting HMI in cybersecurity plans, particularly in OT/ICS environments, can be a challenge, as H
     

Enhancing HMI Security: How To Protect ICS Environments From Cyber Threats

9. Květen 2024 v 09:05

HMIs (Human Machine Interfaces) can be broadly defined as just about anything that allows humans to interface with their machines, and so are found throughout the technical world. In OT environments, operators use various HMIs to interact with industrial control systems in order to direct and monitor the operational systems. And wherever humans and machines intersect, security problems can ensue.

Protecting HMI in cybersecurity plans, particularly in OT/ICS environments, can be a challenge, as HMIs offer a variety of vulnerabilities that threat actors can exploit to achieve any number of goals, from extortion to sabotage.

Consider the sort of OT environments HMIs are found in, including water and power utilities, manufacturing facilities, chemical production, oil and gas infrastructure, smart buildings, hospitals, and more. The HMIs in these environments offer bad actors a range of attack vectors through which they can enter and begin to wreak havoc, either financial, physical, or both.

What’s the relationship between HMI and SCADA?

SCADA (supervisory control and data acquisition) systems are used to acquire and analyze data and control industrial systems. Because of the role SCADA plays in these settings — generally overseeing the control of hugely complex, expensive, and dangerous-if-misused industrial equipment, processes, and facilities — they are extremely attractive to threat actors.

Unfortunately, the HMIs that operators use to interface with these systems may contain a number of vulnerabilities that are among the most highly exploitable and frequently breached vectors for attacks against SCADA systems.

Once an attacker gains access, they can seize from operators the ability to control the system. They can cause machinery to malfunction and suffer irreparable damage; they can taint products, steal information, and extort ransom. Even beyond ransom demands, the cost of production stoppages, lost sales, equipment replacement, and reputational damage can swallow some companies and create shortages in the market. Attacks can also cause equipment to perform in ways that threaten human life and safety.

Three types of HMIs in ICS that are vulnerable to attack

HMI security has to account for a range of “vulnerability options” available for exploitation by bad actors, such as keyboards, touch screens, and tablets, as well as more sophisticated interface points. Among the more frequently attacked are the Graphical User Interface and mobile and remote access.

Graphical User Interface

Attackers can use the Graphical User Interface or GUI to gain complete access to the system and manipulate it at will. They can often gain access by exploiting misconfigured access controls or bugs and other vulnerabilities that exist in a lot of software, including GUI software. If the system is web- or network-connected, their work is easier, especially if introducing malware is a goal. Once in, they can also move laterally, exploring or compromising interconnected systems and widening the attack.

Mobile and remote access

Even before COVID-19, mobile and remote access techniques were already being incorporated into managing a growing number of OT networks. When the pandemic hit hard, remote access often became a necessity. As the crisis faded, however, mobile and remote access became even more entrenched.

Remote access points are especially vulnerable. For one, remote access software can contain its own security vulnerabilities, like unpatched flaws and bugs or misconfigurations. Attackers may find openings in VPNs (virtual private networks) or RDP (remote desktop protocol) and use these holes to slip past security measures and carry out their mission.

Access controls

Attackers can compromise access control mechanisms to acquire the same permissions and privileges as authorized users, and once they gain access, they can do pretty much anything they want regarding system operations and data access. Access can be gained in many of the usual ways, such as an outdated VPN or stolen or purchased credentials. (Stolen or other credentials are readily available through online markets.)

The initial attack may just be a toe in the network while reconnaissance for holes in the access control system is conducted. Weak passwords, unnecessary access rights, and the usual misconfigurations and software vulnerabilities are all an attacker needs. As further walls are breached, attackers can then escalate their level of privilege to do whatever a legitimate user can do.

Understanding attack techniques in ICS HMI cybersecurity

Code injection

When attackers insert or inject malicious code into a software program or system, that’s code injection, and it can give the attacker access to core system functions. The resulting mayhem can include manipulation of control software, leading to shutdowns, equipment damage, and dangerous, even life-threatening situations if system changes result in hazardous chemical releases, changed formulas, explosions, or the misbehavior of large, heavy machinery. Code injections can corrupt, delete, or steal data and may result in compliance failure and fines in certain situations.

Malware virus infection

Malware can enter a network through various access points in addition to HMIs, even ones no one would ever expect, such as manufacturer-provided software updates or factory-fresh physical assets added to the production environment. A technician connecting a laptop or an employee plugging in a flash drive without knowing it’s infected will work just as well. As the walls between IT and OT thin, that attack surface widens as well. Once in the network, the attacker can escalate privileges, look around a bit, and see what’s worth doing or stealing. When enough has been learned, the attacker executes the malicious code, which can include ransomware or spyware. As in other attacks, operations can be interfered with, sometimes dangerously so.

Data tampering

Data tampering simply means that data is altered without authorization, including data used to operate, control, and monitor industrial systems. Attackers gain access through vulnerabilities in the system software or HMI devices or through passageways between IT and OT. Once in, they can explore the system to give themselves even greater access to more sensitive areas, where they can steal valuable and confidential system data, interrupt operations, compromise equipment, and damage the company’s business interests and competitive advantage.

Memory corruption

Memory corruption can happen in any computer network and may not represent anything nefarious. Yet memory corruption has also been used as an attack technique that can be deployed against OT networks and is thus potentially extremely damaging since data controls machinery, processes, formulas, and other essential functions. Attackers find software vulnerabilities in HMI or other access points through which the memory of an application or system can be reached and corrupted. This can lead to crashes, data leakage, denial of services (DoS), and even attacker takeovers of ICS and SCADA systems.

Spear phishing

Spear phishing attacks are generally launched against IT networks, which can then be used to open a corridor to the OT network. Spear phishing is basically a more targeted version of phishing attacks, in which an attacker will impersonate a legitimate, trusted source via email or web page, for example. In 2014, attackers targeted a German steel mill with an email suspected of carrying malicious code. They then used access to the business network to get to the SCADA/ICS network, where they modified the PLCs (programmable logic controllers) and took over the furnace’s operations. The physical damage they inflicted forced the plant to shut down.

DoS and DDoS attacks

Denial of Service (DoS) and Distributed Denial of Service (DDoS) work by overwhelming HMI points with excessive traffic or requests so they are unable to handle authorized control and monitoring functions. In 2016, some particularly vicious malware dubbed Industroyer (also Crashoveride) was deployed in an attack against Ukraine’s power grid and blacked out a substantial section of Kyiv. Industroyer was developed specifically to attack ICS and SCADA systems. The multipronged attack began by exploiting vulnerabilities in digital substation relays. A timer regulating the attack executed a distributed denial-of-service (DDoS) attack on every protection relay on the network that used any of four specific communication protocols. Simultaneously, it deleted all MicroSCADA-related files from the workstations’ hard drives. As the relays stopped functioning, lights went out across the city.

Exploiting remote access

The growing use of remote access to HMI systems during and after COVID-19 has provided threat actors with a wealth of newly available attack vectors. Less-than-airtight remote access security protocols make them very enticing for ICS-specific malware. HAVEX malware, for example, uses a remote access trojan (RAT) downloaded from OT vendor websites. The RAT can then scan for devices on the ports commonly used OT assets, collect information, and send it back to the attacker’s command and control server. A long-term attack used just such a method to gain remote access to energy networks in the U.S. and internationally, during which data thieves collected and “exfiltrated” (stole) enterprise and ICS-related data.

Credential theft

Obtaining unauthorized credentials is not all that difficult these days, with a robust online marketplace making it easier than ever. Phishing and spear phishing, malware, weak passwords, and vulnerabilities or misconfigurations that grant access to places where unencrypted credentials are all sources. With credentials in hand, attackers can move past security, including MFA (multifactor authentication), conduct reconnaissance, and give themselves whatever level of privilege they need to complete whatever their mission is. Or they simply persist and observe, learning all they can before finally acting against the ICS or SCADA system.

Zero-day attacks

Zero-day attacks got their name because they’re generally carried out against a previously existing yet unknown vulnerability; the vendor has zero days to fix it because the attack is already underway. Vulnerabilities that are completely unknown to either the software developer or the cybersecurity community exist throughout the software world, including in OT networks and their HMIs. Unsuspected and thus unpatched, they give fast-moving threat actors the opportunity to carry out a zero-day attack without resistance. The 2010 Stuxnet attack against Iran’s nuclear program used zero-day vulnerabilities in Windows to access the network and spread, eventually destroying the centrifuges. One thousand machines sustained physical damage.

Best practices for enhancing HMI security

Network segmentation for isolation

Network segmentation should be a core defense in securing industrial networks. Segmentation creates an environment that’s naturally resistant to intruders. Many of the attack techniques described above give attackers the ability to move laterally through the network. Segmenting the network prevents this lateral movement, limiting the attack radius and potential for damage. As OT networks become more connected to the world and the line between IT and OT continues to blur, network segmentation can segregate HMI systems from other parts of the network and the outside world. It can also segment defined zones within the OT network from each other so attacks can be contained.

Software and firmware updates

Software and firmware updates are recommended in all cybersecurity situations, but installing patches and updates in OT networks is easier said than done. OT networks prioritize continuous operations. There are compatibility issues, unpatchable legacy systems, and other roadblocks. The solution is virtual patching. Virtual patching is achieved by identifying all vulnerabilities within an OT network and applying a security mechanism such as a physical IPS (intrusion prevention system) or firewall. Rules are created, traffic is inspected and filtered, and attacks can be blocked and investigated.

Employee training on cybersecurity awareness

The more employees know about network operations, vulnerabilities, and cyberattack methods, the more they can do to help protect the network. Since few organizations have the internal staff to provide the necessary training, third-party training partners can be a viable solution. In any event, all employees should be trained in a company’s written policies, the general threat landscape, security best practices, how to handle physical assets like flash drives or laptops, how to recognize an attack, and what the company’s response protocol is. Specific training should be provided for employees who work remotely.

The evolving HMI security threat landscape

Concrete predictions about future threats and responses are hard to make, but the HMI security threat landscape will most likely evolve much the same way the entire security landscape will, with one major addition.

Air-gapped environments are going away

For a long time, many OT networks were air-gapped off from the world, physically and digitally isolated from the risks of contamination. Data and malware transfer alike required physical media, but inconvenience was safety. As OT networks continue to merge with the connected world, that kind of protection is going away. Remote work is becoming more prevalent, and the very connected IoT (Internet of Things) is now all over the automated factory floor. If wireless access points are left hanging from equipment, no one gives it a thought, except threat actors looking for a way in. (This is where basic employee training might help.)

Threat actors are innovators

Threat actors are becoming increasingly sophisticated. They devote much more time and thought to innovative ways to penetrate HMI and other OT network points than the people who operate them. AI and machine learning techniques are further empowering bad actors.

The statistics bear this out, especially as IT and OT networks continue to converge. In a study on 2023 OT/ICS cybersecurity activities, 76% of organizations were moving toward converged networks, and 97% reported IT security incidents also affected OT environments. Nearly half (47%) of businesses reported OT/ICS ransomware attacks, and 76% had significant concerns about state-sponsored actors.

On the positive side, however, pressure from regulators, insurance companies, and boards of directors is pushing organizations to think and act on cybersecurity for HMI points and throughout the network far more aggressively than many currently do. According to the study, 68% of organizations were increasing their budgets, 38% had dedicated OT security teams, and 77% had achieved a level-3 maturity in OT/ICS security.

Complete OT security

Cybersecurity in industrial environments presents challenges far different than those in IT networks. TXOne specializes in OT cybersecurity, with OT-native solutions designed for the equipment, environment, and day-to-day realities of industrial settings.

The post Enhancing HMI Security: How To Protect ICS Environments From Cyber Threats appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Earning Digital TrustNathalie Bijnens
    The internet of things (IoT) has been growing at a fast pace. In 2023, there were already double the number of internet-connected devices – 16 billion – than people on the planet. However, many of these devices are not properly secured. The high volume of insecure devices being deployed is presenting hackers with more opportunities than ever before. Governments around the world are realizing that additional security standards for IoT devices are needed to address the growing and important role o
     

Earning Digital Trust

9. Květen 2024 v 09:04

The internet of things (IoT) has been growing at a fast pace. In 2023, there were already double the number of internet-connected devices – 16 billion – than people on the planet. However, many of these devices are not properly secured. The high volume of insecure devices being deployed is presenting hackers with more opportunities than ever before. Governments around the world are realizing that additional security standards for IoT devices are needed to address the growing and important role of the billions of connected devices we rely on every day. The EU Cyber Resilience Act and the IoT Cybersecurity Improvement Act in the United States are driving improved security practices as well as an increased sense of urgency.

Digital trust is critical for the continued success of the IoT. This means that security, privacy, and reliability are becoming top concerns. IoT devices are always connected and can be deployed in any environment, which means that they can be attacked via the internet as well as physically in the field. Whether it is a remote attacker getting access to a baby monitor or camera inside your house, or someone physically tampering with sensors that are part of a critical infrastructure, IoT devices need to have proper security in place.

This is even more salient when one considers that each IoT device is part of a multi-party supply chain and is used in systems that contain many other devices. All these devices need to be trusted and communicate in a secure way to maintain the privacy of their data. It is critical to ensure that there are no backdoors left open by any link in the supply chain, or when devices are updated in the field. Any weak link exposes more than just the device in question to security breaches; it exposes its entire system – and the IoT itself – to attacks.

A foundation of trust starts in the hardware

To secure the IoT, each piece of silicon in the supply chain needs to be trusted. The best way to achieve this is by using a hardware-based root of trust (RoT) for every device. An RoT is typically defined as “the set of implicitly trusted functions that the rest of the system or device can use to ensure security.” The core of an RoT consists of an identity and cryptographic keys rooted in the hardware of a device. This establishes a unique, immutable, and unclonable identity to authenticate a device in the IoT network. It establishes the anchor point for the chain of trust, and powers critical system security use cases over the entire lifecycle of a device.

Protecting every device on the IoT with a hardware-based RoT can appear to be an unreachable goal. There are so many types of systems and devices and so many different semiconductor and device manufacturers, each with their own complex supply chain. Many of these chips and devices are high-volume/low-cost and therefore have strict constraints on additional manufacturing or supply chain costs for security. The PSA Certified 2023 Security Report indicates that 72% of tech decision makers are interested in the development of an industry-led set of guidelines to make reaching the goal of a secure IoT more attainable.

Security frameworks and certifications speed-up the process and build confidence

One important industry-led effort in standardizing IoT security that has been widely adopted is PSA Certified. PSA stands for Platform Security Architecture and PSA Certified is a global partnership addressing security challenges and uniting the technology ecosystem under a common security baseline, providing an easy-to consume and comprehensive methodology for the lab-validated assurance of device security. PSA Certified has been adopted by the full supply chain from silicon providers, software vendors, original equipment manufacturers (OEMs), IP providers, governments, content service providers (CSPs), insurance vendors and other third-party schemes. PSA Certified was the winner of the IoT Global Awards “Ecosystem of the year” in 2021.

PSA Certified lab-based evaluations (PSA Certified Level 2 and above) have a choice of evaluation methodologies, including the rigorous SESIP-based methodology (Security Evaluation Standard for IoT Platforms from GlobalPlatform), an optimized security evaluation methodology, designed for connected devices. PSA Certified recognizes that a myriad of different regulations and certification frameworks create an added layer of complexity for the silicon providers, OEMs, software vendors, developers, and service providers tasked with demonstrating the security capability of their products. The goal of the program is to provide a flexible and efficient security evaluation method needed to address the unique complexities and challenges of the evolving digital ecosystem and to drive consistency across device certification schemes to bring greater trust.

The PSA Certified framework recognizes the importance of a hardware RoT for every connected device. It currently provides incremental levels of certified assurance, ranging from a baseline Level 1 (application of best-practice security principles) to a more advanced Level 3 (validated protection against substantial hardware and software attacks).

PSA Certified RoT component

Among the certifications available, PSA Certified offers a PSA Certified RoT Component certification program, which targets separate RoT IP components, such as physical unclonable functions (PUFs), which use unclonable properties of silicon to create a robust trust (or security) anchor. As shown in figure 1, the PSA-RoT Certification includes three levels of security testing. These component-level certifications from PSA Certified validate specific security functional requirements (SFRs) provided by an RoT component and enable their reuse in a fast-track evaluation of a system integration using this component.

Fig. 1: PSA Certified establishes a chain of trust that begins with a PSA-RoT.

A proven RoT IP solution, now PSA Certified

Synopsys PUF IP is a secure key generation and storage solution that enables device manufacturers and designers to secure their products with internally generated unclonable identities and device-unique cryptographic keys. It uses the inherently random start-up values of SRAM as a physical unclonable function (PUF), which generates the entropy required for a strong hardware root of trust.

This root key created by Synopsys PUF IP is never stored, but rather recreated from the PUF upon each use, so there is never a key to be discovered by attackers. The root key is the basis for key management capabilities that enable each member of the supply chain to create its own secret keys, bound to the specific device, to protect their IP/communications without revealing these keys to any other member of the supply chain.

Synopsys PUF IP offers robust PUF-based physical security, with the following properties:

  • No secrets/keys at rest​ (no secrets stored in any memory)
    • prevents any attack on an unpowered device​
    • keys are only present when used​, limiting the window of opportunity for attacks
  • Hardware entropy source/root of trust​
    • no dependence on third parties​ (no key injection from outside)
    • no dependence on security of external components or other internal modules​
    • no dependence on software-based security​
  • Technology-independent, fully digital standard-logic CMOS IP
    • all fabs and technology nodes
    • small footprint
    • re-use in new platforms/deployments
  • Built-in error resilience​ due to advanced error-correction

The Synopsys PUF technology has been field-proven over more than a decade of deployment on over 750 million chips. And now, the Synopsys PUF has achieved the milestone of becoming the world’s first IP solution to be awarded “PSA Certified Level 3 RoT Component.” This certifies that the IP includes substantial protection against both software and hardware attacks (including side-channel and fault injection attacks) and is qualified as a trusted component in a system that requires PSA Level 3 certification.

Fault detection and other countermeasures

In addition to its PUF-related protection against physical attacks, all Synopsys PUF IP products have several built-in physical countermeasures. These include both systemic security features (such as data format validation, data authentication, key use restrictions, built in self-tests (BIST), and heath checks) as well as more specific countermeasures (such as data masking and dummy cycles) that protect against specific attacks.

The PSA Certified Synopsys PUF IP goes even one step further. It validates all inputs through integrity checks and error detection. It continuously asserts that everything runs as intended, flags any observed faults, and ensures security. Additionally, the PSA Certified Synopsys PUF IP provides hardware and software handholds to the user which assist in checking that all data is correctly transferred into and out of the PUF IP. The Synopsys PUF IP driver also supports fault detection and reporting.

Advantages of PUFs over traditional key injection and storage methods

For end-product developers, PUF IP has many advantages over traditional approaches for key management. These traditional approaches typically require key injection (provisioning secret keys into a device) and some form of non-volatile memory (NVM), such as embedded Flash memory or one-time programmable storage (OTP), where the programmed key is stored and where it needs to be protected from being extracted, overwritten, or changed. Unlike these traditional key injection solutions, Synopsys PUF IP does not require sensitive key handling by third parties, since PUF-based keys are created within the device itself. In addition, Synopsys PUF IP offers more flexibility than traditional solutions, as a virtually unlimited number of PUF-based keys can be created. And keys protected by the PUF can be added at any time in the lifecycle rather than only during manufacturing.

In terms of key storage, Synopsys PUF IP offers higher protection against physical attacks than storing keys in some form of NVM. PUF-based root keys are not stored on the device, but they are reconstructed upon each use, so there is nothing for attackers to find on the chip. Instead of storing keys in NVM, Synopsys PUF IP stores only (non-sensitive) helper data and encrypted keys in NVM on- or off-chip. The traditional approach of storing keys on the device in NVM is more vulnerable to physical attacks.

Finally, Synopsys PUF IP provides more portability. Since the Synopsys PUF IP is based on standard SRAM memory cells, it offers a process- and fab agnostic solution for key storage that scales to the most advanced technology nodes.

Conclusion

The large and steady increase in devices connected to the IoT also increases the need for digital trust and privacy. This requires flexible and efficient IoT security solutions that are standardized to streamline implementation and certification across the multiple players involved in the creation and deployment of IoT devices. The PSA Certified framework offers an easy-to-consume and comprehensive methodology for the lab-validated assurance of device security.

Synopsys PUF IP, which has been deployed in over 750 million chips, is the first-ever IP solution to be awarded “PSA Certified Level 3 RoT Component.” This certifies that the IP includes substantial protection against hardware and software attacks. Synopsys PUF IP offers IoT device makers a robust PUF-based security anchor with trusted industry-standard certification and offers the perfect balance between strong security, high flexibility, and low cost.

The post Earning Digital Trust appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Optimize Power For RF/μW Hybrid And Digital Phased ArraysKimia Azad
    Field-programmable gate arrays (FPGAs) are a critical component of both digital and hybrid phased array technology. Powering FPGAs for aerospace and defense applications comes with its own set of challenges, especially because these applications require higher reliability than many industrial or consumer technologies. This blog post will provide a brief history on beamforming and beam-steering technologies for defense applications, a guide on powering defense and space FPGAs, and a reflection on
     

Optimize Power For RF/μW Hybrid And Digital Phased Arrays

9. Květen 2024 v 09:03

Field-programmable gate arrays (FPGAs) are a critical component of both digital and hybrid phased array technology. Powering FPGAs for aerospace and defense applications comes with its own set of challenges, especially because these applications require higher reliability than many industrial or consumer technologies.

This blog post will provide a brief history on beamforming and beam-steering technologies for defense applications, a guide on powering defense and space FPGAs, and a reflection on the future of A&D communications systems.

Background

Active electronically scanned array (AESA) and phased array systems are more affordable and available than ever, namely in full digital and hybrid configuration. These systems can cover a wide RF/uW frequency spectrum and are suitable for use in RADAR and other military communications systems. AESA and phased array systems also boast advanced capabilities in beamforming and beam-steering technologies.

The emergence of 5G communications and commercial space data communications, coupled with advancements in semiconductor efficiency, has enabled companies to deploy innovative phased-array designs. However, roadblocks in efficiently powering these systems given watts consumed and dissipated, plus size and geometry constraints, can complicate development for designers.

Recent developments

Despite digital phased array technologies providing improved performance across a large part of the RF/uW frequency spectrum, their deployment is hindered by various factors. Cost barriers, power consumption, thermal constraints, latency concerns, and efficiency losses in the amplification and gain stages are some key hurdles.

Hybrid phased array is an ideal solution for meeting system level requirements without compromising on cost or power losses. This technology allows for less power consumption and reduced thermal concerns, paving the way for cost-saving solutions. The “digitizers” of hybrid phased array, such as FPGAs, are fewer and farther away from the antenna.

Powering FPGAs

FPGAs are valuable in their ability to perform extremely fast calculations in support of signal isolations, Fast Fourier Transformers (FFTs), I/Q data extractions, and in functions that form and steer radio-frequency beams. However, a key challenge in working with FPGAs lies in the necessity for consistent, sequenced power delivery.

Conclusion and future considerations

Complexities and challenges associated with deploying digital assets should not overlook industry advancements in powering FPGAs within phased array systems. With a focused approach, thermal management challenges can be mitigated, and performance optimized. Existing power solutions provide a strong and mature foundation for the on-going development and deployment of next-generation phased array systems within military and space applications. For a full guide on FPGA power considerations in aerospace and defense applications, take a look at our feature in Military Embedded Systems magazine.

The post Optimize Power For RF/μW Hybrid And Digital Phased Arrays appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • UCIe And Automotive Electronics: Pioneering The Chiplet RevolutionVinod Khera
    The automotive industry stands at the brink of a profound transformation fueled by the relentless march of technological innovation. Gone are the days of the traditional, one-size-fits-all system-on-chip (SoC) design framework. Today, we are witnessing a paradigm shift towards a more modular approach that utilizes diverse chiplets, each optimized for specific functionalities. This evolution promises to enhance the automotive system’s flexibility and efficiency and revolutionize how vehicles are
     

UCIe And Automotive Electronics: Pioneering The Chiplet Revolution

9. Květen 2024 v 09:02

The automotive industry stands at the brink of a profound transformation fueled by the relentless march of technological innovation. Gone are the days of the traditional, one-size-fits-all system-on-chip (SoC) design framework. Today, we are witnessing a paradigm shift towards a more modular approach that utilizes diverse chiplets, each optimized for specific functionalities. This evolution promises to enhance the automotive system’s flexibility and efficiency and revolutionize how vehicles are designed, built, and operated.

At the heart of this transformation lies the Universal Chiplet Interconnect Express (UCIe), a groundbreaking standard introduced in March 2022. UCIe is designed to drastically simplify the integration process across different chiplets from various manufacturers by standardizing die-to-die connections. This initiative caters to a critical need within the industry for a modular and scalable semiconductor architecture, thus setting the stage for unparalleled innovation in automotive electronics.

Understanding the core benefit of UCIe

UCIe isn’t merely about facilitating smoother communication between chiplets; it’s a visionary standard that ensures interoperability, reduces design complexity and costs, and, crucially, supports the seamless incorporation of chiplets into cohesive packages. Developed through the collaborative efforts of the UCIe consortium, this standard enables the use of chiplets from different vendors, thereby fostering a highly customizable and scalable solution. This mainly benefits the automotive sector, where high performance, reliability, and efficiency demand is paramount.

The pivotal role of UCIe in automotive electronics

Implementing UCIe within automotive electronics unlocks a plethora of advantages. Foremost among these is the ability to design more compact, powerful, and energy-efficient electronic systems. UCIe offers savings in energy per bit compared to other serial or parallel interfaces.

Given the increasing complexity and functionality of modern vehicles — which can be likened to data centers on wheels — the modular design principle of UCIe is invaluable. It facilitates the addition of new functionalities and ensures that automotive electronics can seamlessly adapt to and incorporate emerging technologies and standards.

Secondly, the adoption of UCIe marks a significant stride toward sustainable electronic design within the automotive industry. By promoting the reuse and integration of chiplets across different platforms and vehicle models, UCIe significantly mitigates electronic waste and streamlines the lifecycle management of electronic components. This benefits manufacturers regarding cost and efficiency and aligns with broader industry trends focused on sustainability and environmental stewardship.

Offerings and solutions

Cadence offers a range of Intellectual Properties (IPs), Electronic Design Automation (EDA) tools, and 3D Integrated Circuits (ICs) tailored for chiplets.

Since the beginning of UCIe, Cadence has engaged with over 100 customers and enables automotive solutions with UCIe and its working groups. The implementation features for automotive protect the mainband, and Cadence also adds protection for the sideband, particularly for the automotive industry. UCIe is being developed across multiple foundries, process nodes, and standard and advanced packaging enablement types.

Cadence has a history of chiplet experience with an interface IP portfolio, including our chiplet die-to-die interconnects, including the UCIe IP, and proprietary 40G UltraLink D2D PHY that enables SoC providers to deliver more customized solutions that offer higher performance and yields while also shortening development cycles and reducing costs through greater IP reuse. Cadence’s UCIe PHY solutions facilitate high-speed communication and interoperability among chiplets. Additionally, the Cadence Simulation VIP for UCIe ensures that systems using UCIe are reliable and performant, addressing one of the industry’s key concerns.

The Cadence UCIe PHY is a high-bandwidth, low-power, and low-latency die-to-die solution that enables multi-die system in package integration for high-performance compute, AI/ML, 5G, automotive and networking applications. The UCIe physical layer includes link initialization, training, power management states, lane mapping, lane reversal, and scrambling. The UCIe controller includes the die-to-die adapter layer and the protocol layer. The adapter layer ensures reliable transfer through link state management, protocol, and flit formats parameter negotiation. The UCIe architecture supports multiple standard protocols such as PCIe, CXL and streaming raw mode.

For interface IP, we usually create various test chips using different process technologies and measure the silicon in the lab to prove that our controller and PHY IPs fully comply with the standard. Hence, we designed a test chip with seven chiplets connected via UCIe over different interconnect distances. To learn more, click here.

The Cadence Verification IP (VIP) for Universal Chiplet Interconnect Express (UCIe) is designed for easy integration in test benches at the IP, system-on-chip (SoC), and system level. The VIP for UCIe runs on all simulators and supports SystemVerilog and the widely adopted Universal Verification Methodology (UVM). This enables verification teams to reduce the time spent on environment development and redirect it to cover a larger verification space, accelerate verification closure, and ensure end-product quality. With a layered architecture and powerful callback mechanism, verification engineers can verify UCIe features at each functional layer (PHY, D2D, Protocol) and create highly targeted designs while using the latest design methodologies for random testing to cover a larger verification space. The VIP for UCIe can be used as a standalone stack or layered with PCIe VIP.

Challenges and future prospects

The transition to chiplet-based designs and the widespread adoption of the UCIe standard are not without their challenges. Critical concerns include functional safety, reliability, quality, security, thermal management, and mechanical stress. Hence, ensuring dependable high-speed interconnects between chiplets necessitates ongoing research and development. Successful implementation also hinges on industry-wide collaboration and establishing a robust chiplet ecosystem that supports the UCIe standard.

UCIe stands as a pioneering innovation poised to redefine automotive electronics. By offering a scalable and flexible framework, UCIe promises to enhance in-vehicle electronic systems’ performance, efficiency, and versatility and marks a significant milestone in the automotive industry’s evolution. As we look to the future, the impact of UCIe in the automotive sector is poised to be profound, turning vehicles into dynamic platforms that continuously adapt to technological advancements and user needs. The realization of modular, customizable designs underscored by efficiency, performance, and innovation heralds the dawn of a new era in semiconductor development, one where the possibilities are as boundless as the technological horizon.

The post UCIe And Automotive Electronics: Pioneering The Chiplet Revolution appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Interoperability And Automation Yield A Scalable And Efficient Safety WorkflowAnn Keffer
    By Ann Keffer, Arun Gogineni, and James Kim Cars deploying ADAS and AV features rely on complex digital and analog systems to perform critical real-time applications. The large number of faults that need to be tested in these modern automotive designs make performing safety verification using a single technology impractical. Yet, developing an optimized safety methodology with specific fault lists automatically targeted for simulation, emulation and formal is challenging. Another challenge is c
     

Interoperability And Automation Yield A Scalable And Efficient Safety Workflow

7. Březen 2024 v 09:07

By Ann Keffer, Arun Gogineni, and James Kim

Cars deploying ADAS and AV features rely on complex digital and analog systems to perform critical real-time applications. The large number of faults that need to be tested in these modern automotive designs make performing safety verification using a single technology impractical.

Yet, developing an optimized safety methodology with specific fault lists automatically targeted for simulation, emulation and formal is challenging. Another challenge is consolidating fault resolution results from various fault injection runs for final metric computation.

The good news is that interoperability of fault injection engines, optimization techniques, and an automated flow can effectively reduce overall execution time to quickly close-the-loop from safety analysis to safety certification.

Figure 1 shows some of the optimization techniques in a safety flow. Advanced methodologies such as safety analysis for optimization and fault pruning, concurrent fault simulation, fault emulation, and formal based analysis can be deployed to validate the safety requirements for an automotive SoC.

Fig. 1: Fault list optimization techniques.

Proof of concept: an automotive SoC

Using an SoC level test case, we will demonstrate how this automated, multi-engine flow handles the large number of faults that need to be tested in advanced automotive designs. The SoC design we used in this test case had approximately three million gates. First, we used both simulation and emulation fault injection engines to efficiently complete the fault campaigns for final metrics. Then we performed formal analysis as part of finishing the overall fault injection.

Fig. 2: Automotive SoC top-level block diagram.

Figure 3 is a representation of the safety island block from figure 2. The color-coded areas show where simulation, emulation, and formal engines were used for fault injection and fault classification.

Fig. 3: Detailed safety island block diagram.

Fault injection using simulation was too time and resource consuming for the CPU core and cache memory blocks. Those blocks were targeted for fault injection with an emulation engine for efficiency. The CPU core is protected by a software test library (STL) and the cache memory is protected by ECC. The bus interface requires end-to-end protection where fault injection with simulation was determined to be efficient. The fault management unit was not part of this experiment. Fault injection for the fault management unit will be completed using formal technology as a next step.

Table 1 shows the register count for the blocks in the safety island.

Table 1: Block register count.

The fault lists generated for each of these blocks were optimized to focus on the safety critical nodes which have safety mechanisms/protection.

SafetyScope, a safety analysis tool, was run to create the fault lists for the FMs for both the Veloce Fault App (fault emulator) and the fault simulator and wrote the fault lists to the functional safety (FuSa) database.

For the CPU and cache memory blocks, the emulator inputs the synthesized blocks and fault injection/fault detection nets (FIN/FDN). Next, it executed the stimulus and captured the states of all the FDNs. The states were saved and used as a “gold” reference for comparison against fault inject runs. For each fault listed in the optimized fault list, the faulty behavior was emulated, and the FDNs were compared against the reference values generated during the golden run, and the results were classified and updated in the fault database with attributes.

Fig. 4: CPU cluster. (Source from https://developer.arm.com/Processors/Cortex-R52)

For each of the sub parts shown in the block diagram, we generated an optimized fault list using the analysis engine. The fault lists are saved into individual session in the FuSa database. We used the statistical random sampling on the overall faults to generate the random sample from the FuSa database.

Now let’s look at what happens when we take one random sample all the way through the fault injection using emulation. However, for this to completely close on the fault injection, we processed N samples.

Table 2: Detected faults by safety mechanisms.

Table 3 shows that the overall fault distribution for total faults is in line with the fault distribution of the random sampled faults. The table further captures the total detected faults of 3125 out of 4782 total faults. We were also able model the detected faults per sub part and provide an overall detected fault ratio of 65.35%. Based on the faults in the random sample and our coverage goal of 90%, we calculated that the margin of error (MOE) is ±1.19%.

Table 3: Results of fault injection in CPU and cache memory.

The total detected (observed + unobserved) 3125 faults provide a clear fault classification. The undetected observed also provide a clear classification for Residual faults. We did further analysis of undetected unobserved and not injected faults.

Table 4: Fault classification after fault injection.

We used many debug techniques to analyze the 616 Undetected Unobserved faults. First, we used formal analysis to check the cone of influence (COI) of these UU faults. The faults which were outside the COI were deemed safe, and there were five faults which were further dropped from analysis. For the faults which were inside the COI, we used engineering judgment with justification of various configurations like, ECC, timer, flash mem related etc. Finally, using formal and engineering judgment we were able to further classify 616 UU faults into safe faults and remaining UU faults into conservatively residual faults. We also reviewed the 79 residual faults and were able to classify 10 faults into safe faults. The not injected faults were also tested against the simulation model to check if any further stimulus is able to inject those faults. Since no stimulus was able to inject these faults, we decided to drop these faults from our consideration and against the margin of error accordingly. With this change our new MOE is ±1.293%.

In parallel, the fault simulator pulled the optimized fault lists for the failure modes of the bus block and ran fault simulations using stimulus from functional verification. The initial set of stimuli didn’t provide enough coverage, so higher quality stimuli (test vectors) were prepared, and additional fault campaigns were run on the new stimuli. All the fault classifications were written into the FuSa database. All runs were parallel and concurrent for overall efficiency and high performance.

Safety analysis using SafetyScope helped to provide more accuracy and reduce the iteration of fault simulation. CPU and cache mem after emulation on various tests resulted an overall SPFM of over 90% as shown in Table 5.

Table 5: Overall results.

At this time not all the tests for BUS block (end to end protection) doing the fault simulation had been completed. Table 6 shows the first initial test was able to resolve the 9.8% faults very quickly.

Table 6: Percentage of detected faults for BUS block by E2E SM.

We are integrating more tests which have high traffic on the BUS to mimic the runtime operation state of the SoC. The results of these independent fault injections (simulation and emulation) were combined for calculating the final metrics on the above blocks, with the results shown in Table 7.

Table 7: Final fault classification post analysis.

Conclusion

In this article we shared the details of a new functional safety methodology used in an SoC level automotive test case, and we showed how our methodology produces a scalable, efficient safety workflow using optimization techniques for fault injection using formal, simulation, and emulation verification engines. Performing safety analysis prior to running the fault injection was very critical and time saving. Therefore, the interoperability for using multiple engines and reading the results from a common FuSa database is necessary for a project of this scale.

For more information on this highly effective functional safety flow for ADAS and AV automotive designs, please download the Siemens EDA whitepaper Complex safety mechanisms require interoperability and automation for validation and metric closure.

Arun Gogineni is an engineering manager and architect for IC functional safety at Siemens EDA.

James Kim is a technical leader at Siemens EDA.

The post Interoperability And Automation Yield A Scalable And Efficient Safety Workflow appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Maximizing Energy Efficiency For Automotive ChipsWilliam Ruby
    Silicon chips are central to today’s sophisticated advanced driver assistance systems, smart safety features, and immersive infotainment systems. Industry sources estimate that now there are over 1,000 integrated circuits (ICs), or chips, in an average ICE car, and twice as many in an average EV. Such a large amount of electronics translates into kilowatts of power being consumed – equivalent to a couple of dishwashers running continuously. For an ICE vehicle, this puts a lot of stress on the ve
     

Maximizing Energy Efficiency For Automotive Chips

7. Březen 2024 v 09:06

Silicon chips are central to today’s sophisticated advanced driver assistance systems, smart safety features, and immersive infotainment systems. Industry sources estimate that now there are over 1,000 integrated circuits (ICs), or chips, in an average ICE car, and twice as many in an average EV. Such a large amount of electronics translates into kilowatts of power being consumed – equivalent to a couple of dishwashers running continuously. For an ICE vehicle, this puts a lot of stress on the vehicle’s electrical and charging system, leading automotive manufacturers to consider moving to 48V systems (vs. today’s mainstream 12V systems). These 48V systems reduce the current levels in the vehicle’s wiring, enabling the use of lower cost smaller-gauge wire, as well as delivering higher reliability. For EVs, higher energy efficiency of on-board electronics translates directly into longer range – the primary consideration of many EV buyers (second only to price). Driver assistance and safety features often employ redundant component techniques to ensure reliability, further increasing vehicle energy consumption. Lack of energy efficiency for an EV also means more frequent charging, further stressing the power grid and producing a detrimental effect on the environment. All these considerations necessitate the need for a comprehensive energy-efficient design methodology for automotive ICs.

What’s driving demand for compute power in cars?

Classification and processing of massive amounts of data from multiple sources in automotive applications – video, audio, radar, lidar – results in a high degree of complexity in automotive ICs as software algorithms require large amounts of compute power. Hardware architectural decisions, and even hardware-software partitioning, must be done with energy efficiency in mind. There is a plethora of tradeoffs at this stage:

  • Flexibility of a general-purpose CPU-based architecture vs. efficiency of a dedicated digital signal processor (DSP) vs. a hardware accelerator
  • Memory sub-system design: how much is required, how it will be partitioned, how much precision is really needed, just to name a few considerations

In order to enable reliable decisions, architects must have access to a system that models, in a robust manner, power, performance, and area (PPA) characteristics of the hardware, as well as use cases. The idea is to eliminate error-prone estimates and guesswork.

To improve energy efficiency, automotive IC designers also must adopt many of the power reduction techniques traditionally used by architects and engineers in the low-power application space (e.g. mobile or handheld devices), such as power domain shutoff, voltage and frequency scaling, and effective clock and data gating. These techniques can be best evaluated at the hardware design level (register transfer level, or RTL) – but with the realistic system workload. As a system workload – either a boot sequence or an application – is millions of clock cycles long, only an emulation-based solution delivers a practical turnaround time (TAT) for power analysis at this stage. This power analysis can reveal intervals of wasted power – power consumption bugs – whether due to active clocks when the data stream is not active, redundant memory access when the address for the read operation doesn’t change for many clock cycles (and/or when the address and data input don’t change for the write operation over many cycles), or unnecessary data toggles while clocks are gated off.

To cope with the huge amount of data and the requirement to process that data in real time (or near real time), automotive designers employ artificial intelligence (AI) algorithms, both in software and in hardware. Millions of multiply-accumulate (MAC) operations per second and other arithmetic-intensive computations to process these algorithms give rise to a significant amount of wasted power due to glitches – multiple signal transitions per clock cycle. At the RTL stage, with the advanced RTL power analysis tools available today, it is possible to measure the amount of wasted power due to glitches as well as to identify glitch sources. Equipped with this information, an RTL design engineer can modify their RTL source code to lower the glitch activity, reduce the size of the downstream logic, or both, to reduce power.

Working together with the RTL design engineer is another critical persona – the verification engineer. In order to verify the functional behavior of the design, the verification engineer is no longer dealing just with the RTL source: they also have to verify the proper functionality of the global power reduction techniques such as power shutoff and voltage/frequency scaling. Doing so requires a holistic approach that leverages a comprehensive description of power intent, such as the Unified Power Format (UPF). All verification technologies – static, formal, emulation, and simulation – can then correctly interpret this power intent to form an effective verification methodology.

Power intent also carries through to the implementation part of the flow, as well as signoff. During the implementation process, power can be further optimized through physical design techniques while conforming to timing and area constraints. Highly accurate power signoff is then used to check conformance to power specifications before tape-out.

Design and verification flow for more energy-efficient automotive SoCs

Synopsys delivers a complete end-to-end solution that allows IC architects and designers to drive energy efficiency in automotive designs. This solution spans the entire design flow from architecture to RTL design and verification, to emulation-driven power analysis, to implementation and, ultimately, to power signoff. Automotive IC design teams can now put in place a rigorous methodology that enables intelligent architectural decisions, RTL power analysis with consistent accuracy, power-aware physical design, and foundry-certified power signoff.

The post Maximizing Energy Efficiency For Automotive Chips appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Microarchitecture Vulnerabilities: Uncovering The Root Cause WeaknessesJason Oberg
    In early 2018, the tech industry was shocked by the discovery of hardware microarchitecture vulnerabilities that bypassed decades of work put into software and application security. Meltdown and Spectre exploited performance features in modern application processors to leak sensitive information about victim programs to an adversary. This leakage occurs through the hardware itself, meaning that malicious software can extract secret information from users even if software protections are in place
     

Microarchitecture Vulnerabilities: Uncovering The Root Cause Weaknesses

7. Březen 2024 v 09:05

In early 2018, the tech industry was shocked by the discovery of hardware microarchitecture vulnerabilities that bypassed decades of work put into software and application security. Meltdown and Spectre exploited performance features in modern application processors to leak sensitive information about victim programs to an adversary. This leakage occurs through the hardware itself, meaning that malicious software can extract secret information from users even if software protections are in place because the leakages happen below the view of software in hardware. Since these so-called transient execution vulnerabilities were first publicly disclosed, dozens of variants have been identified that all share a set of common root cause weaknesses, but the specifics of that commonality were not well understood broadly by the security community.

In early 2020, Intel Corporation, MITRE, Cycuity, and others set off to establish a set of common weaknesses for hardware to enable a more proactive approach to hardware security to reduce the risk of a hardware vulnerability in the future. The initial set of weaknesses, in the form of Common Weakness Enumerations (CWE), were broad and covered weaknesses beyond just transient execution vulnerabilities like Meltdown and Spectre. While this initial set of CWEs was extremely effective at covering the root causes across the entire hardware vulnerability landscape, the precise and specific coverage of transient execution vulnerabilities was still lacking. This was primarily because of the sheer complexity, volume, and cleverness of each of these vulnerabilities.

In the fall of 2022, technical leads from AMD, Arm, Intel (special kudos to Intel for initiating and leading the effort), Cycuity, and Riscure came together to dig into the details of publicly disclosed transient execution vulnerabilities to really understand their root cause and come up with a set of precise, yet comprehensive, root cause weaknesses expressed as CWEs to help the industry not only understand the root cause for these microarchitecture vulnerabilities but to help prevent future, unknown vulnerabilities from being discovered. The recent announcement of the four transient execution weaknesses was a result of this collaborative effort over the last year.

CWEs for microarchitecture vulnerabilities

To come up with these root cause weaknesses, we researched every known publicly disclosed microarchitecture vulnerability (Common Vulnerabilities and Exposures [CVEs]) to understand the exact characteristics of the vulnerabilities and what the root causes were. As a result of this, the following common weaknesses were discovered, with a brief summary provided in layman terms from my perspective:

CWE-1421: Exposure of Sensitive Information in Shared Microarchitectural Structures during Transient Execution

  • Potentially leaky microarchitectural resources are shared with an adversary. For example, sharing a CPU cache between victim and attacker programs has shown to result in timing side channels that can leak secrets about the victim.

CWE-1422: Exposure of Sensitive Information caused by Incorrect Data Forwarding during Transient Execution

  • The forwarding or “flow” of information within the microarchitecture can result in security violations. Often various events (speculation, page faults, etc.) will cause data to be incorrectly forwarded from one location of the processor to another (often to a leaky microarchitecture resource like the one listed in CWE-1421)

CWE-1423: Exposure of Sensitive Information caused by Shared Microarchitectural Predictor State that Influences Transient Execution

  • An attacker being able to affect or “poison” a microarchitecture predictor used within the processor. For example, branch prediction is commonly used to increase performance to speculatively fetch instructions based on the expected outcome of a branch in a program. If an adversary is able to affect the branch prediction itself, they can cause the victim to execute code in branches of their choosing.

CWE-1420: Exposure of Sensitive Information during Transient Execution

  • A general transient execution weakness if one of the other weaknesses above do not quite fit the need.

Within each of the CWEs listed above, you can find details about observed examples, or vulnerabilities, which are a result of these weaknesses. Some vulnerabilities, Spectre-V1, for example, requires the presence of CWE-1421, CWE-1422, and CWE-1423. While others, like Meltdown, only require CWE-1422 and CWE-1423.

Since detecting these weaknesses can be a daunting task, each of the CWEs outline a set of detection methods. One detection method that is highlighted in each CWE entry is the use of information flow to track the flow of information in the microarchitecture to ensure data is being handled securely. Information flow can be used for each of the CWEs as follows:

  • CWE-1421: information flow analysis can be used to ensure that secrets never end up in a shared microarchitectural resource.
  • CWE-1422: ensure that secret information is never improperly forwarded within the microarchitecture.
  • CWE-1423: ensure that an attacker can never affect or modify the predictor state in a way that is observable by the victim. In other words, information from the attacker should not flow to the predictor if that information can affect the integrity of the predictor for the victim.

Our Radix products use information flow at their core and we have already shown success in demonstrating Radix’s ability to detect Meltdown and Spectre. We look forward to continuing to work with the industry and our customers and partners to further advance the state of hardware security and reduce the risk of vulnerabilities being discovered in the future.

The post Microarchitecture Vulnerabilities: Uncovering The Root Cause Weaknesses appeared first on Semiconductor Engineering.

  • ✇Semiconductor Engineering
  • Accelerate Complex Algorithms With Adaptable Signal Processing SolutionsJayson Bethurem
    Technology is continuously advancing and exponentially increasing the amount of data produced. Data comes from a multitude of sources and formats, requiring systems to process different algorithms. Each of these algorithms present their own challenges including low-latency and deterministic processing to keep up with incoming data rates and rapid response time. Considering that many of these semiconductors are designed years in advance of evolving technology, it places a challenging problem for
     

Accelerate Complex Algorithms With Adaptable Signal Processing Solutions

7. Březen 2024 v 09:03

Technology is continuously advancing and exponentially increasing the amount of data produced. Data comes from a multitude of sources and formats, requiring systems to process different algorithms. Each of these algorithms present their own challenges including low-latency and deterministic processing to keep up with incoming data rates and rapid response time. Considering that many of these semiconductors are designed years in advance of evolving technology, it places a challenging problem for IC designers. Take video systems for example, the resolution and color depth of imaging sensors is doubling every few years, affecting how the generated data gets processed. Alternatively, AI algorithms update nearly every year. In both cases, not only do the data paths need to increase in width and throughput, but so does the memory for weights and activations. It’s nearly unimaginable to build an IC today that doesn’t have some level of adaptability.

Fig. 1: Common adaptive noise filter design.

This is not a new concept. For years, many engineers have utilized FPGAs for this type of processing, which spans from I/Q data from radio comms to video streams from image sensors to BLDC motor control algorithms to AI models. FPGAs are perfect for handling complex algorithms that benefit from parallel and pipelined processing. Additionally, FPGA architectures are loaded with embedded memory, which can be tightly coupled to increase determinism and performance of algorithms, whereas processors can be bogged down with memory fetching, cache misses and low-level interrupts.

As data evolves and becomes more complex, FPGAs have followed suit. They have greatly increased their processing capability by building hardened signal processing blocks, which too have increased in capability over time. Many SoCs and ASICs have adjacent FPGAs to solve these processing challenges and cover this capability. However, discrete FPGA implementations have a few drawbacks, namely price and power, but also limited data transactions between the FPGA and external components like processors. But with Flex Logix eFPGA IP, any device can adopt this level of capability and reduce discrete overhead of FPGA cost and power by nearly 90%.

Like traditional FPGAs, Flex Logix EFLX IP includes 6-input LUT programmable logic, embedded memory and DSP blocks with 22×22 multipliers with 48-bit accumulators.

Fig. 2: EFLX eFPGA IP.

Unlike traditional FPGAs, these IP blocks can be scaled to fit your specific application. And depending on your application, you can select more or less DSP vs. logic as well as memory vs logic ratios. Thus, algorithms needing more memory and multipliers than logic can utilize higher ratios of DSP cores.

Flex Logix IP has evolved with signal processing demands and recently introduced InferX IP, which can dramatically increase performance and lower power consumption. InferX is effectively a scalable one-dimensional tensor processor (vector & matrix) controlled by the eFPGA fabric, which allows this IP to adapt to any signal processing algorithm implementation, including AI models. InferX has roughly 10 times the DSP performance of the aforementioned DSP IP and uses only one-quarter of the area. And while many associate TPUs with AI applications, this IP is ideal for any vector/matrix computation.

Fig. 3: InferX IP scalable from 1/8th of a tile to > 8 tiles.

InferX achieves up to dozens of TeraMACs/second at TSMC 5nm node. It is ideal for applications including FFT, FIR, IIR, Beam Forming, Matrix/Vector operations, Matrix Inversions, Kalman functions and more. It can handle Real or Complex, INT16x16 with accumulation at INT40 for accuracy. Multiple DSP operations can be pipelined in streaming mode or packet mode. See below for more benchmarks for common algorithms running on TSMC’s 5nm node.

InferX DSP solutions are easily programmed via common tools like Matlab Simulink. Flex Logix has built a ready-to-use standard Simulink block set that provides a simplified configuration, bit-accurate modeling with flexible precision.

Fig. 4: Simulink design flow.

Fig. 5: Cycle-accurate simulation of InferX soft logic driving InferX TPUs ensures functionality.

InferX IP works seamlessly with Flex Logix EFLX eFPGA IP and can be reconfigured in microseconds, enabling ICs to adapt to any data stream and the appropriate algorithm in near real-time. For ASIC manufacturers to accomplish this, they would have to multiplex between several hardened algorithms, forfeiting future algorithm and new data stream change. Flex Logix IP is the perfect adaptable accelerator for all semiconductors and is available for many nodes including advanced nodes like TSMC 5nm and 3nm as well as planned for Intel 18A.

Want to learn more about Flex Logix EFLX IP and signal processing solutions?
Contact us at info@flex-logix.com to learn more or visit our website https://flex-logix.com.

The post Accelerate Complex Algorithms With Adaptable Signal Processing Solutions appeared first on Semiconductor Engineering.

❌
❌