Technology is continuously advancing and exponentially increasing the amount of data produced. Data comes from a multitude of sources and formats, requiring systems to process different algorithms. Each of these algorithms present their own challenges including low-latency and deterministic processing to keep up with incoming data rates and rapid response time. Considering that many of these semiconductors are designed years in advance of evolving technology, it places a challenging problem for IC designers. Take video systems for example, the resolution and color depth of imaging sensors is doubling every few years, affecting how the generated data gets processed. Alternatively, AI algorithms update nearly every year. In both cases, not only do the data paths need to increase in width and throughput, but so does the memory for weights and activations. It’s nearly unimaginable to build an IC today that doesn’t have some level of adaptability.
Fig. 1: Common adaptive noise filter design.
This is not a new concept. For years, many engineers have utilized FPGAs for this type of processing, which spans from I/Q data from radio comms to video streams from image sensors to BLDC motor control algorithms to AI models. FPGAs are perfect for handling complex algorithms that benefit from parallel and pipelined processing. Additionally, FPGA architectures are loaded with embedded memory, which can be tightly coupled to increase determinism and performance of algorithms, whereas processors can be bogged down with memory fetching, cache misses and low-level interrupts.
As data evolves and becomes more complex, FPGAs have followed suit. They have greatly increased their processing capability by building hardened signal processing blocks, which too have increased in capability over time. Many SoCs and ASICs have adjacent FPGAs to solve these processing challenges and cover this capability. However, discrete FPGA implementations have a few drawbacks, namely price and power, but also limited data transactions between the FPGA and external components like processors. But with Flex Logix eFPGA IP, any device can adopt this level of capability and reduce discrete overhead of FPGA cost and power by nearly 90%.
Like traditional FPGAs, Flex Logix EFLX IP includes 6-input LUT programmable logic, embedded memory and DSP blocks with 22×22 multipliers with 48-bit accumulators.
Fig. 2: EFLX eFPGA IP.
Unlike traditional FPGAs, these IP blocks can be scaled to fit your specific application. And depending on your application, you can select more or less DSP vs. logic as well as memory vs logic ratios. Thus, algorithms needing more memory and multipliers than logic can utilize higher ratios of DSP cores.
Flex Logix IP has evolved with signal processing demands and recently introduced InferX IP, which can dramatically increase performance and lower power consumption. InferX is effectively a scalable one-dimensional tensor processor (vector & matrix) controlled by the eFPGA fabric, which allows this IP to adapt to any signal processing algorithm implementation, including AI models. InferX has roughly 10 times the DSP performance of the aforementioned DSP IP and uses only one-quarter of the area. And while many associate TPUs with AI applications, this IP is ideal for any vector/matrix computation.
Fig. 3: InferX IP scalable from 1/8th of a tile to > 8 tiles.
InferX achieves up to dozens of TeraMACs/second at TSMC 5nm node. It is ideal for applications including FFT, FIR, IIR, Beam Forming, Matrix/Vector operations, Matrix Inversions, Kalman functions and more. It can handle Real or Complex, INT16x16 with accumulation at INT40 for accuracy. Multiple DSP operations can be pipelined in streaming mode or packet mode. See below for more benchmarks for common algorithms running on TSMC’s 5nm node.
InferX DSP solutions are easily programmed via common tools like Matlab Simulink. Flex Logix has built a ready-to-use standard Simulink block set that provides a simplified configuration, bit-accurate modeling with flexible precision.
Fig. 4: Simulink design flow.
Fig. 5: Cycle-accurate simulation of InferX soft logic driving InferX TPUs ensures functionality.
InferX IP works seamlessly with Flex Logix EFLX eFPGA IP and can be reconfigured in microseconds, enabling ICs to adapt to any data stream and the appropriate algorithm in near real-time. For ASIC manufacturers to accomplish this, they would have to multiplex between several hardened algorithms, forfeiting future algorithm and new data stream change. Flex Logix IP is the perfect adaptable accelerator for all semiconductors and is available for many nodes including advanced nodes like TSMC 5nm and 3nm as well as planned for Intel 18A.
Want to learn more about Flex Logix EFLX IP and signal processing solutions?
Contact us at info@flex-logix.com to learn more or visit our website https://flex-logix.com.
The post Accelerate Complex Algorithms With Adaptable Signal Processing Solutions appeared first on Semiconductor Engineering.