
MemryX MX3 edge AI accelerator delivers up to 5 TOPS, is offered in die, package, and M.2 and mPCIe modules

MemryX MX3 EVB

Jean-Luc noted the MemryX MX3 edge AI accelerator module while covering the DeGirum ORCA M.2 and USB Edge AI accelerators last month, so today, we’ll have a look at this AI chip and corresponding modules that run computer vision neural networks using common frameworks such as TensorFlow, TensorFlow Lite, ONNX, PyTorch, and Keras.

MemryX MX3 Specifications

MemryX hasn’t disclosed many performance details about this chip; all we know is that it delivers more than 5 TFLOPS. The listed specifications include:

  • Bfloat16 activations
  • Batch = 1
  • Weights – 4, 8, and 16-bit
  • ~10M parameters stored on-die (see the sizing sketch after this list)
  • Host interfaces – PCIe Gen 3 I/O and/or USB 2.0/3.x
  • Power consumption – ~1.0W
  • 1-click MX SDK compilation when mapping multi-layer neural networks
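Since each MX3 stores roughly 10M parameters on-die, a quick back-of-the-envelope check tells you how many chips (and hence which module) a given model needs. Here’s a minimal sketch, assuming the ~10M figure holds regardless of the weight bit-width; the parameter counts are typical published values, not MemryX numbers:

```python
import math

PARAMS_PER_CHIP = 10_000_000  # ~10M parameters stored on-die per MX3

# Typical published parameter counts for a few common vision models
models = {
    "MobileNetV2": 3_500_000,
    "YOLOv5s": 7_200_000,
    "ResNet-50": 25_600_000,
}

for name, params in models.items():
    chips = math.ceil(params / PARAMS_PER_CHIP)
    print(f"{name}: {params / 1e6:.1f}M params -> at least {chips} MX3 chip(s)")
```

That lines up with the four-chip M.2 module and the cascadable EVB described below: larger models simply spread their weights across more chips.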

Under the hood, the MX3 features MemryX Compute Engines (MCEs) tightly coupled with at-memory computing. This design creates a native, proprietary dataflow architecture that reaches up to 70% chip utilization with a single click, compared to the 15-30% that traditional CPUs, GPUs, and DSPs built on legacy instruction sets and control-flow architectures typically achieve even after software tuning.

MemryX MX3 internal design

Form Factor

Form-factor-wise, this edge AI processor is offered as a bare die, as a single-die package, or in modules (mini PCIe or M.2) fitted with one or more MemryX MX3 chips.

MX3 Form Factors

M.2 module with four MemryX MX3 chips – Source: Mark Hachman, PCWorld

MemryX MX3 EVB

The MX3 EVB (Evaluation Board) is a PCBA fitted with four MX3 chips, each in a single-die package, and you can cascade multiple EVB boards over a single host interface to provide the required inferencing power.

The MX3 EVB

MX3 SDK

The MX SDK helps simulate and deploy trained AI models. MemryX says it builds its products to:

  • Provide real-world performance per watt
  • Run models trained on any popular framework without requiring software changes or retraining
  • Provide high scalability and granularity
  • Run AI models equally well on any host processor, regardless of system load
  • Provide the same 1-click SDK (compilation software)

This SDK’s developer hub consists of a compiler (for graph processing, mapping, and assembling), utility tools (a bit-accurate simulator, performance analyzer, profiler, chip helper tools, and template applications), and a runtime environment with APIs, OS drivers, and a dataflow runtime.
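MemryX hasn’t published the full API here, but the compile-simulate-deploy flow those components describe typically boils down to a few calls. The sketch below is purely illustrative: the mx_sdk module and every name in it are hypothetical stand-ins, not the actual MX SDK API.

```python
# Illustrative sketch of the MX SDK flow described above; "mx_sdk" and
# all of its names are hypothetical stand-ins, not MemryX's real API.
import numpy as np
import mx_sdk  # hypothetical module

# 1. Compiler: graph processing, mapping, and assembling a model from any
#    supported framework into a dataflow program, in a single call/click.
dfp = mx_sdk.compile("mobilenet_v2.onnx", num_chips=4)

# 2. Utility tools: run the bit-accurate simulator before touching hardware.
frame = np.zeros((1, 224, 224, 3), dtype=np.float32)
simulated = mx_sdk.simulate(dfp, frame)

# 3. Runtime: load the dataflow program onto the chips, then run batch-1 inference.
accel = mx_sdk.Accelerator(dfp)
hardware = accel.run(frame)

# A bit-accurate simulator should reproduce hardware results exactly.
assert np.array_equal(simulated, hardware)
```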

MX3 SDK architecture

You can use the MX3 EVB with Edge Impulse deployments after installing dependencies such as Python 3.8+, the MemryX tools and drivers, and Edge Impulse for Linux. Next, connect the board to Edge Impulse, then verify it is connected by opening your project and clicking “Devices”.
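Driven from a Linux shell, that setup amounts to a handful of commands; here is a hedged sketch wrapped in Python. The edge-impulse-linux command is Edge Impulse’s Linux CLI, while the MemryX installation step is left as a placeholder since the exact package names come from MemryX’s own install instructions.

```python
# Sketch of the Edge Impulse setup steps above. "edge-impulse-linux" is
# Edge Impulse's Linux CLI; the MemryX install step is a placeholder.
import subprocess
import sys

assert sys.version_info >= (3, 8), "the MemryX tooling expects Python 3.8+"

# The Edge Impulse Linux CLI ships via npm.
subprocess.run(["npm", "install", "-g", "edge-impulse-linux"], check=True)

# Placeholder: install the MemryX tools and drivers per the vendor docs.
# subprocess.run(["sudo", "apt", "install", "<memryx-tools-package>"], check=True)

# Connect the board to your Edge Impulse project: the CLI prompts for a
# login and project selection, after which the board appears under the
# project's "Devices" tab.
subprocess.run(["edge-impulse-linux"], check=True)
```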

MemryX MX3 demo

While the company hasn’t provided many details about the chip’s performance, it did upload a video demo using the virtual camera input of AirSim – a simulator that generates datasets for autonomous driving and flying – comparing a computer fitted with an MX3 M.2 module to one equipped with an NVIDIA RTX 4060 GPU.

Latency was very low while running on the MX3 module but increased drastically when switching over to the NVIDIA GPU, and the noise from the GPU’s cooling fans was clearly audible.

More details may be found on the company’s website.

The post MemryX MX3 edge AI accelerator delivers up to 5 TOPS, is offered in die, package, and M.2 and mPCIe modules appeared first on CNX Software - Embedded Systems News.

QEMU 9.0 released with Raspberry Pi 4 support and LoongArch KVM acceleration

QEMU 9.0

The QEMU 9.0 open-source emulator was released just the other day, bringing major updates and improvements to Arm, RISC-V, HPPA, LoongArch, and s390x emulation. The most notable changes concern Arm and LoongArch emulation.


The QEMU 9.0 emulator now supports the Raspberry Pi 4 Model B, meaning you can run 64-bit Raspberry Pi OS to test applications without owning the hardware. However, QEMU 9.0 has some limitations, as Ethernet and PCIe are not yet supported for the Raspberry Pi board; according to the developers, these features will be added in a future release. For now, the emulator supports the SPI and I2C (BSC) controllers.
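As a quick illustration, booting a 64-bit Raspberry Pi OS image on the new machine model looks roughly like the sketch below, wrapped in Python for consistency with this page’s other examples. The raspi4b machine type is the one added in this release; the kernel, DTB, and disk image paths are placeholders you would extract from a Raspberry Pi OS release, and the -append arguments may need tuning for your image.

```python
# Rough sketch: boot 64-bit Raspberry Pi OS on QEMU 9.0's new "raspi4b"
# machine. File paths are placeholders taken from a Raspberry Pi OS image.
import subprocess

subprocess.run([
    "qemu-system-aarch64",
    "-machine", "raspi4b",                 # machine type added in QEMU 9.0
    "-kernel", "kernel8.img",              # from the image's boot partition
    "-dtb", "bcm2711-rpi-4-b.dtb",         # device tree, same partition
    "-drive", "file=raspios-arm64.img,format=raw,if=sd",
    "-append", "root=/dev/mmcblk1p2 rootwait console=ttyAMA0",
    "-nographic",                          # serial console; no Ethernet/PCIe yet
], check=True)
```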

Still on Arm, QEMU 9.0 adds board support for the mps3-an536 (MPS3 development board with AN536 firmware) and the B-L475E-IOT01A IoT node, plus architectural feature support for Nested Virtualization, Enhanced Counter Virtualization, and Enhanced Nested Virtualization.

If you develop applications for the LoongArch architecture, QEMU 9.0 adds LoongArch KVM acceleration support, including the LSX/LASX vector extensions, which provide the architecture’s 128-bit and 256-bit Single Instruction Multiple Data (SIMD) units, respectively.

LoongArch KVM

For RISC-V, this QEMU version adds ISA/extension support for Zacas (amocas instructions), the RVA22 profiles, Ztso, and many others. You’ll also get SMBIOS support for the RISC-V virt machine, updates to the RHCT table, ACPI support for SRAT, SLIT, AIA, and PLIC, and several other fixes.

HPPA and s390x have received a few updates as well: s390x gains LAE fixes and emulation support for the CVB, CVBY, CVBG, and CVDG instructions, while HPPA gets a SeaBIOS-hppa firmware update to version 16.

Overall, the QEMU 9.0 release contains over 2,700 commits from 220 authors and improves several other areas beyond ISA emulation. For instance, memory backend preallocation is now handled concurrently by multiple threads, and virtio-blk now supports multiqueue, allowing the different queues of a single disk to be processed by different I/O threads. More details may be found in the release announcement.
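For the virtio-blk change, the practical knob is mapping a single disk’s virtqueues onto several I/O threads. A hedged sketch of such an invocation follows; the iothread-vq-mapping device property is the mechanism involved here, but its exact syntax should be checked against your QEMU build, since the property takes a JSON array:

```python
# Sketch: give one virtio-blk disk two I/O threads so that its queues can
# be serviced concurrently. The "iothread-vq-mapping" property takes a
# JSON array, hence the JSON form of -device; verify against your QEMU.
import json
import subprocess

device = {
    "driver": "virtio-blk-pci",
    "drive": "d0",
    "iothread-vq-mapping": [{"iothread": "io0"}, {"iothread": "io1"}],
}

subprocess.run([
    "qemu-system-x86_64",
    "-machine", "q35,accel=kvm",
    "-m", "4G",
    "-object", "iothread,id=io0",
    "-object", "iothread,id=io1",
    "-blockdev", "driver=file,filename=disk.img,node-name=f0",
    "-blockdev", "driver=raw,file=f0,node-name=d0",
    "-device", json.dumps(device),
], check=True)
```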

The post QEMU 9.0 released with Raspberry Pi 4 support and LoongArch KVM acceleration appeared first on CNX Software - Embedded Systems News.

BitNetMCU project enables Machine Learning on CH32V003 RISC-V MCU

Neural networks on the CH32V003

Neural networks and other machine learning workloads are usually associated with powerful processors and GPUs. However, as we’ve seen on this blog, AI is also moving to the very edge, and the BitNetMCU open-source project further shows that it is possible to run low-bit quantized neural networks on low-end RISC-V microcontrollers such as the inexpensive CH32V003.

As a reminder, the CH32V003 is based on the QingKe 32-bit RISC-V2A processor, which supports two levels of interrupt nesting. It is a compact, low-power, general-purpose 48 MHz microcontroller with 2KB of SRAM and 16KB of flash, offered in TSSOP20, QFN20, SOP16, and SOP8 packages.

CH32V003 machine learning

To run machine learning on the CH32V003 microcontroller, the BitNetMCU project combines Quantization Aware Training (QAT) with fine-tuning of the inference code and model structure, which makes it possible to surpass 99% test accuracy on a 16×16 MNIST dataset without using any multiplication instructions. This performance is impressive considering the 48 MHz chip only has 2KB of RAM and 16KB of flash.
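The core trick in QAT is to quantize the weights in the forward pass while letting gradients flow through unquantized, a so-called straight-through estimator. Below is a minimal PyTorch sketch of that idea, restricting weights to signed powers of two so that inference needs only shifts and adds; it illustrates the general technique, not BitNetMCU’s exact model or quantization scheme (that lives in training.py):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuantLinear(nn.Linear):
    """Linear layer whose weights are quantized to signed powers of two in
    the forward pass (so a MAC becomes shift+add on an MCU), while the
    backward pass uses a straight-through estimator (STE)."""

    def forward(self, x):
        w = self.weight
        scale = w.abs().mean() + 1e-8
        # Snap each weight magnitude to the nearest power of two.
        q = torch.sign(w) * 2.0 ** torch.round(torch.log2(w.abs() / scale + 1e-8))
        q = q * scale
        # STE: forward uses the quantized q, backward sees the identity.
        w_q = w + (q - w).detach()
        return F.linear(x, w_q, self.bias)

# Tiny MLP sized for 16x16 MNIST inputs, as used by the project.
model = nn.Sequential(nn.Flatten(), QuantLinear(16 * 16, 64),
                      nn.ReLU(), QuantLinear(64, 10))
x = torch.randn(8, 1, 16, 16)  # a batch of 16x16 grayscale digits
print(model(x).shape)          # torch.Size([8, 10])
```

Trained this way, the weights the model learns to rely on are already representable in the low-bit format the MCU will use, which is why accuracy survives quantization.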

The training pipeline for this project is based on PyTorch and consists of several Python scripts. These include:

  • trainingparameters.yaml – configuration file that sets all the parameters for training the model
  • training.py – trains the model and stores it in the model data folder as a .pth file (weights are kept as floats, with quantization happening on the fly during training)
  • exportquant.py – converts the stored trained model into a quantized format and exports it as a C header file (BitNetMCU_model.h), as sketched below
  • test-inference.py (optional) – calls the DLL compiled from the inference code to test it and compare results with the original Python model
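The export step is conceptually simple: pack each layer’s quantized weights into integer arrays that a C compiler can place in flash. Here is a simplified sketch of what exportquant.py produces, using made-up layer data; the real script’s packing format and header layout are defined in the repository:

```python
import numpy as np

def export_c_header(layers, path="BitNetMCU_model.h"):
    """Write quantized weight arrays as a C header (simplified sketch)."""
    lines = ["// Auto-generated quantized model data (illustrative format)",
             "#include <stdint.h>", ""]
    for name, weights in layers.items():
        flat = weights.astype(np.int8).flatten()
        values = ", ".join(str(v) for v in flat)
        lines.append(f"static const int8_t {name}[{flat.size}] = {{{values}}};")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Made-up 4-bit-range weights for two tiny layers, just to show the format.
export_c_header({
    "L1_weights": np.random.randint(-8, 8, size=(4, 4)),
    "L2_weights": np.random.randint(-8, 8, size=(2, 4)),
})
```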

The BitNetMCU Project Structure for Quantized Neural Networks

The inference engine (BitNetMCU_inference.c) is implemented in ANSI C, so you can use it with the CH32V003 RISC-V MCU or port it to any other microcontroller. You can test inference on 10 digits by compiling and executing BitNetMCU_MNIST_test.c; the model data is in the BitNetMCU_model.h file, and the test data is in the BitNetMCU_MNIST_test_data.h file. You can check out the code and follow the instructions in the readme.md file on GitHub to give machine learning on the CH32V003 a try.
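The optional test-inference.py mentioned above works by loading the compiled inference code as a shared library. The pattern looks like the ctypes sketch below; note that the library filename and the exported function’s name and signature are illustrative placeholders, not the repository’s exact exports:

```python
# Sketch of calling the C inference code from Python via ctypes, as
# test-inference.py does. The library name and function signature are
# illustrative placeholders, not the repo's exact exports.
import ctypes
import numpy as np

lib = ctypes.CDLL("./BitNetMCU_inference.so")  # built from BitNetMCU_inference.c
lib.run_inference.restype = ctypes.c_int       # hypothetical: returns the digit
lib.run_inference.argtypes = [ctypes.POINTER(ctypes.c_int8), ctypes.c_int]

image = np.zeros(16 * 16, dtype=np.int8)       # one flattened 16x16 test digit
ptr = image.ctypes.data_as(ctypes.POINTER(ctypes.c_int8))
print("predicted digit:", lib.run_inference(ptr, image.size))
```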

The post BitNetMCU project enables Machine Learning on CH32V003 RISC-V MCU appeared first on CNX Software - Embedded Systems News.

Android no longer supports RISC-V, for now…

Android RISC-V

Google dropped RISC-V support from Android’s Generic Kernel Image (GKI) in recently merged patches. Filed under the name “Remove ACK’s support for riscv64” and carrying the description “support for riscv64 GKI kernels is discontinued”, the patches on the AOSP tracker removed RISC-V kernel support, RISC-V kernel build support, and RISC-V emulator support.

In simple terms, the next Android OS version using the latest GKI release won’t work on devices powered by RISC-V chips. Therefore, companies wanting to ship a RISC-V Android build will have to create and maintain their own kernel branch, including the ACK RISC-V patches.

A RISC-V prototype chip

These abbreviations can be confusing, so let’s unpack them, starting with ACK. There’s the official Linux kernel, and Google does not certify Android devices that ship with this mainline Linux kernel. Google only maintains and certifies the ACK (Android Common Kernel) branches, which are downstream from the official Linux kernel. The main ACK branch is android-mainline, the primary development branch, which is forked into Generic Kernel Image (GKI) kernel branches, each corresponding to a specific combination of supported Linux kernel and Android OS version.

In a nutshell, ACK refers to the Linux kernel plus some patches of interest to the Android community that haven’t been merged into the mainline or Long Term Support (LTS) Linux kernels, while GKI refers to a kernel built from one of these ACK branches. Every certified Android device ships with a kernel from one of these GKI branches.

Matthias Männich, a senior staff software engineer at Google, uploaded these patches for review on April 26th, and they passed review by May 1st, when they were merged into the android15-6.6 and android-mainline branches.

The four merged Android changes on the AOSP tracker

This update might inconvenience chip companies that were planning to launch RISC-V CPUs for Android devices. Qualcomm, for instance, was planning to power the next generation of Wear OS wearable platforms with RISC-V CPUs.

However, this change is not necessarily permanent, and Google hasn’t killed RISC-V support forever. According to a Google spokesperson, the company is not ready to provide a single supported image for all vendors due to the architecture’s rapid rate of iteration. Therefore, Android should regain RISC-V support, just not now, but at some unknown later date.

In the meantime, the RISC-V community has published a RISC-V boot and runtime services specification to help system vendors and operating system vendors interoperate with one another. This specification enables an OS to use system management and device discovery services when running on a RISC-V chip, and it could help with OS ports, including future Android RISC-V implementations.

Via Android Authority

The post Android no longer supports RISC-V, for now… appeared first on CNX Software - Embedded Systems News.
