People presence detection (visual wake word)
Human detection on a high-performance MCU.
Approach
- We selected a pre-trained neural network model from Google's Visual Wake Words to handle presence detection
- The model is already integrated in the FP-AI-VISION1 function pack (built for the STM32H747 Discovery kit)
- The model is optimized using STM32Cube.AI (a sketch of the resulting inference calls is shown below)
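For reference, the minimal sketch below shows how an STM32Cube.AI-generated model is typically driven from application code. The file and symbol names (network.h, ai_network_create, AI_NETWORK_* macros) follow the code the tool usually generates for a model named "network"; the exact names depend on the model name and tool version, so treat this as an illustration rather than the FP-AI-VISION1 implementation, and the vww_* wrapper names are hypothetical.

```c
/* Illustrative sketch only: names follow the code STM32Cube.AI typically
 * generates for a model called "network" and may differ between versions. */
#include "network.h"       /* generated inference API */
#include "network_data.h"  /* generated weights/activations descriptors */

static ai_handle net = AI_HANDLE_NULL;

/* Scratch buffer for intermediate activations. */
AI_ALIGNED(4)
static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];

/* Quantized input frame and output scores (2 classes). */
AI_ALIGNED(4)
static ai_i8 in_data[AI_NETWORK_IN_1_SIZE_BYTES];
AI_ALIGNED(4)
static ai_i8 out_data[AI_NETWORK_OUT_1_SIZE_BYTES];

int vww_init(void)  /* hypothetical wrapper name */
{
  ai_network_params params = AI_NETWORK_PARAMS_INIT(
      AI_NETWORK_DATA_WEIGHTS(ai_network_data_weights_get()),
      AI_NETWORK_DATA_ACTIVATIONS(activations));

  ai_error err = ai_network_create(&net, AI_NETWORK_DATA_CONFIG);
  if (err.type != AI_ERROR_NONE)
    return -1;
  if (!ai_network_init(net, &params))
    return -1;
  return 0;
}

int vww_run(void)  /* returns the number of batches processed (1 on success) */
{
  ai_buffer ai_input[AI_NETWORK_IN_NUM]   = AI_NETWORK_IN;
  ai_buffer ai_output[AI_NETWORK_OUT_NUM] = AI_NETWORK_OUT;

  ai_input[0].data  = AI_HANDLE_PTR(in_data);
  ai_output[0].data = AI_HANDLE_PTR(out_data);

  return ai_network_run(net, ai_input, ai_output);
}
```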
Sensor
Data
2 classes: people / no-people
96x96 color image for MobileNet v1 0.25
128x128 color image for MobileNet v2 0.35 (an illustrative pre-/post-processing sketch follows below)
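Since the models expect a fixed-size RGB input and produce two class scores, the surrounding pre- and post-processing is simple. The sketch below shows an illustrative nearest-neighbour resize to the model input and a people/no-people decision on the two output scores; FP-AI-VISION1 ships its own optimized image-processing library, and the class order and uint8 score type assumed here should be checked against the generated model report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative nearest-neighbour downscale of an RGB888 camera frame to the
 * 96x96x3 (or 128x128x3) model input. Stand-in for the FP-AI-VISION1
 * image pre-processing library. */
static void resize_rgb888_nn(const uint8_t *src, int src_w, int src_h,
                             uint8_t *dst, int dst_w, int dst_h)
{
  for (int y = 0; y < dst_h; y++) {
    int sy = (y * src_h) / dst_h;
    for (int x = 0; x < dst_w; x++) {
      int sx = (x * src_w) / dst_w;
      const uint8_t *p = &src[(sy * src_w + sx) * 3];
      uint8_t *q = &dst[(y * dst_w + x) * 3];
      q[0] = p[0];
      q[1] = p[1];
      q[2] = p[2];
    }
  }
}

/* Assumed class order {no-people, people}; verify against the generated
 * model report. A small margin reduces flicker on borderline frames. */
enum { VWW_NO_PERSON = 0, VWW_PERSON = 1 };

static bool vww_person_detected(const uint8_t scores[2], uint8_t margin)
{
  return scores[VWW_PERSON] > (uint16_t)(scores[VWW_NO_PERSON] + margin);
}
```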
Results
Model: MobileNet v1 0.25 quantized
Input size: 96x96x3
Memory footprint:
214 KB Flash for weights
40 KB RAM for activations
Accuracy: 85% on a subset of the COCO dataset
Performance on STM32H747* @ 400 MHz
Inference time: 36 ms
Frame rate: 28 fps
Model: MobileNet v2 0.35 quantized
Input size: 128x128x3
Memory footprint:
402 KB Flash for weights
224 KB RAM for activations
Accuracy: 91% on a subset of the COCO dataset
Performance on STM32H747* @ 400 MHz
Inference time: 110 ms
Frame rate: 9 fps
* As measured with STM32Cube.AI 7.1.0 in FP-AI-VISION1 3.1.0 (see the timing sketch below)
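The frame rates above follow directly from the inference times (roughly 1000 ms divided by the per-inference latency: 1000 / 36 ms ≈ 28 fps, 1000 / 110 ms ≈ 9 fps). The sketch below shows one simple way to reproduce such a measurement on the target using the standard HAL millisecond tick; vww_run() is the hypothetical wrapper from the earlier sketch, and averaging over several runs compensates for the 1 ms tick resolution.

```c
#include <stdint.h>
#include "stm32h7xx_hal.h"   /* HAL_GetTick(): 1 ms SysTick time base */

extern int vww_run(void);    /* hypothetical wrapper around ai_network_run(),
                                see the inference sketch above */

/* Average the inference time over several runs and derive the sustainable
 * frame rate as 1000 / average latency in ms. */
static uint32_t vww_measure_fps(uint32_t iterations)
{
  uint32_t start = HAL_GetTick();
  for (uint32_t i = 0; i < iterations; i++) {
    (void)vww_run();
  }
  uint32_t avg_ms = (HAL_GetTick() - start) / iterations;
  return (avg_ms > 0u) ? (1000u / avg_ms) : 0u;
}
```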
Resources
X-CUBE-AI is a free STM32Cube expansion package that allows developers to automatically convert pre-trained AI algorithms, such as neural network and machine learning models, into optimized C code for STM32; a sketch of the generated artifacts is shown below.
Compatible with STM32H7 series
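As an illustration of what that conversion produces, the sketch below lists the kind of files STM32Cube.AI typically generates for a model named "network" and shows how the Flash/RAM figures in the Results section map onto them. The file names, macros, and the ".axisram" linker section are assumptions that depend on the tool version and on the project's linker script.

```c
/* Typical artifacts generated by STM32Cube.AI for a model named "network"
 * (names vary with the model name and tool version):
 *   network.h, network.c           - inference API and optimized kernels
 *   network_data.h, network_data.c - weight tables and size macros
 *
 * The weight tables are const arrays, so the linker places them in Flash.
 * The activation scratch buffer is a plain RAM array that can be pinned to
 * a fast internal SRAM bank; ".axisram" is an example section name that
 * must exist in the linker script. */
#include "network_data.h"

__attribute__((section(".axisram"), aligned(32)))
static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
```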
The STM32 family of 32-bit microcontrollers based on the Arm Cortex®-M processor is designed to offer new degrees of freedom to MCU users. It offers products combining very high performance, real-time capabilities, digital signal processing, low-power / low-voltage operation, and connectivity, while maintaining full integration and ease of development.