To demonstrate the transformative potential of embedded vision in retail, our partner e2ip developed the Edge AI Sensing Kit, based on the STM32N6 MCU, as a proof of concept. Instead of addressing a single operational challenge, this kit showcases how real-time object detection and counting—applied here to fruits and vegetables—can be performed directly at the device level. This approach eliminates the need for manual inventory checks or costly cloud infrastructure. By enabling localized, on-device intelligence, the kit illustrates how edge AI can drive operational efficiency, minimize stockouts, and scale seamlessly across diverse retail environments.

Approach

The Edge AI Sensing Kit was initially introduced to highlight the capabilities of embedded vision intelligence, with fruit detection and counting as its first demonstration scenario. This use case underscores its relevance for smart retail, where accurate, real-time inventory tracking is critical. While the initial focus is on grocery environments, the underlying edge AI vision technology is broadly applicable—spanning smart homes, smart cities, room occupancy monitoring, and other context-aware automation systems.

Traditional vision-based detection systems have largely depended on cloud computing for image data processing and analysis. While effective, these solutions require continuous connectivity, incur high bandwidth and infrastructure costs, and introduce latency. This centralized approach can limit scalability and responsiveness.

Edge AI fundamentally changes this paradigm by bringing intelligence directly to the device. Real-time processing at the edge reduces reliance on external infrastructure, cuts operational costs, and enhances speed and efficiency. This decentralized approach also improves scalability, security, and accessibility, making vision-based applications viable across a wide range of industries.
The Edge AI Sensing Platform Discovery Kit leverages the STM32N6 MCU to run a real-time object detection and counting model entirely on-device. Key benefits include:
  • Low-latency performance (typically under 200 ms) for immediate, real-time inventory visibility
  • Full edge processing, with no dependency on cloud servers
  • Enhanced accuracy, with AI models trained to detect multiple product types under varying lighting and arrangement conditions
  • Ultra-low power consumption, enabling continuous monitoring without straining energy budgets
  • Reduced operational costs, by eliminating the need for cloud servers, recurring data costs, or additional tagging infrastructure
[Figure: smart retail application principle]
By harnessing edge AI on the STM32N6 MCU, this solution demonstrates how intelligent, scalable, and cost-effective inventory management can be achieved in modern retail environments.

Application overview

A 5MP camera captures high-resolution images, which are processed by the image signal processing (ISP) subsystem. A downscaled 256×256 frame is sent to the neural processing unit (Neural-ART Accelerator) for fruit and hand detection. Bounding boxes are post-processed to count fruits by type and detect interactions. Events are managed by the Event Controller and logged with timestamps. In parallel, the ISP provides a 960×960 frame for video encoding via the H.264 encoder; the encoded stream is sent over USB or Wi-Fi. An IoT controller handles telemetry for remote monitoring.
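The counting and interaction steps described above can be sketched in Python. The detection tuple format, class names, and thresholds below are illustrative assumptions for this sketch, not the kit's actual firmware API:

```python
from collections import Counter

# Hypothetical detector output: (class_name, confidence, (x, y, w, h)) tuples
# with boxes normalized to the 256x256 NPU input frame.
detections = [
    ("apple",  0.91, (0.10, 0.20, 0.15, 0.15)),
    ("apple",  0.87, (0.40, 0.22, 0.14, 0.16)),
    ("banana", 0.78, (0.70, 0.55, 0.20, 0.10)),
    ("hand",   0.83, (0.42, 0.25, 0.18, 0.20)),
]

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def count_fruits(dets, conf_thresh=0.5):
    """Count confident detections per fruit class, ignoring hands."""
    return Counter(c for c, conf, _ in dets
                   if conf >= conf_thresh and c != "hand")

def hand_interactions(dets, iou_thresh=0.05):
    """Fruit classes whose boxes overlap a detected hand (a pick/place event)."""
    hands = [box for c, _, box in dets if c == "hand"]
    return [c for c, _, box in dets
            if c != "hand" and any(iou(box, h) > iou_thresh for h in hands)]

print(count_fruits(detections))      # e.g. Counter({'apple': 2, 'banana': 1})
print(hand_interactions(detections)) # the apple overlapping the hand box
```

On the real device this logic runs as post-processing on the Neural-ART detector output, with the per-class counts handed to the Event Controller for timestamped logging.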

[Figure: smart retail application principle]

Sensor

The sensor used in this mockup is the Sony IMX335, a 5MP RGB sensor:

  • Input resolution: 2592×1944
  • ISP resize: 960×960
  • Frame rate: 15 fps
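One plausible way for the ISP to map the 2592×1944 sensor frame to the square 960×960 output is a center crop followed by a resize; the sketch below computes that mapping. The crop-then-scale strategy is an assumption for illustration, not a statement about the actual ISP configuration:

```python
def center_crop_then_scale(src_w, src_h, out_side):
    """Square center-crop followed by a resize to out_side x out_side.
    Returns the crop rectangle (x0, y0, side) and the downscale factor."""
    side = min(src_w, src_h)          # largest square that fits the frame
    x0 = (src_w - side) // 2          # horizontal crop offset
    y0 = (src_h - side) // 2          # vertical crop offset
    return (x0, y0, side), side / out_side

crop, scale = center_crop_then_scale(2592, 1944, 960)
print(crop, round(scale, 3))  # (324, 0, 1944) 2.025
```

The same arithmetic applies to the further 960→256 downscale feeding the NPU (a 3.75× factor).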

E2ip will supply additional extension boards with various sensors to meet the specific requirements of the application.

[Figure: smart retail application principle]

Dataset and model

Dataset:

  • Custom dataset (7 classes)
  • 8k original images
  • 115k augmented images
  • 510k annotations
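The dataset figures above imply roughly a 14× augmentation factor and about 4.4 labeled objects per image; this is simple arithmetic on the numbers listed, not additional published data:

```python
# Figures copied from the dataset list above; only the ratios are derived.
originals, augmented, annotations = 8_000, 115_000, 510_000

aug_factor = augmented / originals        # augmented variants per original image
objs_per_image = annotations / augmented  # labeled objects per augmented image

print(f"{aug_factor:.1f}x augmentation, {objs_per_image:.1f} objects/image")
# → 14.4x augmentation, 4.4 objects/image
```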

Model:

  • YOLOv8-nano
  • Input size: 256×256×3
  • Trainable parameters: 3,031,321
  • MACC: 6.73 × 10⁸ (multiply-accumulate operations per inference)

Results

  • Weights (Flash): 2.9 MB
  • Activations (RAM): 880 KB
  • Inference time: 33 ms
  • Inferences per second: 30
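The published figures are internally consistent, as a quick back-of-the-envelope check shows; this is pure arithmetic on the numbers from this page, not a new measurement:

```python
# Values copied from the model and results sections above.
macc = 6.73e8          # multiply-accumulates per inference
inference_s = 0.033    # 33 ms measured inference time

fps = 1.0 / inference_s
effective_gmacs = macc / inference_s / 1e9  # sustained compute implied

print(f"{fps:.0f} inferences/s")            # → 30, matching the figure above
print(f"{effective_gmacs:.1f} GMAC/s sustained")
```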

Author: E2IP Technologies & Siana Systems | Last update: May 2025

Optimized with

STM32Cube.AI

Most suitable for

STM32N6 Series

Resources

Optimized with STM32Cube.AI

A free STM32Cube expansion package, X-CUBE-AI automatically converts pretrained AI models, such as neural networks and machine learning algorithms, into optimized C code for STM32.


Most suitable for STM32N6 Series

The STM32 family of 32-bit microcontrollers based on the Arm Cortex®-M processor is designed to offer new degrees of freedom to MCU users. It offers products combining very high performance, real-time capabilities, digital signal processing, low-power / low-voltage operation, and connectivity, while maintaining full integration and ease of development.
