Arm® Cortex®-M4 in a nutshell
The 32-bit Arm® Cortex®-M4 processor core is the first core of the Cortex-M lineup to feature dedicated Digital Signal Processing (DSP) IP blocks, including an optional Floating-Point Unit (FPU). It addresses digital signal control applications that require efficient, easy-to-use control and signal processing capabilities, such as IoT, motor control, power management, embedded audio, industrial and home automation, and healthcare and wellness applications.
Just like the Cortex-M3 core, the Cortex-M4 core achieves 1.25 DMIPS/MHz and 3.42 CoreMark/MHz thread performance.
Key features of Arm® Cortex®-M4 core
- Armv7E-M architecture
- Bus interfaces: 3x AMBA AHB-Lite interfaces (Harvard bus architecture); AMBA ATB interface for CoreSight debug components
- Thumb/Thumb-2 subset instruction support
- 3-stage pipeline
- DSP extensions: single-cycle 16/32-bit MAC, single-cycle dual 16-bit MAC, 8/16-bit SIMD arithmetic, hardware divide (2-12 cycles)
- Optional single precision Floating Point Unit (FPU), IEEE 754-compliant
- Optional 8 MPU regions with sub-regions and background region
- Integrated bit-field processing instructions and bus-level bit banding
- Non-maskable interrupt and 1 to 240 physical interrupts with 8 to 256 priority levels
- Wake-up interrupt controller
- Integrated WFI and WFE instructions and sleep-on-exit capability; sleep and deep-sleep signals; optional retention mode with the Arm Power Management Kit
- Optional JTAG and Serial Wire Debug ports. Up to 8 breakpoints and 4 watchpoints
- Optional Instruction Trace (ETM), Data Trace (DWT), and Instrumentation Trace (ITM)
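The bus-level bit banding listed above maps every bit of a bit-band region to its own word in an alias region, so a single bit can be set or cleared with one ordinary store. The sketch below only computes the alias address. It assumes the SRAM bit-band region at 0x20000000 with its alias at 0x22000000, as in the Cortex-M memory map. These addresses are not given in this page, the function name is hypothetical, and a given device may not implement bit banding at all.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE   0x20000000u  /* assumed start of the SRAM bit-band region */
#define SRAM_BITBAND_ALIAS  0x22000000u  /* assumed start of the matching alias region */

/* Hypothetical helper: address of the alias word that maps to bit `bit`
 * (0..7) of the byte at `byte_addr` inside the SRAM bit-band region.
 * Each bit occupies one 32-bit word in the alias region, hence the
 * factors 32 (bytes per target byte) and 4 (bytes per target bit). */
static uint32_t sram_bitband_alias(uint32_t byte_addr, uint32_t bit)
{
    uint32_t byte_offset = byte_addr - SRAM_BITBAND_BASE;
    return SRAM_BITBAND_ALIAS + byte_offset * 32u + bit * 4u;
}
```

On a target that implements the feature, writing 1 or 0 through a `volatile uint32_t *` pointing at the returned address would set or clear that single bit without a read-modify-write sequence in software.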
Key advantages of Arm® Cortex®-M4 MCUs
Microcontrollers based on the Cortex-M4 core benefit from the Armv7E-M architecture. The Armv7E-M architecture is built on the Armv7-M architecture from the Cortex-M3 core and offers additional DSP extensions, such as:
- Single Instruction Multiple Data (SIMD) processing
- saturation arithmetic instructions
- a wide range of MAC instructions that can execute in a single cycle
- an optional FPU that supports single-precision floating-point operations.
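To make the SIMD and saturation items above concrete, here is a minimal portable C model of a dual 16-bit saturating add on one packed 32-bit word. It is a sketch for illustration, not vendor code: the function names are hypothetical, and on a Cortex-M4 the same operation corresponds to the QADD16 instruction, which processes both lanes at once.

```c
#include <stdint.h>

/* Saturating add of two 16-bit values: clamps to the int16_t range
 * instead of wrapping around on overflow. */
static int16_t q15_add_sat(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;   /* widen so the sum cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Portable model of a dual 16-bit saturating add: the two lanes
 * (bits 15..0 and bits 31..16) are added independently.
 * The uint16_t -> int16_t casts assume two's-complement conversion,
 * which holds on all mainstream compilers. */
static uint32_t dual_q15_add_sat(uint32_t x, uint32_t y)
{
    int16_t lo = q15_add_sat((int16_t)(uint16_t)(x & 0xFFFFu),
                             (int16_t)(uint16_t)(y & 0xFFFFu));
    int16_t hi = q15_add_sat((int16_t)(uint16_t)(x >> 16),
                             (int16_t)(uint16_t)(y >> 16));
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}
```

Saturation matters in control loops because a wrapped overflow flips the sign of a result, whereas a clamped one only limits its magnitude.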
This architecture is well suited to real-time control applications that require highly deterministic operation with low cycle counts, minimal interrupt latency, a short pipeline, and the ability to run without a cache.
Digital Signal Processing
Microcontrollers based on the Cortex-M4 rely on its built-in advanced DSP hardware accelerators to process signals using mathematical calculations. Once digitized, any analog signal can be processed this way, such as the output signal of a microphone, the feedback from a sensor embedded in a motor control system, or outputs from sensor-fusion applications.
Thanks to Digital Signal Processing, fewer cycles are required to run control-loop algorithms, which improves both the performance and the power efficiency of the application. When algorithms are processed using Q1.15 or Float32 data formats, MCUs running on a Cortex-M4 offer much higher performance than MCUs based on the Cortex-M3. For the Q1.15 format, the improvement is mainly due to the availability of SIMD instructions, which allow the Cortex-M4 to roughly halve the number of cycles required. For the Float32 data format, the Floating-Point Unit raises the performance of Cortex-M4 MCUs by an order of magnitude compared to that of Cortex-M3 MCUs.
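The halving for Q1.15 data comes from handling two 16-bit samples per step. The sketch below models that idea in portable C: a dot product that packs two Q1.15 samples per 32-bit word and accumulates both products in one call. The helper names are hypothetical. On a Cortex-M4 the inner operation corresponds to the SMLAD dual multiply-accumulate instruction, whereas this model simply computes the same arithmetic lane by lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack two 16-bit samples into one 32-bit word (lo in bits 15..0). */
static uint32_t pack_q15(int16_t lo, int16_t hi)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Portable model of a dual 16-bit multiply-accumulate:
 * acc + lo(x)*lo(y) + hi(x)*hi(y). No overflow handling on acc. */
static int32_t dual_mac16(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t xl = (int16_t)(uint16_t)(x & 0xFFFFu);
    int16_t xh = (int16_t)(uint16_t)(x >> 16);
    int16_t yl = (int16_t)(uint16_t)(y & 0xFFFFu);
    int16_t yh = (int16_t)(uint16_t)(y >> 16);
    return acc + (int32_t)xl * yl + (int32_t)xh * yh;
}

/* Dot product of two Q1.15 vectors, two samples per iteration.
 * n is assumed to be even; the result is in Q2.30 (divide by 2^30
 * to get the real value). The caller must keep the sum within
 * the int32_t range. */
static int32_t q15_dot(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        acc = dual_mac16(pack_q15(a[i], a[i + 1]),
                         pack_q15(b[i], b[i + 1]), acc);
    }
    return acc;
}
```

With this layout a vector of n samples needs n/2 multiply-accumulate steps instead of n, which is where the factor of about two comes from.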
Cortex-M4 MCUs with DSP are sometimes marketed by alternative MCU manufacturers as Cortex-M4F MCUs. All STM32 Cortex-M4 MCUs embed the DSP option of the Cortex-M4 core, and they are all named Cortex-M4 MCUs.
Scalability and power efficiency
Arm Cortex-M4 microcontrollers support the Cortex Microcontroller Software Interface Standard (CMSIS), thereby enabling developers to port their code to or from different microcontrollers for future projects. This interface also eases the integration of third-party software, helping to reduce time to market.
The flexibility and scalability of the architecture of the Cortex-M4 allow designers to run most of the recent Machine Learning algorithms.
It is also extremely power efficient. Therefore, Cortex-M4 microcontrollers are excellent choices for IoT edge controllers or battery-operated sensor nodes, as well as consumer wearables.
The Cortex-M4 core is mostly embedded in single-core MCUs. However, a new generation of multi-core microcontrollers and microprocessors pushes the limits of system integration and performance optimization, implementing two task-partitioning use cases:
- The Cortex-M4 can be used as the main control core, associated with the more energy-efficient Cortex-M0+ core, which can run radio protocols more efficiently.
- The Cortex-M4 core can be used as the real-time, general-purpose companion core to the computing horsepower of the Cortex-M7 or Cortex-A7 cores, which process advanced graphics and complex digital signal processing algorithms and/or run the open-source Linux operating system and libraries.
Microcontrollers based on the Arm Cortex-M4
STMicroelectronics combines the Arm Cortex-M4 core with its proprietary low-power silicon intellectual property, non-volatile embedded memory technology, hardware accelerators (CORDIC for trigonometric and hyperbolic calculations, FMAC for filtering), high-performance architectures, and wireless connectivity expertise. The resulting STM32 Arm Cortex-M4 MCUs address the many technical and commercial challenges engineers need to solve.
STM32 Cortex-M4 microcontrollers are fully integrated into the STM32Cube development environment and leverage the tools and solutions offered by ST's extensive network of partners.
Single-Core Series of Arm M4 microcontrollers
|Series||Speed (MHz)||Performance (CoreMark)||Flash (kB)||RAM (kB)||Power Supply (V)||Packages||Connectivity||Analog|
|STM32F3||72||245||16 to 512||16 to 80||1.8 to 3.6||LQFP48/64/100, UFBGA100, UFQFPN32, WLCSP49/66/100||USART, SPI, I2C, I2S, CAN, USB||Yes|
|STM32L4||80||273||64 to 1024||40 to 320||1.7 to 3.6||LQFP32/48/64/100/144, UFBGA64/100/132, UFQFPN32/48, WLCSP36/49/64/72/100||USART, SPI, I2C, CAN, USB, Chrom-ART||Yes|
|STM32L4+||120||409||512 to 2048||320 to 640||1.7 to 3.6||LQFP48/64/100/144, UFBGA132/169, UFQFPN48, WLCSP100||USART, SPI, I2C, CAN, USB, TFT, MIPI DSI, Chrom-ART||Yes|
|STM32G4||170||550||32 to 512||118 to 128||1.7 to 3.6||LQFP32/48/64/80/100/128, UFBGA64/100/121, UFQFPN32/48, WLCSP48/64/81||USART, SPI, I2C, CAN, USB||Analog rich|
|STM32F4||84 to 180||up to 613||64 to 2048||32 to 384||1.7 to 3.6||LQFP48/64/100/144/176/208, BGA100/144/176/216, UFQFPN48, WLCSP36/49/81/90/143/168||USART, SPI, I2C, CAN, Ethernet, USB, TFT, MIPI DSI, Chrom-ART||Yes|
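As a quick cross-check of the table against the 3.42 CoreMark/MHz figure quoted for the core, the sketch below divides each published CoreMark score by the listed clock speed. The struct and function names are hypothetical, and the figures are copied from the rows above. STM32F4 is left out because the table gives it only as a range.

```c
/* One row of the single-core table: series name, clock speed in MHz,
 * and published CoreMark score. */
struct series_score {
    const char *name;
    double mhz;
    double coremark;
};

/* CoreMark per MHz for one table row. */
static double coremark_per_mhz(struct series_score s)
{
    return s.coremark / s.mhz;
}

/* Figures copied from the single-core table. */
static const struct series_score single_core[] = {
    { "STM32F3",   72.0, 245.0 },
    { "STM32L4",   80.0, 273.0 },
    { "STM32L4+", 120.0, 409.0 },
    { "STM32G4",  170.0, 550.0 },
};
```

The STM32F3, STM32L4 and STM32L4+ rows come out at about 3.4 CoreMark/MHz, and the STM32G4 row at about 3.24. Published scores also depend on the memory system and the compiler, so the ratios need not match the core figure exactly.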
Dual-Core Series (Wireless) of Arm M4 microcontrollers
|Series||Cortex-M4 speed (MHz)||Processor 2||Flash (kB)||RAM (kB)||Power Supply (V)||Packages||Connectivity||Wireless connectivity|
|STM32WL5||48||Cortex-M0+@48MHz||64 to 256||20 to 64||1.8 to 3.6||UFBGA73, UFQFPN48||USART, SPI, I2C||150 to 960MHz, LoRa, (G)FSK, (G)MSK, BPSK|
|STM32WB||64||Cortex-M0+@32MHz||256 to 1024||128 to 256||1.71 to 3.6||UFBGA129, UFQFPN48, VFQFPN68, WLCSP100||USART, SPI, I2C, USB||2.4GHz, 802.15.4, BLE5.0, Thread/OpenThread, Zigbee3.0|
Dual-Core Series (High Performance) of Arm M4 microcontrollers
|Series||Cortex-M4 speed (MHz)||Processor 2||Flash (kB)||RAM (kB)||Power Supply (V)||Packages||Connectivity||Analog|
| || || ||1024 to 2048||1024||1.62 to 3.6||UFBGA129/169, LQFP176/208, TFBGA240, WLCSP156||USART, SPI, I2C, USB, TFT-LCD, MIPI-DSI||Advanced analog|
|STM32MP1||209||Dual Cortex-A7@800MHz||na||708||1.71 to 3.6||LFBGA354/448, TFBGA257/361||USART, SPI, I2C, USB HS|| |
Get started developing on Arm Cortex-M4 core with our recommended starter kit
STM32 Nucleo-64 development board with STM32G474RE MCU, supports Arduino and ST morpho connectivity