STM32 SX1262 Encrypted LoRa (Part 1): Securing the Link with AES

Overview

This STM32 SX1262 Encrypted LoRa tutorial walks through a working secure point-to-point link between two STM32 Nucleo-F439ZI boards using the Semtech SX1262 modem at 915 MHz. Built in STM32CubeIDE, the project pairs the SX1262 with an MB85RS64B SPI FRAM for replay-state persistence and uses a layered firmware design that cleanly separates the radio driver, the Wire v3 protocol layer, and the application logic.

Across the series, the firmware sends authenticated, encrypted, replay-protected frames between the two STM32 nodes. It uses AES-128-CTR for confidentiality, AES-CMAC-128 for integrity, an epoch-per-boot counter scheme for replay protection over LoRa, and a single firmware image that selects its TX or RX role at runtime from the STM32 unique device ID — no compile-time switches, no separate builds.

Part 1 covers the firmware architecture, the SX1262-to-Nucleo-F439ZI hardware wiring, the STM32CubeIDE project structure, and the Wire v3 on-air frame format. The AES-CTR encryption, AES-CMAC authentication, FRAM replay-protection logic, and hostile-frame QA harness are covered in Parts 2 and 3. The goal of Part 1 is to get the layering and the LoRa wire format absolutely clear, because every later part stands on top of them.

What You Will Learn

  • How to organize an STM32 LoRa firmware into clean layers (application, protocol, driver, persistence, HAL) so that each layer has one job and one job only.
  • How to wire an SX1262 module to a Nucleo-F439ZI for IRQ-driven operation, using DIO1 as the only signal that wakes the MCU.
  • How to use the STM32 96-bit unique device ID to select the TX or RX role at boot, so a single firmware image runs on both nodes.
  • What the Wire v3 on-air frame format looks like, byte by byte, and how the implementation derives every offset from a single packed struct using offsetof() so the parser and builder can never disagree.
  • How to read a real Wire v3 header out of a UART log and confirm by hand that the frame-length math matches the protocol invariants.

Prerequisites

This series assumes you are already comfortable with the basics of STM32CubeIDE and the STM32 HAL. Specifically, you should be familiar with creating a CubeMX project, configuring SPI and USART peripherals, enabling EXTI lines, and using printf() redirected to a UART for debug output. We will not be re-covering those steps from zero.

You should also be comfortable reading C at the level of structs, function pointers, packed types, and offsetof(). The protocol layer leans on a small amount of compile-time machinery to keep the on-wire format honest, and we will look at how it works.

No prior LoRa experience is required, but if you have never seen a Semtech SX1262 datasheet, it is worth keeping it open in another tab. We will not be re-explaining the modem command set.

Materials List

  • 2 × STM32 Nucleo-F439ZI development boards (one for the TX node, one for the RX node — the firmware image is identical on both).
  • 2 × Semtech SX1262 LoRa modules. The reference build uses the Waveshare Core1262 breakout (915 MHz variant for the US ISM band; pick the variant that matches your region).
  • 2 × Adafruit 1897 MB85RS64B SPI FRAM breakouts (8 KByte each). The TX node uses the FRAM for replay-state persistence; the RX board has it populated for symmetry but does not write to it in normal operation.
  • Two 915 MHz (or region-appropriate) antennas.
  • Jumper wires, breadboards or carrier PCBs, and two USB cables for the ST-LINK virtual COM ports.
  • STM32CubeIDE (any reasonably recent version — the project ships with its own .cproject and .ioc).
  • A serial terminal program (PuTTY, minicom, screen, etc.) at 19200 8N1 for each board.

Waveshare Core1262 LF/HF LoRa Module, SX1262 chip

Project Structure

The repository is laid out so that each layer of the firmware lives in its own directory. The directory names are also the layer names — there is no separate naming scheme to memorize.

Plaintext
F439_CPP_TX-RX_LoRa_Project_01/
├── ADA1897_MB85RS64B/         <-- SPI FRAM driver (persistence layer)
│   ├── ada1897_mb85rs64b.c
│   └── ada1897_mb85rs64b.h
├── Core/                      <-- CubeMX-generated HAL + main.c
│   ├── Inc/
│   │   ├── main.h
│   │   ├── stm32f4xx_hal_conf.h
│   │   └── stm32f4xx_it.h
│   ├── Src/
│   │   ├── main.c
│   │   ├── stm32f4xx_hal_msp.c
│   │   ├── stm32f4xx_it.c
│   │   ├── syscalls.c
│   │   ├── sysmem.c
│   │   └── system_stm32f4xx.c
│   ├── Startup/
│   └── ThreadSafe/
├── Drivers/                   <-- ST HAL + CMSIS (CubeMX-managed)
├── SX1262/                    <-- SX1262 LoRa radio driver
│   ├── sx1262.c
│   └── sx1262.h
├── radioApp/                  <-- Application layer
│   ├── radio_app.c
│   └── radio_app.h
├── radioLink/                 <-- Protocol layer (Wire v3, crypto, replay)
│   ├── radio_link.c
│   ├── radio_link.h
│   └── radio_wire.h
├── qaTests/                   <-- Hostile-frame QA harness (covered in Part 3)
│   ├── protocol/
│   └── qaApp/
├── docs/                      <-- Doxygen output
├── scripts/                   <-- Build / instrumentation helpers
├── F439_CPP_TX-RX_LoRa_Project_01.ioc
├── STM32F439ZITX_FLASH.ld
├── Doxyfile
└── README.md

The five layers, top to bottom

The firmware is organized into five layers. Dependencies point downward only: a layer calls into the layers beneath it and never reaches back up.

  • Application layer (radioApp/) — application behavior. Knows what to send and what to do with what is received. Does not build wire frames, does not touch crypto, does not touch the SPI bus.
  • Protocol layer (radioLink/) — Wire v3 framing, AES-CTR encryption, AES-CMAC authentication, replay-counter management, and coordination with the persistence layer. This layer is intentionally transport-agnostic; the same framing code could run over USB or Ethernet tomorrow.
  • Radio driver (SX1262/) — the SX1262 command set, SPI transactions, BUSY/RESET/DIO1 handling, IRQ processing. Does not know there is a header, a CMAC, or a counter.
  • Persistence layer (ADA1897_MB85RS64B/) — SPI FRAM driver. Used by the protocol layer to store replay state across reboots. Used nowhere else.
  • HAL layer (Core/, Drivers/) — CubeMX-generated peripheral init, ISR shells, clock configuration, linker scripts.

The strict-layering rule is the single most important architectural decision in the project. If you remember nothing else from this post, remember this: the application layer never builds a wire frame, and the radio driver never knows what is in one.

Hardware Configuration / Pinouts

Overview

Both nodes are wired identically. That is part of the same-binary design — there are no role-specific jumpers, no resistor straps, no compile-time switches that change pin assignments. Every Nucleo-F439ZI in the project has the same hookup, and the firmware figures out which board it is at boot from the STM32 unique device ID.

The SX1262 module and the MB85RS64B FRAM share a single SPI bus (SPI1) with separate chip-select lines. The radio’s DIO1 line is the only asynchronous signal we care about — every TX-done, RX-done, CRC-error, and timeout event arrives on that one wire, routed through an EXTI interrupt.

Schematic


Schematic showing STM32 Nucleo-F439ZI connected to Waveshare Core1262 SX1262 LoRa module and MB85RS64V SPI FRAM via SPI1

Pinouts & Configurations

Pin assignments

The pin assignments below come straight out of Core/Inc/main.h and the SX1262_Handle initialization in main.c. They are not arbitrary — they line up with what CubeMX configured for SPI1, USART1, TIM3, and the EXTI line on PC10.

  • SPI1 — shared bus for both the SX1262 and the FRAM. Master mode, 8-bit, CPOL=0, CPHA=1-edge, MSB-first, software NSS, baud-rate prescaler /64. Configured by MX_SPI1_Init().
  • SX1262 chip-select (NSS) — PC9 (active-low, software-driven).
  • SX1262 BUSY — PC12 (input). Polled by the driver before each command.
  • SX1262 NRESET — PC11 (output, push-pull).
  • SX1262 DIO1 — PC10 (input, EXTI15_10_IRQn). The single interrupt source from the radio.
  • SX1262 TXEN / RXEN — PF6 / PF7 (outputs). Front-end RF switch control.
  • FRAM chip-select — PA15 (active-low, software-driven). Symbol SPI1_FRAM_CS_Pin in main.h.
  • USART1 — 19200 8N1, debug console, retargeted via _write(). Mapped to the ST-LINK virtual COM port on the Nucleo-F439ZI.
  • TIM3 — 1 MHz tick (prescaler 89, period 65535). Used by delay_us() for the microsecond busy-waits required by the radio command timing.
  • System clock — 180 MHz from the HSI through the PLL (HSI 16 MHz, PLLM=8, PLLN=180, PLLP=2). Over-drive enabled, FLASH latency 5.
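The clock and timer figures in the list above can be checked with a few lines of arithmetic. The sketch below is host-side C rather than project code. The function names are hypothetical, and it assumes TIM3 is fed a 90 MHz timer clock (APB1 at /4 with the usual x2 timer multiplier), which the list does not state explicitly.

```c
#include <stdint.h>

/* SYSCLK from the main PLL: (source / PLLM) * PLLN / PLLP. */
uint32_t pll_sysclk_hz(uint32_t src_hz, uint32_t pllm,
                       uint32_t plln, uint32_t pllp)
{
    return (src_hz / pllm) * plln / pllp;
}

/* Timer counter rate: the hardware divides by (prescaler + 1). */
uint32_t timer_tick_hz(uint32_t timer_clk_hz, uint32_t prescaler)
{
    return timer_clk_hz / (prescaler + 1u);
}

/* Ticks elapsed on a free-running 16-bit counter. The cast back to
 * uint16_t makes the subtraction wrap correctly when the counter
 * rolls over from 65535 to 0. */
uint16_t ticks_elapsed(uint16_t start, uint16_t now)
{
    return (uint16_t)(now - start);
}
```

With HSI = 16 MHz, PLLM = 8, PLLN = 180, and PLLP = 2, the first function gives 180 MHz, and a prescaler of 89 on a 90 MHz timer clock gives a 1 MHz tick. A delay_us() built on a 16-bit counter would typically compare ticks_elapsed() against the requested delay, which works for delays shorter than 65536 µs.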

The exact pin defines, taken verbatim from main.h:

C
#define SX1262_TX_ENABLE_Pin GPIO_PIN_6
#define SX1262_TX_ENABLE_GPIO_Port GPIOF
#define SX1262_RX_ENABLE_Pin GPIO_PIN_7
#define SX1262_RX_ENABLE_GPIO_Port GPIOF
#define MCO_Pin GPIO_PIN_0
#define MCO_GPIO_Port GPIOH
#define SX1262_CS_Pin GPIO_PIN_9
#define SX1262_CS_GPIO_Port GPIOC
#define SPI1_FRAM_CS_Pin GPIO_PIN_15
#define SPI1_FRAM_CS_GPIO_Port GPIOA
#define DIO1_LORA_Pin GPIO_PIN_10
#define DIO1_LORA_GPIO_Port GPIOC
#define DIO1_LORA_EXTI_IRQn EXTI15_10_IRQn
#define SX1262_NRESET_Pin GPIO_PIN_11
#define SX1262_NRESET_GPIO_Port GPIOC
#define SX1262_BUSY_Pin GPIO_PIN_12
#define SX1262_BUSY_GPIO_Port GPIOC

Why DIO1 only

The SX1262 has three DIO pins. We use only DIO1, and we do not poll for any radio event. Every interesting thing the radio does — finishing a transmit, finishing a receive, hitting a CRC error, hitting a timeout — fires DIO1 through the IRQ-mask register. This keeps the firmware single-edge-triggered: one wire, one EXTI line, one ISR. There is no race between “polled status” and “interrupt status” because there is no polling.

The IRQ events we mask in:

  • RX role: RX_DONE, CRC_ERROR, RX_TX_TIMEOUT.
  • TX role: TX_DONE, RX_TX_TIMEOUT.
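As a rough sketch, the two masks can be written as bit-ORs and packed into the SX1262 SetDioIrqParams command. The macro and function names here are illustrative, not taken from the project's sx1262.h. The bit positions and the 0x08 opcode follow the IRQ table and command list in the SX1262 datasheet as I understand them, so check them against your copy before relying on them.

```c
#include <stdint.h>
#include <stddef.h>

/* IRQ bit positions (assumed from the SX1262 datasheet IRQ table). */
#define SX_IRQ_TX_DONE   (1u << 0)
#define SX_IRQ_RX_DONE   (1u << 1)
#define SX_IRQ_CRC_ERR   (1u << 6)
#define SX_IRQ_TIMEOUT   (1u << 9)

/* Per-role masks matching the two bullets above. */
#define SX_IRQ_MASK_RX   (SX_IRQ_RX_DONE | SX_IRQ_CRC_ERR | SX_IRQ_TIMEOUT)
#define SX_IRQ_MASK_TX   (SX_IRQ_TX_DONE | SX_IRQ_TIMEOUT)

#define SX_OP_SET_DIO_IRQ_PARAMS  (0x08u)

/* Build the 9-byte SetDioIrqParams command: opcode, global IRQ mask,
 * then the DIO1, DIO2, and DIO3 masks, each 16 bits, MSB first.
 * Only DIO1 is routed; DIO2 and DIO3 stay masked off. */
size_t sx_build_dio_irq_cmd(uint16_t mask, uint8_t out[9])
{
    out[0] = SX_OP_SET_DIO_IRQ_PARAMS;
    out[1] = (uint8_t)(mask >> 8);    /* IrqMask high byte  */
    out[2] = (uint8_t)(mask & 0xFFu); /* IrqMask low byte   */
    out[3] = (uint8_t)(mask >> 8);    /* DIO1Mask high byte */
    out[4] = (uint8_t)(mask & 0xFFu); /* DIO1Mask low byte  */
    out[5] = 0u;                      /* DIO2Mask */
    out[6] = 0u;
    out[7] = 0u;                      /* DIO3Mask */
    out[8] = 0u;
    return 9u;
}
```

Under those assumptions the RX mask works out to 0x0242 and the TX mask to 0x0201. Setting the DIO1 mask equal to the global mask is what makes DIO1 the single wire for every enabled event.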

Project Setup

Open the project in STM32CubeIDE: File → Open Projects from File System, point it at the unzipped repository, and let the IDE import the .cproject. The .ioc file is included so you can re-open it in the CubeMX perspective if you want to inspect or change the peripheral configuration, but for a first build you do not need to.

The build target is Debug. It produces a single ELF that is flashed identically to both boards. There is no separate TX build and no separate RX build — that is the whole point of the runtime-role design.

Discovering and recording each board’s UID

Before you can decide which board is TX and which is RX, you need each board’s STM32 unique device ID. The firmware prints the UID on every boot:

C
  uint32_t uid[3] = {0x00};

  uid[0] = HAL_GetUIDw0();
  uid[1] = HAL_GetUIDw1();
  uid[2] = HAL_GetUIDw2();

  printf("UID words: %08lx %08lx %08lx\r\n", uid[0], uid[1], uid[2]);

Flash the firmware to each board in turn, open the serial console at 19200 8N1, and write down the three UID words. They are factory-burned, read-only, and unique to each STM32 — they are not secret, but they are stable for the life of the chip. Once you have both boards’ UIDs, paste them into main.c in the role-selection block:

C
  /* TODO: Fill these in from your UART prints (one-time) */
  const uint32_t RX_UID0 = 0x00280024U;
  const uint32_t RX_UID1 = 0x3133510aU;
  const uint32_t RX_UID2 = 0x36393739U;

  const uint32_t TX_UID0 = 0x002c003cU;
  const uint32_t TX_UID1 = 0x31335110U;
  const uint32_t TX_UID2 = 0x39323638U;

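The role-selection block that follows these constants compares the boot-time UID against both entries and has three branches: a match on the RX UID, a match on the TX UID, and no match at all. The third branch is the one to notice. If a board boots with a UID that is not in the table, the firmware refuses to start the radio at all. There is no “default to TX” or “default to RX”; an unknown board halts in Error_Handler(). That is deliberate. A node that does not know who it is must not transmit on a shared RF band.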
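The three-way decision can be sketched as a pure function over the three UID words, which also makes it testable on a host. This is an illustration, not the code from main.c: the enum type, SX_ROLE_TX, SX_ROLE_UNKNOWN, and both function names are assumptions, and only SX_ROLE_RX appears in the project excerpts. The UID constants are the example values shown above.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum {
    SX_ROLE_UNKNOWN = 0,  /* assumed name: UID not in the table */
    SX_ROLE_RX,
    SX_ROLE_TX            /* assumed name */
} sx_role_t;

/* Compare all three 32-bit UID words against one table entry. */
static bool uid_equals(const uint32_t uid[3],
                       uint32_t w0, uint32_t w1, uint32_t w2)
{
    return uid[0] == w0 && uid[1] == w1 && uid[2] == w2;
}

/* Map a 96-bit UID to a role. There is deliberately no default role. */
sx_role_t select_role(const uint32_t uid[3])
{
    if (uid_equals(uid, 0x00280024u, 0x3133510au, 0x36393739u)) {
        return SX_ROLE_RX;
    }
    if (uid_equals(uid, 0x002c003cu, 0x31335110u, 0x39323638u)) {
        return SX_ROLE_TX;
    }
    return SX_ROLE_UNKNOWN;  /* caller should halt before touching the radio */
}
```

On the target, the caller would fill uid[] from HAL_GetUIDw0(), HAL_GetUIDw1(), and HAL_GetUIDw2(), and call Error_Handler() when the result is SX_ROLE_UNKNOWN.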

First boot — what you should see

With both boards programmed and powered, the TX-side console will show the application sending and the RadioLink layer reporting the message counter after each successful transmission:

And the RX-side console will show the decoded plaintext, the parsed Wire v3 header in hex, and the link quality figures:

The RXHDR line is the raw 11-byte Wire v3 header followed by the total frame length on air. We are going to decode it byte by byte in the next section, and after that you will see why the layout was designed exactly this way.

Code Walkthrough

For Part 1, the walkthrough focuses on three things: the main() bring-up sequence, the radio-application interface, and the Wire v3 frame format. Crypto, replay protection, and the QA harness will get their own dedicated parts later in the series.

main() — staying boring on purpose

The whole point of the layered design is that main() ends up almost dull. It does CubeMX-generated peripheral init, then hands control to the application layer. There is no protocol code in main(), no crypto code, no replay logic.

C
int main(void)
{
    HAL_Init();
    SystemClock_Config();

    MX_GPIO_Init();
    MX_SPI1_Init();
    MX_USART1_UART_Init();
    MX_TIM3_Init();
    MX_CRYP_Init();

    HAL_TIM_Base_Start_IT(&htim3);  /* 1 MHz tick for delay_us() */
    FRAM_init(&hspi1);

    /* ... UID-based role selection (see Project Setup) ... */

    sx.hspi       = &hspi1;
    sx.NSS_Port   = GPIOC; sx.NSS_Pin   = GPIO_PIN_9;
    sx.DIO1_Port  = GPIOC; sx.DIO1_Pin  = GPIO_PIN_10;
    sx.RESET_Port = GPIOC; sx.RESET_Pin = GPIO_PIN_11;
    sx.BUSY_Port  = GPIOC; sx.BUSY_Pin  = GPIO_PIN_12;
    sx.TXEN_Port  = GPIOF; sx.TXEN_Pin  = GPIO_PIN_6;
    sx.RXEN_Port  = GPIOF; sx.RXEN_Pin  = GPIO_PIN_7;

    HAL_GPIO_WritePin(GPIOC, GPIO_PIN_9, GPIO_PIN_SET);
    HAL_Delay(1);

    RadioApp_Init(&sx);

    while (1) {
        RadioApp_Loop();
    }
}

The SX1262_Handle is just a hardware-binding struct. It tells the driver which SPI bus to use and which GPIO pins drive NSS, RESET, BUSY, DIO1, TXEN, and RXEN. The driver does not hard-code any pin assignments — every binding is in the handle, and main() is the only file that knows the actual pin numbers. That is what makes the driver portable to a different board.

The application layer — three functions, full stop

The entire public surface of the application layer is three functions, declared in radio_app.h:

C
void RadioApp_Init(SX1262_Handle *sx);
void RadioApp_Loop(void);
void RadioApp_OnDio1Exti(uint16_t pin);

RadioApp_Init() takes the radio handle and configures the modem for the project’s LoRa settings (915 MHz, SF7, 125 kHz BW, CR 4/5, +14 dBm, CRC on, IQ normal). On the RX node, it also kicks the radio into continuous-receive mode so it is listening as soon as the main loop starts pumping.

RadioApp_Loop() is what the super-loop in main() calls every iteration. Notice how thin the IRQ side of this is:

C
static volatile uint8_t g_irq;

void RadioApp_OnDio1Exti(uint16_t pin) {
    if (pin == g_sx->DIO1_Pin)
        g_irq = 1;
}

void RadioApp_Loop(void) {
    if (sx1262Role == SX_ROLE_RX) {
        if (!g_irq) return;

        SX1262_IrqResult r;
        g_irq = 0;

        if (!SX1262_ProcessIrq(g_sx, &r)) return;

        if (r.rx_done && !r.crc_error) {
            /* hand the bytes up to RadioLink for parsing */
            /* (covered in Part 2) */
        }
    }
    /* TX path elided — covered in Part 2 */
}

Two things to note. First, the EXTI ISR does no SPI traffic. It sets one byte and returns. All radio I/O happens in thread context inside RadioApp_Loop(). Second, g_irq is the only piece of state shared between the ISR and the loop, and it is a single volatile uint8_t, set in exactly one place and cleared in exactly one place. There is no queue, no ring buffer, no critical section. The latch-and-defer pattern is all the synchronization we need.
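Stripped of the radio specifics, the latch-and-defer pattern looks roughly like the sketch below. The names are hypothetical and the "ISR" is an ordinary function here so the snippet can be exercised on a host. On the target it would be called from the EXTI callback.

```c
#include <stdint.h>
#include <stdbool.h>

static volatile uint8_t irq_latch;  /* set by the ISR, cleared by the loop */

/* ISR side: latch the event and return. No SPI, no parsing. */
void on_dio1_edge(void)
{
    irq_latch = 1u;
}

/* Loop side: returns true once per latched event. The latch is cleared
 * before the slow radio service work begins, so an edge that arrives
 * during that work is latched again rather than lost. */
bool take_irq_event(void)
{
    if (!irq_latch) {
        return false;
    }
    irq_latch = 0u;
    return true;
}
```

One consequence of a single-flag latch is that two edges arriving before the loop runs collapse into one event. That is fine as long as the service routine reads the radio's IRQ status register and handles every cause it reports, which is an assumption about how SX1262_ProcessIrq() behaves.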

The bridge from the HAL EXTI callback into the application layer is the standard one-liner in main.c:

C
void HAL_GPIO_EXTI_Callback(uint16_t pin) {
#ifndef RADIOLINK_QA_TEST
    RadioApp_OnDio1Exti(pin);
#else
    QaApp_OnDio1Exti(pin);
#endif
}

The RADIOLINK_QA_TEST compile-time switch is the entry point to the hostile-frame harness. We will turn it on and break things on purpose in Part 3.

The Wire v3 frame format

Every byte that goes on air is a Wire v3 frame. The layout is fixed, small, and explicitly version-tagged in the very first byte so that future revisions can coexist or be rejected cleanly.

Plaintext
+---------+--------+------------------+----------------+------------+---------+--------+
| version | nodeId | sessionSeqId_le  | msgCounter_le  | payloadLen | payload |  CMAC  |
| 1 byte  | 1 byte |     4 bytes      |    4 bytes     |   1 byte   | N bytes |  16 B  |
+---------+--------+------------------+----------------+------------+---------+--------+
<-------------------- 11-byte fixed header -------------------->

Field-by-field:

  • version — 0x03 for Wire v3. The parser rejects mismatched versions before doing anything else.
  • nodeId — 1-byte runtime sender ID. Derived from the MCU UID at boot, never compiled in.
  • sessionSeqId_le — 32-bit little-endian session ID, also called the “boot epoch.” Advances exactly once per TX boot. Persisted to FRAM.
  • msgCounter_le — 32-bit little-endian per-session message counter. Advances after each successful transmission. Lives in RAM only.
  • payloadLen — length of the payload region in bytes. Does not include the CMAC tag.
  • payload — AES-CTR ciphertext of length payloadLen. (We will decrypt one of these in Part 2.)
  • CMAC — 16-byte AES-CMAC-128 tag computed over header || ciphertext.

The Wire v3 protocol invariants — every parser and builder enforces all of these:

  • Fixed header is exactly 11 bytes.
  • CMAC tag is exactly 16 bytes.
  • payloadLen is the payload only, never the tag.
  • frameLen == 11 + payloadLen + 16.
  • Multi-byte integers are little-endian.
  • Any frame shorter than 27 bytes is structurally invalid and is rejected before any further work.

The current SX1262 transport limit is 255 bytes per frame, so the maximum plaintext that fits in a single Wire v3 frame is 255 - 11 - 16 = 228 bytes.
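The invariants translate almost directly into a structural check. The sketch below is a minimal illustration with hypothetical names, not the RadioLink parser. It uses a literal 10 for the payloadLen offset purely to stay short, where the project derives that offset with offsetof().

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define W3_VERSION          (0x03u)
#define W3_HDR_LEN          (11u)
#define W3_TAG_LEN          (16u)
#define W3_MIN_FRAME        (W3_HDR_LEN + W3_TAG_LEN)     /* 27  */
#define W3_MAX_FRAME        (255u)                         /* SX1262 limit */
#define W3_MAX_PAYLOAD      (W3_MAX_FRAME - W3_MIN_FRAME)  /* 228 */
#define W3_OFF_PAYLOAD_LEN  (10u)  /* stand-in for the offsetof() value */

/* frameLen == 11 + payloadLen + 16 */
size_t w3_frame_len(uint8_t payload_len)
{
    return (size_t)W3_HDR_LEN + payload_len + W3_TAG_LEN;
}

/* Structural validation only: no crypto, no replay check. */
bool w3_structure_ok(const uint8_t *frame, size_t len)
{
    /* Length bounds come first so the header bytes are safe to read. */
    if (frame == NULL || len < W3_MIN_FRAME || len > W3_MAX_FRAME) {
        return false;
    }
    if (frame[0] != W3_VERSION) {
        return false;
    }
    /* The declared payload length must account for every byte received. */
    return len == w3_frame_len(frame[W3_OFF_PAYLOAD_LEN]);
}
```

Because frameLen is fully determined by payloadLen, a frame that is one byte short or one byte long fails the last comparison even when the version and the minimum length are fine.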

Deriving every offset from one packed struct

Hardcoded byte offsets are a classic source of protocol bugs. If the parser thinks the counter starts at byte 5 and the builder writes it at byte 6, the protocol is broken and the failure mode is silent garbage. The radio_wire.h header avoids this by declaring a single packed struct that is the Wire v3 layout, and then deriving every offset from it with offsetof():

C
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint8_t, uint32_t */

#define RADIOLINK_WIRE_V3_VERSION           (0x03u)
#define RADIOLINK_WIRE_V3_TAG_LEN           (16u)

typedef struct __attribute__((packed)) radioWireV3_t {
    uint8_t  version;           /* 0x03 */
    uint8_t  nodeId;
    uint32_t sessionSeqId_le;  /* LE32 on wire */
    uint32_t msgCounter_le;   /* LE32 on wire */
    uint8_t  payloadLen;
    uint8_t  payload[1];       /* payload starts here */
} radioWireV3_t;

#define RL_W3_OFF_VERSION      (offsetof(radioWireV3_t, version))
#define RL_W3_OFF_NODE_ID      (offsetof(radioWireV3_t, nodeId))
#define RL_W3_OFF_SESSION_SEQ_ID (offsetof(radioWireV3_t, sessionSeqId_le))
#define RL_W3_OFF_MSG_COUNTER  (offsetof(radioWireV3_t, msgCounter_le))
#define RL_W3_OFF_PAYLOAD_LEN  (offsetof(radioWireV3_t, payloadLen))
#define RL_W3_OFF_PAYLOAD      (offsetof(radioWireV3_t, payload))

#define RADIOLINK_WIRE_V3_HDR_LEN_DERIVED \
    (offsetof(radioWireV3_t, payload))

_Static_assert(RADIOLINK_WIRE_V3_HDR_LEN_DERIVED == 11U,
    "Wire v3 header length must be 11 bytes");

Two things make this defensive:

  • The struct is for offsetof() only. The header explicitly warns: do not cast a raw RX buffer to radioWireV3_t *. The struct exists to make the offsets self-documenting and self-checking, not to do unaligned pointer reads off a network buffer.
  • The _Static_assert at the bottom fires at compile time if anybody changes the layout in a way that breaks the 11-byte invariant. You cannot accidentally ship a 12-byte header.
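To show how offsets derived this way are meant to be used, here is a hedged sketch of a header builder that writes into a plain byte buffer and never casts the buffer to the struct. The type and function names are hypothetical. The struct simply repeats the layout above so the snippet stands alone, and the packed attribute assumes GCC or Clang.

```c
#include <stdint.h>
#include <stddef.h>

/* Layout description only: never instantiated, never cast to. */
typedef struct __attribute__((packed)) {
    uint8_t  version;
    uint8_t  nodeId;
    uint32_t sessionSeqId_le;
    uint32_t msgCounter_le;
    uint8_t  payloadLen;
    uint8_t  payload[1];
} w3_layout_t;

/* Store a 32-bit value little-endian, one byte at a time, so the result
 * does not depend on host endianness or on buffer alignment. */
void wr_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFFu);
    dst[1] = (uint8_t)((v >> 8) & 0xFFu);
    dst[2] = (uint8_t)((v >> 16) & 0xFFu);
    dst[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Write the fixed header into buf and return its length in bytes. */
size_t w3_write_header(uint8_t *buf, uint8_t node_id, uint32_t session,
                       uint32_t counter, uint8_t payload_len)
{
    buf[offsetof(w3_layout_t, version)]    = 0x03u;
    buf[offsetof(w3_layout_t, nodeId)]     = node_id;
    wr_le32(buf + offsetof(w3_layout_t, sessionSeqId_le), session);
    wr_le32(buf + offsetof(w3_layout_t, msgCounter_le), counter);
    buf[offsetof(w3_layout_t, payloadLen)] = payload_len;
    return offsetof(w3_layout_t, payload);
}
```

If a field were inserted or resized, every write would move with the struct automatically, and a static assertion like the one above would catch the change in header length at compile time.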

Decoding a real Wire v3 header from the RX log

This is where it gets satisfying. Take the second RXHDR line from the capture earlier in the post:

Plaintext
RXHDR: 03 56 0D 00 00 00 F4 BA 5F 00 31 len=76

Walk it byte by byte against the layout we just defined:

  • Byte 0 — 03 — version. Wire v3. ✓
  • Byte 1 — 56 — nodeId. The TX board is identifying itself as node 0x56 (decimal 86).
  • Bytes 2–5 — 0D 00 00 00 — sessionSeqId_le. Little-endian, so the value is 0x0000000D = 13. This is the 13th boot-epoch since the TX board’s FRAM was last initialized.
  • Bytes 6–9 — F4 BA 5F 00 — msgCounter_le. Little-endian, so the value is 0x005FBAF4 = 6273780. That matches the TX-side log line RL: TX ctr=6273780 exactly, which is how we know the receiver is parsing the field correctly.
  • Byte 10 — 31 — payloadLen. 0x31 = 49 bytes of payload.
  • len=76 — total frame length on air. Plug in the protocol invariant: 11 (header) + 49 (payload) + 16 (CMAC) = 76. ✓

Every field decodes, the math closes, and the parser, the builder, and the receiver all agree on the layout. That is the guarantee the offsetof() machinery is buying us.
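The same hand decode can be reproduced in a few lines of C. This is a small host-side sketch with hypothetical names. The offsets are written out as an enum holding the values the RL_W3_OFF_* macros resolve to for this layout (0, 1, 2, 6, 10).

```c
#include <stdint.h>

/* Offsets for the Wire v3 header fields, as derived from the layout. */
enum {
    OFF_VERSION     = 0,
    OFF_NODE_ID     = 1,
    OFF_SESSION     = 2,
    OFF_MSG_COUNTER = 6,
    OFF_PAYLOAD_LEN = 10
};

/* Decoded header in host byte order. */
typedef struct {
    uint8_t  version;
    uint8_t  nodeId;
    uint32_t sessionSeqId;
    uint32_t msgCounter;
    uint8_t  payloadLen;
} w3_header_t;

/* Assemble a little-endian 32-bit value byte by byte. */
uint32_t rd_le32(const uint8_t *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode the 11-byte fixed header; hdr must hold at least 11 bytes. */
void w3_decode_header(const uint8_t *hdr, w3_header_t *out)
{
    out->version      = hdr[OFF_VERSION];
    out->nodeId       = hdr[OFF_NODE_ID];
    out->sessionSeqId = rd_le32(hdr + OFF_SESSION);
    out->msgCounter   = rd_le32(hdr + OFF_MSG_COUNTER);
    out->payloadLen   = hdr[OFF_PAYLOAD_LEN];
}
```

Feeding it the eleven bytes from the RXHDR line gives version 3, nodeId 86, session 13, counter 6273780, and payloadLen 49, the same values as the hand decode.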

What Part 2 will cover

With the architecture, the hardware, and the wire format established, Part 2 will go inside the radioLink protocol layer:

  • The TX pipeline, end to end — from RadioLink_Send() through the AES-CTR encryption, the CMAC tag generation, and the SX1262 hand-off.
  • The RX pipeline — structural validation, header decode, payload-length validation, CMAC verification, replay check, AES-CTR decryption, and delivery to the application.
  • How the STM32F4’s hardware CRYP block is used to do AES-128 in fewer cycles than a software implementation.
  • The replay-protection model — why we use an epoch-per-boot scheme, and why it lets us write to FRAM exactly once per power cycle no matter how many packets we send.

Part 3 will cover the QA harness — turning on RADIOLINK_QA_TEST and deliberately firing malformed, replayed, and oversized frames at the receiver to verify the rejection path.

Project Downloads

The source files used in this tutorial are available for download. This package includes all custom code and documentation required to follow along with the project.

  • Tutorial Source Files (Core and custom modules)
  • STM32CubeIDE Configuration File (.ioc)
  • Linker Scripts and Supporting Files
  • README with setup instructions
  • Separate Doxygen Documentation Download

Note: The source package does not include STM32Cube HAL drivers, middleware, or auto-generated system files. These are provided by STM32CubeIDE when creating a new project.

Documentation

This project includes full Doxygen-generated documentation for all custom source files.

The documentation is provided as a separate download and can be viewed locally by opening the following file in a web browser:

F439_CPP_TX-RX_LoRa_Project_01/docs/html/index.html

The documentation provides detailed descriptions of functions, data structures, and module interactions to assist with understanding and extending the project.

If you have questions or run into trouble getting the boards programmed and talking to each other, post in the Tutorial Support forum and I will work through it with you. If project source is not linked in the tutorial, it may be available on request — use the email contact option in the site footer.