The Embassy Runtime
Understanding what happens before your main() runs demystifies crashes during initialisation and explains why Embassy's setup code looks the way it does.
RP2350 BOOT SEQUENCE
─────────────────────────────────────────────────────

1. ROM Boot
   Built-in mask ROM executes immediately on power-up. It locates and
   validates the firmware image in flash.

2. Flash (XIP) setup
   The RP2350 runs XIP (eXecute In Place) from flash. Unlike the RP2040,
   which needs a 256-byte boot2 assembler stub to configure its flash
   chip, the RP2350 boot ROM sets up the flash interface itself, so a
   separate boot2 stage is not required.

3. cortex-m-rt
   Runtime library: provides the vector table and initial stack pointer,
   zeroes .bss, copies .data to RAM. Then calls your Rust main().

4. embassy_rp::init()
   Configures clocks (150 MHz system clock by default on the RP2350),
   DMA, and the interrupt controller, and returns the Peripherals
   singleton. This is the first line of every Embassy main().

5. Your async fn main()
   Receives the Spawner, the handle used to spawn tasks. The Embassy
   executor runs from here on and never returns.
Embassy's executor is a cooperative scheduler: it never preempts a task. Each poll runs until the task yields (via .await). When it yields, the executor checks the run queue for other ready tasks and runs them. If the queue is empty, it executes a wait instruction to sleep the CPU (WFE, Wait For Event, on Cortex-M; it behaves much like WFI, Wait For Interrupt). When a hardware interrupt fires, its handler wakes one or more tasks via their registered wakers, and the executor loop continues.
EMBASSY EXECUTOR LOOP
───────────────────────────────────────────────────────────

    static RUN_QUEUE: [Task; N]     ← statically allocated in task arena

    loop:
        while run_queue.not_empty():
            task = run_queue.dequeue()
            match task.poll():
                Ready   → mark task done, free arena slot
                Pending → task registered its waker with the hardware;
                          it waits until the waker fires

        // Queue empty: sleep until a hardware event
        // (WFE on Cortex-M; WFI behaves similarly)
        cortex_m::asm::wfe()        ← core clock gates, idle current drops

        // Timer ISR fires → waker.wake() → task moves to run_queue
        // GPIO ISR fires  → waker.wake() → task moves to run_queue
        // I2C ISR fires   → waker.wake() → task moves to run_queue
        // ... repeat ...
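The loop can be modelled on a desktop machine with nothing but the Rust standard library. The sketch below is not Embassy's implementation: the names TaskWaker, YieldNow, worker, and run_demo are invented for illustration, tasks are heap-allocated instead of living in a static arena, and the loop simply exits where a real executor would sleep. It only shows the poll / wake / re-queue cycle and the interleaving it produces.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Task = Pin<Box<dyn Future<Output = ()>>>;
type RunQueue = Arc<Mutex<VecDeque<usize>>>;

/// Waker that puts one task id back on the run queue.
/// In firmware, an interrupt handler would trigger this.
struct TaskWaker {
    id: usize,
    queue: RunQueue,
}

impl Wake for TaskWaker {
    fn wake(self: Arc<Self>) {
        self.queue.lock().unwrap().push_back(self.id);
    }
}

/// Future that yields exactly once: wakes itself, returns Pending,
/// and completes on the next poll.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Toy task: records two steps, yielding after each one.
async fn worker(name: &'static str, log: Arc<Mutex<Vec<String>>>) {
    for step in 0..2 {
        log.lock().unwrap().push(format!("{name}{step}"));
        YieldNow(false).await;
    }
}

fn boxed(fut: impl Future<Output = ()> + 'static) -> Task {
    Box::pin(fut)
}

/// Runs two workers to completion and returns the order of their steps.
fn run_demo() -> Vec<String> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let queue: RunQueue = Arc::new(Mutex::new(VecDeque::new()));
    let mut tasks: Vec<Option<Task>> = vec![
        Some(boxed(worker("a", log.clone()))),
        Some(boxed(worker("b", log.clone()))),
    ];
    queue.lock().unwrap().extend(0..tasks.len());

    loop {
        // Take the lock only for the pop, so wakers can lock the queue during poll.
        let next = queue.lock().unwrap().pop_front();
        let Some(id) = next else { break }; // a real executor would sleep here
        let waker = Waker::from(Arc::new(TaskWaker { id, queue: queue.clone() }));
        let mut cx = Context::from_waker(&waker);
        if let Some(fut) = tasks[id].as_mut() {
            if fut.as_mut().poll(&mut cx).is_ready() {
                tasks[id] = None; // task finished, free its slot
            }
        }
    }
    let result = log.lock().unwrap().clone();
    result
}

fn main() {
    for entry in run_demo() {
        println!("{entry}");
    }
}
```

The two workers alternate because each one hands control back at its .await; neither is ever interrupted mid-step.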
Embassy is cooperative — tasks must yield voluntarily.
In a preemptive RTOS (FreeRTOS, Zephyr), the scheduler can interrupt a task at any point and switch to another. This requires saving and restoring the CPU register state at every switch (the context switch), and each task needs its own stack. In Embassy's cooperative model, tasks switch only at .await points. The "context switch" is just the task's state machine returning Poll::Pending: the variables that live across the .await are already stored in the future's enum, so there is no register save and no per-task stack to swap. The "switch" costs about as much as an ordinary function return.
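To make the state-machine idea concrete, here is a hand-written future that behaves roughly like an async fn counting up to a limit and yielding after every increment. It is an illustrative sketch, not compiler output: CountTo, NoopWaker, and drive are invented names, and the real generated type is anonymous. The point is that everything the task needs to resume lives in the enum, and "yielding" is a plain return of Poll::Pending.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// All state that must survive a yield is stored in the enum itself.
enum CountTo {
    Running { n: u32, limit: u32 },
    Done,
}

impl Future for CountTo {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // fine: CountTo holds only plain integers (Unpin)
        match *this {
            CountTo::Running { n, limit } if n < limit => {
                // Save progress in the enum, ask to be polled again, and return.
                *this = CountTo::Running { n: n + 1, limit };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            CountTo::Running { n, .. } => {
                *this = CountTo::Done;
                Poll::Ready(n)
            }
            CountTo::Done => panic!("polled after completion"),
        }
    }
}

/// Waker that does nothing; enough for a loop that polls unconditionally.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future until it completes; returns (result, number of polls).
fn drive(limit: u32) -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = CountTo::Running { n: 0, limit };
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(n) = Pin::new(&mut fut).poll(&mut cx) {
            return (n, polls);
        }
    }
}

fn main() {
    let (result, polls) = drive(3);
    println!("result = {result}, polls = {polls}");
    println!("state machine size = {} bytes", std::mem::size_of::<CountTo>());
}
```

Each Pending return is one "context switch": the only work done is writing the updated counter into the enum and returning to the caller.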
The trade-off: a task that never awaits starves all other tasks. This is why blocking in async is a critical error in Embassy. Every significant operation must be .awaited. Embassy's API is designed to make this natural — all I/O is async, all delays are async, all inter-task communication is async.
    #![no_std]
    #![no_main]

    use embassy_executor::Spawner;
    use embassy_rp::gpio::{Level, Output};
    use embassy_time::Timer;
    use {defmt_rtt as _, panic_probe as _};
    // Tm1637 is assumed to come from the TM1637 driver you use; import it here.

    #[embassy_executor::task]
    async fn display_task(
        clk: Output<'static>,
        dio: Output<'static>,
        // arguments must be 'static: they live for the program's lifetime
    ) {
        let mut display = Tm1637::new(clk, dio);
        loop {
            display.show_number(1234);
            Timer::after_millis(100).await;
        }
    }

    #[embassy_executor::task]
    async fn sensor_task() {
        loop {
            // read DHT11, send to shared state
            Timer::after_secs(2).await;
        }
    }

    #[embassy_executor::main]
    async fn main(spawner: Spawner) {
        let p = embassy_rp::init(Default::default());

        // Spawn concurrent tasks; they run interleaved at await points
        spawner.spawn(display_task(
            Output::new(p.PIN_2, Level::Low), // GPIO2 = CLK
            Output::new(p.PIN_3, Level::Low), // GPIO3 = DIO
        )).unwrap();
        spawner.spawn(sensor_task()).unwrap();

        // main() itself is a task and can continue doing work
        loop {
            Timer::after_secs(60).await;
            defmt::info!("heartbeat: system running");
        }
    }

    // Why 'static? Because Embassy tasks live in statically allocated storage.
    // That storage outlives every function call; it lives for the program.
    // A task that holds a reference to a local variable would dangle.
    // 'static guarantees the data lives as long as the task does.

    // Why unwrap()? Task storage is fixed-size. Each task function has a
    // fixed pool of instances (pool_size, default 1); spawning more
    // instances than the pool holds makes spawn() return Err.
    // unwrap() panics, which is a good thing to see early in development.
    // In production, size the pools and the arena to fit all your tasks.
Embassy wraps every RP2350 peripheral in a singleton type. embassy_rp::init() returns a Peripherals struct with one field per peripheral. Each field can only be moved out once — into the task or driver that owns it. If you try to use PIN_2 in two places, the compiler refuses: PIN_2 was moved in the first use, it cannot be used again.
This is hardware ownership at compile time. No runtime peripheral manager. No mutex protecting a shared peripheral handle. The compiler guarantees that exactly one piece of code drives each GPIO pin, each I2C bus, each PWM slice. The firmware cannot be written in a way that two drivers simultaneously drive the same pin — the type system prevents it.
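The mechanism can be imitated on a desktop machine with ordinary Rust moves. The following is a toy model under stated assumptions, not embassy_rp's code: Pin2, Pin3, Peripherals, take, and Output are simplified stand-ins written for this example, and take returns an Option where a real HAL may panic on a second call.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Zero-sized marker types, one per pin (stand-ins for PIN_2 and PIN_3).
struct Pin2;
struct Pin3;

struct Peripherals {
    pin_2: Pin2,
    pin_3: Pin3,
}

static TAKEN: AtomicBool = AtomicBool::new(false);

/// Hands out the peripherals at most once per program run.
fn take() -> Option<Peripherals> {
    if TAKEN.swap(true, Ordering::AcqRel) {
        None
    } else {
        Some(Peripherals { pin_2: Pin2, pin_3: Pin3 })
    }
}

/// A driver that owns its pin by value, so nobody else can hold it.
struct Output<P> {
    _pin: P,
    high: bool,
}

impl<P> Output<P> {
    fn new(pin: P, high: bool) -> Self {
        Output { _pin: pin, high }
    }
    fn toggle(&mut self) {
        self.high = !self.high;
    }
    fn is_high(&self) -> bool {
        self.high
    }
}

fn main() {
    let p = take().expect("peripherals already taken");
    let mut clk = Output::new(p.pin_2, false); // pin_2 is moved into clk
    let dio = Output::new(p.pin_3, true); // pin_3 is moved into dio

    // Uncommenting the next line fails to compile, because pin_2 was
    // already moved into `clk`:
    // let second = Output::new(p.pin_2, false);

    clk.toggle();
    println!("clk high: {}, dio high: {}", clk.is_high(), dio.is_high());
    assert!(take().is_none()); // a second take yields nothing
}
```

The runtime check guards only the one-time hand-out. After that, exclusivity of each pin is enforced entirely by move semantics at compile time.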
§ 6.5

RP2350 MEMORY MAP (520KB SRAM)
─────────────────────────────────────────────────────────

    0x20000000 ┌─────────────────────────────┐ ← RAM start
               │ .data section               │ initialized statics
               │ (copied from flash by rt)   │ ~few KB
               ├─────────────────────────────┤
               │ .bss section                │ zero-initialized statics
               │ (zeroed by runtime)         │ ~few KB
               ├─────────────────────────────┤
               │ Embassy task arena          │ task state machines
               │ (EMBASSY_EXECUTOR_...)      │ size depends on tasks
               │                             │ default ~4KB
               ├─────────────────────────────┤
               │ Your heap (optional)        │ if using alloc crate
               │ (off by default)            │
               ├─────────────────────────────┤
               │ ↑ Stack grows downward      │
               │ ...                         │
               │ Stack                       │ Cortex-M33 main stack
    0x20081FFF └─────────────────────────────┘ ← RAM end (520KB = 0x82000 bytes)

flip-link moves the stack to the BOTTOM of RAM (below .data/.bss). If the stack overflows, it runs past the start of RAM instead of into .data, and the invalid access triggers a hard fault immediately instead of silently corrupting data. This turns mysterious data corruption bugs into immediate crashes. Always use flip-link in development.
Run concurrent tasks, verify interleaving
Write three tasks: a heartbeat task (logs "tick" every second), a counter task (increments a shared counter every 100ms), and a display task (reads the counter every 250ms and shows it on the TM1637). Use an embassy_sync::mutex::Mutex to protect the counter. Observe that the TM1637 display shows approximately 2-3 increments between updates. Verify you can see all three tasks' log output interleaved in the terminal.
Find the size of your task state machines
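Add this to your main: let fut = your_task_body(); defmt::info!("future: {} bytes", core::mem::size_of_val(&fut)); where your_task_body is a plain async fn containing the task's logic, without the #[embassy_executor::task] attribute. Calling it without .await builds the future but does not run it. The return type of an async fn cannot be named, which is why size_of_val on a value is used instead of size_of on a type. You can also inspect the built .elf after cargo build --release, for example with cargo size or cargo nm from cargo-binutils. How much arena space does each task take? Does a task with a large local buffer take more arena space than a simple task?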
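The measurement technique can be tried on a desktop machine first. This is a small sketch using only the standard library; small_task, buffered_task, and future_sizes are made-up names for the example, and the exact byte counts depend on the compiler version, so none are quoted here.

```rust
use std::future::ready;
use std::hint::black_box;
use std::mem::size_of_val;

/// Keeps a single integer alive across the await point.
async fn small_task() {
    let counter: u32 = 1;
    ready(()).await;
    black_box(counter);
}

/// Keeps a 1 KiB buffer alive across the await point, so the buffer
/// has to be stored inside the future itself.
async fn buffered_task() {
    let buf = [0u8; 1024];
    ready(()).await;
    black_box(&buf);
}

/// Builds both futures without running them and returns their sizes in bytes.
fn future_sizes() -> (usize, usize) {
    (size_of_val(&small_task()), size_of_val(&buffered_task()))
}

fn main() {
    let (small, big) = future_sizes();
    println!("small_task future:    {small} bytes");
    println!("buffered_task future: {big} bytes");
}
```

Only values that are still in use after an .await end up in the future. A buffer that is created and dropped between two await points does not need to be stored in the state machine.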