Async Rust — Futures, Executors, and the Poll Model
Most programmers first encounter async through a framework that hides the mechanism — Python asyncio, Node.js, Django Channels, Go goroutines. In Rust, you must understand the mechanism because Rust exposes it explicitly, and code that ignores it breaks in subtle ways.
The core insight: async/await in Rust is a compile-time transformation, not a runtime feature. When you write async fn, the compiler does not create a thread or anything involving the OS. It identifies every .await point, identifies which local variables must survive across each await, and generates a struct — a state machine — that stores exactly those variables and implements the Future trait. That Future is then handed to an executor, which drives it to completion.
An async function's state is a struct — no heap required.
In Node.js every async function creates a Promise on the heap. In Python every coroutine is a heap-allocated object with its own frame. In Go every goroutine has a 2KB heap stack. These accumulate and require GC.
In Rust, an async function's state machine is sized at compile time. It can live on the stack, in a static array, or on the heap — wherever the caller places it. Embassy uses a static arena. Tokio uses the heap. The Future has no opinion about where it lives. This is why Embassy runs concurrent tasks on a Pico 2 with no heap allocator.
A Future is any type implementing:
pub trait Future {
    type Output; // type produced when complete
    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Self::Output>;
}

enum Poll<T> {
    Ready(T), // done — here is the result
    Pending,  // not done — wake me when conditions change
}
The Context<'_> carries a Waker. When the thing the Future is waiting for happens — a timer fires, a GPIO edge arrives — it calls waker.wake(), which tells the executor to poll this task again. This is the complete mechanism: poll → Pending + register waker → event fires → waker.wake() → executor re-polls → Ready.
THE POLL-WAKEUP CYCLE
───────────────────────────────────────────────────
Executor               Task                   Hardware

poll(cx) ──────────▶   Not ready yet.
                       Registers cx.waker()
                       with timer hardware.
Pending  ◀──────────
(runs other tasks)                            Timer fires
                                              waker.wake()
poll(cx) ──────────▶   Timer elapsed.
                       Continuing work.
Ready(v) ◀──────────

On the Pico: WFI instruction between wakeups.
CPU clock gates entirely. Near-zero idle power.
// What you write:
async fn blink(mut led: Output<'static>) {
    loop {
        led.set_high();
        Timer::after_millis(500).await; // await A
        led.set_low();
        Timer::after_millis(500).await; // await B
    }
}

// What the compiler conceptually generates:
// (actual output is more complex but this is the mental model)
enum BlinkSM {
    Start  { led: Output<'static> },
    AwaitA { led: Output<'static>, timer: TimerFuture },
    AwaitB { led: Output<'static>, timer: TimerFuture },
}
// Size = max(variant sizes) = known at compile time.
// Embassy's task arena allocates exactly sizeof(BlinkSM) bytes.
// No heap. No GC. No surprise allocation at runtime.
// ── EMBASSY: no_std, no heap, static tasks ──
#![no_std]
#![no_main]

#[embassy_executor::task]
async fn blink_task(led: PIN_25) { /* ... */ }

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    spawner.spawn(blink_task(p.PIN_25)).unwrap();
    // Tasks:   statically allocated in fixed-size arena
    // Wakeups: hardware ISRs via embassy_rp interrupt handlers
    // Sleep:   WFI — CPU gates clock until next interrupt
}

// ── TOKIO: std, heap, thread pool ──
#[tokio::main]
async fn main() {
    tokio::spawn(handle_request()); // boxed on heap
    tokio::spawn(stream_alerts());  // boxed on heap
    // Tasks:   Box<dyn Future> on the heap, work-stealing thread pool
    // Wakeups: OS epoll (Linux) / kqueue (macOS) / IOCP (Windows)
    // Sleep:   OS thread yield — scheduler decides next task
}

// KEY RULES FOR BOTH:
// 1. Never block inside async — use .await or spawn_blocking
// 2. Never hold a sync Mutex across an .await
// 3. Drop Futures correctly — cancellation may leave state incomplete
You noticed Pin<&mut Self> in the Future trait. A Pin guarantees the value will not move in memory after it is first polled. This exists because async state machines can contain self-referential data — a local variable and a reference pointing into it. If the struct moved, the reference would become a dangling pointer. Pin prevents movement. The executor promises to give the Future the same memory address on every subsequent poll call.
In practice you almost never write Pin directly. The .await syntax handles it. tokio::pin!() and the pin_mut!() macro from the futures crate (the one commonly used in Embassy projects) handle it when you need to poll the same Future across multiple select! calls. The rule to remember: once a Future is polled, never move it.
§ 4.6 Embassy is single-threaded per core. One blocking call — a spin-wait, a blocking delay, a blocking I2C write — freezes the entire executor for its duration. No other tasks run. Use Timer::after_millis(n).await, not cortex_m::asm::delay(). Use embassy_rp's async I2C, not the blocking variant. In Tokio, wrap any CPU-bound or blocking-I/O work in tokio::task::spawn_blocking(|| expensive_work()).
// WRONG — std Mutex held across await = potential deadlock
async fn bad(m: Arc<std::sync::Mutex<State>>) {
    let mut g = m.lock().unwrap();
    do_async_work().await; // lock held while yielded — deadlock if another task needs it
    g.x += 1;
}

// CORRECT option A — async work first, then lock briefly
async fn good_a(m: Arc<std::sync::Mutex<State>>) {
    let result = do_async_work().await; // async first
    m.lock().unwrap().x += result;      // brief lock, no await inside
}

// CORRECT option B — use tokio::sync::Mutex (yields instead of blocking)
async fn good_b(m: Arc<tokio::sync::Mutex<State>>) {
    let mut g = m.lock().await; // yields while waiting — no deadlock
    g.x += 1;
}

// In Embassy: embassy_sync::mutex::Mutex — same pattern
When you select() two Futures and one resolves first, the other is dropped. If that dropped Future had partially completed some operation — sent half a command, incremented a counter — the partial state remains. Always structure operations so that dropping them mid-execution leaves the system consistent. Embassy I/O futures are generally cancellation-safe. Complex application futures require careful design.
§ 4.7 Implement Future manually for a countdown
Without using async/await, implement Future for a CountdownFuture struct that resolves after N polls. In poll(): if count is zero, return Ready(()). Otherwise decrement, call cx.waker().wake_by_ref(), and return Pending. Build a simple blocking spin executor to drive it. After this exercise, you will understand exactly what .await desugars to — it is not magic, it is this loop.
Producer/consumer with a shared counter
Write a producer task that increments a shared counter every 100ms, and a consumer task that reads it every 500ms and displays it on your TM1637. Use embassy_sync::mutex::Mutex. Verify that neither task blocks the other and that you never hold the mutex across an await point. Between successive display updates the counter should advance by roughly 5, since the producer ticks five times per consumer read (500ms / 100ms).
Non-blocking event multiplexing
Using Embassy's select(), write a task that waits for either a GPIO button press on PIN_15 or a 5-second timeout. Display "btn" or "time" on the TM1637 to indicate which arrived first. This is the fundamental Embassy pattern for non-blocking event handling — it appears in virtually every real-world firmware that reacts to multiple independent events.