Measure how self-attention latency grows as you lengthen the context window. Run the vanilla O(N²) kernel, whose cost quadruples each time the sequence length N doubles, then switch on optimizations such as FlashAttention-inspired tiling or sparse attention patterns to see how much of that cost they remove.
Benchmark log ready. Press “Run Benchmarks” to begin.
Note: the timings come from simplified models meant for illustration; they are not hardware-accurate.
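
To make the scaling concrete, here is a minimal sketch in Python that counts work rather than measuring time. It is not the benchmark's own code: the function names, the sliding-window form of sparsity, and the window and tile sizes of 128 are all assumptions chosen for illustration. It counts the query-key pairs each variant scores and the number of score-matrix elements held at once.

```python
from typing import Optional


def dense_pairs(n: int) -> int:
    """Query-key pairs scored by vanilla self-attention: every token attends to every token."""
    return n * n


def sliding_window_pairs(n: int, window: int) -> int:
    """Pairs scored when each query attends only to keys within `window` positions on either side.

    Sliding-window attention is one possible sparse pattern, assumed here for illustration.
    """
    total = 0
    for i in range(n):
        lo = max(0, i - window)       # first key this query can see
        hi = min(n - 1, i + window)   # last key this query can see
        total += hi - lo + 1
    return total


def peak_score_elements(n: int, tile: Optional[int] = None) -> int:
    """Score elements held at once: the full N x N matrix, or one tile x tile block when tiling.

    Tiling leaves the number of scored pairs unchanged; it only avoids
    materializing the whole score matrix.
    """
    if tile is None:
        return n * n
    t = min(tile, n)
    return t * t


if __name__ == "__main__":
    # Hypothetical sweep; window=128 and tile=128 are illustrative choices.
    print("N, dense pairs, windowed pairs, tiled peak elements")
    for n in (512, 1024, 2048, 4096):
        print(n, dense_pairs(n), sliding_window_pairs(n, 128), peak_score_elements(n, 128))
```

In this sketch the dense count grows quadratically with N, the windowed count grows roughly linearly once N exceeds the window, and the tiled variant keeps the same pair count as the dense kernel while holding only one block of scores at a time.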