BlockBatch: Multi-Scale Consensus Decoding for Efficient dLLM Inference
A training-free inference framework for diffusion language models (dLLMs) that runs multiple block-size branches in a single batched forward pass, reducing the number of denoising function evaluations (NFEs) by 26.6% and achieving a 1.33× speedup over Fast-dLLM.
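
As a rough sanity check on how the two headline numbers relate, the sketch below converts an NFE reduction into the speedup it would imply if every forward pass cost the same. This is an illustrative calculation, not part of the framework: the helper name `ideal_speedup` is hypothetical, and it assumes that both figures are measured against the same baseline and that per-NFE cost is unchanged by batching.

```python
def ideal_speedup(nfe_reduction: float) -> float:
    """Speedup implied by cutting NFEs by `nfe_reduction` (a fraction in [0, 1)).

    Assumption: each function evaluation costs the same before and after,
    so wall-clock time scales linearly with the number of evaluations.
    """
    if not 0.0 <= nfe_reduction < 1.0:
        raise ValueError("nfe_reduction must be in [0, 1)")
    return 1.0 / (1.0 - nfe_reduction)


reported_nfe_reduction = 0.266  # 26.6% fewer denoising NFEs (from the summary)
reported_speedup = 1.33         # measured speedup over Fast-dLLM (from the summary)

upper_bound = ideal_speedup(reported_nfe_reduction)
print(f"ideal speedup from NFE reduction: {upper_bound:.2f}x")
print(f"reported speedup:                 {reported_speedup:.2f}x")
```

Under these assumptions the ideal figure comes out slightly above the reported 1.33×. A gap in that direction would be expected if a batched multi-branch forward pass costs somewhat more than a single-branch pass.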