✨ TL;DR
This paper introduces Neural Indicator Sampling (NI Sampling), a method that optimizes the token sampling order in discrete diffusion language models, achieving up to a 14.3× speedup over standard sampling while maintaining accuracy. The approach uses a trained neural indicator to select which tokens to sample at each step, dramatically reducing the number of sampling iterations required.
Discrete diffusion language models (dLLMs) offer advantages over autoregressive models: they can generate tokens in arbitrary orders and decode multiple positions in parallel. However, existing sampling strategies are inefficient: at each step they commit only a small, heuristically chosen subset of tokens, leaving substantial room for faster sampling. The core challenge is determining the order in which to sample tokens so as to minimize the total number of sampling iterations while maintaining generation quality.
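To make the baseline concrete, here is a minimal sketch of the kind of heuristic step the paper improves upon: unmasking only the k most confident masked positions per iteration (a MaskGIT-style rule). This is illustrative only; the function name, the confidence measure, and the toy shapes are assumptions, not the paper's exact procedure.

```python
import numpy as np

def confidence_step(logits, is_masked, k):
    """One heuristic decoding step: commit the k masked positions
    whose argmax probability is highest. Illustrative sketch only;
    names and the confidence rule are assumptions."""
    # softmax over the vocabulary at each position
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    conf = probs.max(axis=-1)           # per-position confidence
    conf[~is_masked] = -np.inf          # only consider masked slots
    chosen = np.argsort(conf)[-k:]      # k most confident positions
    tokens = probs.argmax(axis=-1)[chosen]
    return chosen, tokens

# toy example: 5 positions, vocab of 4, all masked, commit 2 per step
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 4))
masked = np.ones(5, dtype=bool)
pos, tok = confidence_step(logits, masked, k=2)
print(pos, tok)
```

Because only k positions are committed per step, a length-L sequence needs roughly L/k iterations; the paper's point is that a fixed heuristic k wastes the many predictions that were already correct.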
The paper proposes Neural Indicator Sampling (NI Sampling), a framework that uses a learned neural indicator to optimize token sampling order. The key insight is that fully committing the correct predictions at each step can dramatically reduce the number of sampling iterations. Rather than relying on heuristics, the neural indicator is trained to decide which tokens should be sampled at each step. The authors also introduce a novel trajectory-preserving objective for training the indicator, which keeps the learned sampling strategy consistent with the generation trajectory and thus maintains generation quality. The approach is general and applicable across different discrete diffusion models.
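The indicator described above can be pictured as a small per-position head that replaces the fixed top-k rule with a learned keep/sample decision. The sketch below is a hypothetical stand-in, assuming a tiny MLP over per-position features (e.g. model confidence); the architecture, feature choice, and threshold are all assumptions, not the paper's specification.

```python
import numpy as np

def neural_indicator(features, W1, b1, w2, b2, threshold=0.5):
    """Hypothetical indicator head: maps per-position features to a
    probability that the position should be committed this step.
    Positions scoring above `threshold` are sampled; the rest stay
    masked for later iterations."""
    h = np.maximum(features @ W1 + b1, 0.0)        # ReLU hidden layer
    score = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))   # sigmoid per position
    return score > threshold, score

# toy example: 6 positions, 3 features each, random (untrained) weights
rng = np.random.default_rng(1)
feats = rng.normal(size=(6, 3))
W1 = rng.normal(size=(3, 8)); b1 = np.zeros(8)
w2 = rng.normal(size=8); b2 = 0.0
sample_mask, score = neural_indicator(feats, W1, b1, w2, b2)
print(sample_mask)
```

Unlike a fixed top-k heuristic, such a head can commit a variable number of tokens per step, which is what allows the total iteration count to shrink when many predictions are already correct.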