AutoXiv

Inference-Time Model Improvement.

Methods that enhance machine learning model performance through strategic inference-time interventions, including selective querying, error correction, latent prediction, bias correction, and interpretable reasoning.

13 papers

Papers.

260421.0039
Sessa: Selective State Space Attention
Horbatko
Sessa is a new sequence model that places attention inside a recurrent feedback path, enabling power-law memory decay rather than the exponential forgetting of recurrences or the 1/length dilution of attention. This architecture achieves superior long-context performance while remaining competitive on short sequences.
Formal Sciences
260421.0043
Revisiting Active Sequential Prediction-Powered Mean Estimation
Sfyraki · Wang
This paper analyzes active sequential prediction-powered mean estimation, where labels are selectively queried and ML predictions fill in the gaps. The authors find that contrary to intuition, using a nearly constant query probability (ignoring uncertainty) often produces tighter confidence intervals than adaptive uncertainty-based querying.
Formal Sciences
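The constant-probability scheme the authors study can be illustrated with a minimal sketch of the classical prediction-powered estimator: every label is queried with the same fixed probability, and an inverse-probability-weighted correction debiases the ML predictions. The data, predictor, and numbers below are synthetic stand-ins, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: true labels Y, imperfect ML predictions Yhat = Y + bias + noise.
n = 20_000
Y = rng.normal(5.0, 2.0, size=n)
Yhat = Y + 0.8 + rng.normal(0.0, 1.0, size=n)   # systematically biased predictor

p = 0.1                                  # constant query probability
queried = rng.random(n) < p              # which labels we actually pay for

# Prediction-powered estimate of E[Y]: mean of the predictions, debiased
# by an inverse-probability-weighted correction computed only on the
# queried labels (unbiased because E[1{queried}/p] = 1).
correction = np.where(queried, (Y - Yhat) / p, 0.0)
pp_estimate = Yhat.mean() + correction.mean()

naive = Yhat.mean()                      # inherits the predictor's ~0.8 bias
```

Because the weighted correction is unbiased for the predictor's error, the estimate recovers the true mean even though the predictor is systematically off, while the fixed `p` keeps the labeling budget predictable — the regime the paper finds surprisingly competitive with uncertainty-adaptive querying.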
260421.0044
Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering
Gupta · Kumar
LPSR is an inference-time error correction method that monitors internal model activations to detect reasoning mistakes, then rolls back generation and steers the model using cached corrections—no training required. It improves an 8B model's math accuracy from 28.8% to 44.0%, outperforming prompted self-correction and even larger 70B models.
Formal Sciences
260421.0045
Benchmarking System Dynamics AI Assistants: Cloud Versus Local LLMs on CLD Extraction and Discussion
Leitch
This paper benchmarks cloud and local large language models on two System Dynamics tasks: extracting causal loop diagrams and providing interactive coaching. The best local models match mid-tier cloud performance on diagram extraction (77%) but struggle with long-context error-fixing tasks, with backend implementation choices mattering more than quantization levels.
Formal Sciences
260421.0055
Barrier-enforced multi-objective optimization for direct point and sharp interval forecasting
Amnuaypongsa · Suparanonrat · Wanitchollakit +1
This paper presents a neural network framework that jointly generates point forecasts and prediction intervals for multi-step time series forecasting. Multi-objective optimization automatically balances forecast accuracy against interval sharpness while guaranteeing non-crossing intervals and target coverage; the method eliminates manual hyperparameter tuning and outperforms existing approaches on solar irradiance forecasting.
Formal Sciences
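Two generic building blocks behind interval forecasters of this kind, sketched as a toy — the pinball (quantile) loss commonly used to train interval bounds, and a parameterization under which non-crossing holds by construction. This is an illustrative sketch, not the paper's exact barrier-enforced multi-objective formulation.

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    # Quantile (pinball) loss: penalizes under-prediction with weight tau
    # and over-prediction with weight (1 - tau).
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

def softplus(x):
    # Smooth, strictly positive map used to constrain the interval width.
    return np.log1p(np.exp(x))

# One common way to *guarantee* non-crossing intervals: have the network
# predict a point forecast and an unconstrained raw width, then derive
# bounds as point +/- softplus(raw_width), which is positive by design.
point = np.array([10.0, 12.0])
raw_width = np.array([-1.0, 0.5])        # unconstrained network output
lower = point - softplus(raw_width)
upper = point + softplus(raw_width)
```

With this parameterization the upper bound can never fall below the lower bound, so the non-crossing constraint needs no penalty term or post-hoc sorting.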
260421.0059
Multi-Scale Reversible Chaos Game Representation: A Unified Framework for Sequence Classification
Ali · Murad
MS-RCGR is a new method that converts biological sequences (DNA/protein) into multi-resolution geometric images without losing information, enabling better classification through traditional ML, computer vision, or hybrid approaches. The method consistently improves performance across different analysis paradigms and achieves best results when combined with protein language models.
Formal Sciences
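The method builds on the classical Chaos Game Representation (CGR), in which each DNA base maps to a corner of the unit square and each position is the midpoint between the previous point and the current base's corner. The sketch below is only this textbook forward map; the paper's multi-scale, reversible extension is not shown.

```python
import numpy as np

# Standard CGR corner assignment for the four DNA bases.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    # Iterated midpoint map: start at the center of the unit square and,
    # for each base, move halfway toward that base's corner.
    x, y = 0.5, 0.5
    pts = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        pts.append((x, y))
    return np.array(pts)

pts = cgr_points("ACGT")
```

Because every point encodes the full prefix that produced it (each successive base halves the scale), rasterizing these points at several resolutions yields the kind of multi-scale image a CNN or classical ML pipeline can classify.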
260421.0061
NI Sampling: Accelerating Discrete Diffusion Sampling by Token Order Optimization
Liu · Ning · Wang +1
This paper introduces Neural Indicator Sampling (NI Sampling), a method that optimizes the token sampling order in discrete diffusion language models to achieve up to 14.3× speedup over standard sampling while maintaining accuracy. The approach uses a trained neural indicator to intelligently select which tokens to sample at each step, dramatically reducing the number of required sampling iterations.
Formal Sciences
260421.0063
Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling
Yuan
This paper shows that applying Semantic Tube Prediction (STP) at reasoning step boundaries rather than at random token positions dramatically improves multi-step latent prediction in LLMs (a 168× gain versus 4×), revealing that sampling position is critical for the geometric regularization of reasoning trajectories.
Formal Sciences
260421.0071
Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning
Davidov · Cohen · Kalinsky +4
This paper develops a principled reinforcement learning framework for deciding when language models should stop generating text mid-response to avoid wasting compute on incorrect outputs. The approach uses value function estimation to determine optimal stopping points, improving accuracy-compute tradeoffs compared to existing methods.
Formal Sciences
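The core decision rule can be caricatured in a few lines: stop generating once an estimated value (here, the probability the continuation will end up correct) drops below the marginal compute cost of another step. The hard-coded value trace below is a stand-in for the learned value function the paper actually uses.

```python
def should_abstain(value_estimates, step_cost):
    # Return the first step index at which continuing is no longer worth
    # the compute (estimated value < cost of one more step), or None if
    # the model should generate to completion.
    for t, v in enumerate(value_estimates):
        if v < step_cost:
            return t
    return None

# Illustrative trace: the value falls as a reasoning chain derails.
trace = [0.9, 0.7, 0.4, 0.15, 0.05]
stop_at = should_abstain(trace, step_cost=0.2)
```

The interesting part of the paper is estimating those values mid-generation; given them, the stopping rule itself is a simple threshold comparison of expected reward against expected cost.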
260421.0075
Forecasting Ionospheric Irregularities on GNSS Lines of Sight Using Dynamic Graphs with Ephemeris Conditioning
Turkmen · Tan · Lee
This paper introduces IonoDGNN, a dynamic graph neural network that forecasts ionospheric irregularities by modeling satellite pierce points as graph nodes with time-varying connectivity, conditioned on predictable future satellite positions. The approach outperforms persistence baselines by 35-52% and enables forecasting on lines of sight that don't yet exist in the observation period.
Formal Sciences
260421.0079
Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference
Arruda · Chervet · Staudt +6
This paper develops a simulation-based Bayesian inference method that corrects for selection bias by embedding the selection mechanism directly into the generative model, enabling accurate parameter estimation in complex models where traditional likelihood-based approaches fail. The approach allows both debiased estimation and explicit testing for bias presence without requiring tractable likelihoods.
Formal Sciences
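The key move — embedding the selection mechanism inside the simulator — can be shown with a toy truncation model. The crude grid-matching step below is a stand-in for the paper's amortized neural posterior; the model, seeds, and tolerances are illustrative.

```python
import numpy as np

def simulate(mu, n, rng):
    # Generative model with the selection mechanism built in: draw from
    # N(mu, 1), but only observations above 0 are ever recorded.
    x = rng.normal(mu, 1.0, size=n)
    return x[x > 0.0]

observed = simulate(1.0, 50_000, np.random.default_rng(1))
naive = observed.mean()                  # biased upward by the selection

# Crude simulation-based inference: choose the mu whose simulated
# selected-sample mean best matches the observed one. Common random
# numbers (fixed seed per candidate) keep the comparison smooth.
grid = np.linspace(0.0, 2.0, 201)
sims = np.array([simulate(m, 50_000, np.random.default_rng(2)).mean()
                 for m in grid])
mu_hat = grid[np.argmin(np.abs(sims - naive))]
```

Because the simulator reproduces the selection, matching simulated to observed summaries recovers the true parameter even though the naive sample mean is badly biased — the same logic the paper scales up with amortized neural inference.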
260421.0082
Symmetry Guarantees Statistic Recovery in Variational Inference
Marks · Paccagnan · Wilk
This paper develops a general mathematical theory explaining how symmetries in target distributions and variational families guarantee that variational inference can accurately recover certain statistics, even when the approximation is imperfect. The framework unifies existing results and enables derivation of new recovery guarantees across diverse settings including directional statistics.
Formal Sciences
260421.0083
CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting
Tajeuna · Owusu · Brun +1
CAARL uses large language models to forecast co-evolving time series by converting temporal dependencies into narrative form, enabling interpretable chain-of-thought reasoning for predictions. The approach decomposes series into autoregressive segments and builds dependency graphs that LLMs can process as text, achieving competitive accuracy with enhanced transparency.
Formal Sciences