AutoXiv

Machine Learning Optimization Methods.

Research on optimization techniques for machine learning models, including reinforcement learning algorithms, model compression through quantization, and efficient embedding methods for large-scale systems.

9 papers

Papers.

260421.0040
Bounded Ratio Reinforcement Learning
Ao · Chen · Lee +5
This paper introduces Bounded Ratio Reinforcement Learning (BRRL), a theoretical framework that bridges the gap between trust region methods and PPO's clipped objective, leading to a new algorithm called Bounded Policy Optimization (BPO) that provides monotonic improvement guarantees while matching or exceeding PPO's performance. The framework also extends to Group-relative BPO (GBPO) for large language model fine-tuning.
Formal Sciences
260421.0046
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling
Dadgarnia · Tabesh · Nikdan +3
GSQ is a new scalar quantization method for large language models that uses Gumbel-Softmax relaxation to jointly optimize grid assignments and scales, achieving accuracy comparable to complex vector quantization methods while remaining compatible with existing inference kernels. It successfully quantizes models to 2-3 bits per parameter and scales to trillion-parameter mixture-of-experts models.
Formal Sciences
260421.0047
A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work
Ben-Basat · Ben-Itzhak · Mendelson +3
This note demonstrates that TurboQuant, a recent quantization method, is a suboptimal special case of the earlier EDEN/DRIVE quantization schemes. EDEN consistently outperforms TurboQuant across all tested scenarios, often by more than one bit of precision.
Formal Sciences
260421.0070
Spectral bandits for smooth graph functions
Valko · Munos · Kveton +1
This paper introduces a bandit algorithm for learning smooth functions over graphs, where neighboring nodes have similar payoffs. The approach achieves regret bounds that scale with an effective dimension rather than the total number of nodes, enabling efficient online learning for applications like content recommendation with thousands of items from just tens of evaluations.
Formal Sciences
260421.0072
Bridge-Centered Metapath Classification Using R-GCN-VGAE for Disaster-Resilient Maintenance Decisions
Yasuno
This paper develops a graph neural network approach to classify bridges based on their disaster-resilience roles by analyzing metapaths connecting highways, bridges, and critical buildings. The method helps prioritize bridge maintenance budgets by identifying which bridges are essential for supply chains, medical access, or residential protection during disasters.
Formal Sciences
260421.0078
Balanced Co-Clustering of Users and Items for Embedding Table Compression in Recommender Systems
Jiang · Yang · Wu
BACO is a framework that compresses embedding tables in recommender systems by clustering similar users and items so that they share embeddings, achieving over 75% parameter reduction with minimal accuracy loss. It runs up to 346× faster than existing methods while maintaining recommendation quality.
Formal Sciences
260421.0080
Predictive Modeling of Natural Medicinal Compounds for Alzheimer Disease Using Cheminformatics
Tirmizi · Hasnain · Faris +2
This study develops a machine learning model using molecular descriptors to screen over 7,000 natural compounds for potential anti-Alzheimer activity, identifying 73 promising candidates. The cheminformatics approach demonstrates moderate predictive performance and highlights key molecular features associated with therapeutic potential.
Formal Sciences
260421.0081
Scale-free adaptive planning for deterministic dynamics & discounted rewards
Bartlett · Gabillon · Healey +1
Platypoos is a planning algorithm for deterministic environments with stochastic rewards that automatically adapts to unknown reward scales and smoothness without requiring prior knowledge of discount factors or reward bounds. It achieves optimal sample complexity with matching upper and lower bounds.
Formal Sciences
260421.0085
Block-encodings as programming abstractions: The Eclipse Qrisp BlockEncoding Interface
Petrič · Zander
This paper introduces the BlockEncoding interface in the Eclipse Qrisp framework, which makes block-encoding techniques accessible as high-level programming abstractions for implementing advanced quantum algorithms. The interface simplifies the practical implementation and resource estimation of algorithms like QSVT, QSP, and Hamiltonian simulation.
Formal Sciences