AUTOXIV · CLUSTER

Machine Learning Optimization Methods.

Research on optimization techniques for machine learning models, including reinforcement learning algorithms, model compression through quantization, and efficient embedding methods for large-scale systems.

9 papers

Papers.

260421.0040

Bounded Ratio Reinforcement Learning

Ao · Chen · Lee +5

This paper introduces Bounded Ratio Reinforcement Learning (BRRL), a theoretical framework that bridges the gap between trust region methods and the clipped objective of Proximal Policy Optimization (PPO). It yields a new algorithm, Bounded Policy Optimization (BPO), that provides monotonic improvement guarantees while matching or exceeding PPO's performance. The framework also extends to Group-relative BPO (GBPO) for large language model fine-tuning.

Formal Sciences

260421.0046

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling

Dadgarnia · Tabesh · Nikdan +3

GSQ is a new scalar quantization method for large language models that uses Gumbel-Softmax relaxation to jointly optimize grid assignments and scales, achieving accuracy comparable to complex vector quantization methods while remaining compatible with existing inference kernels. It quantizes models to 2 to 3 bits per parameter and scales to trillion-parameter mixture-of-experts models.

Formal Sciences

260421.0047

A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work

Ben-Basat · Ben-Itzhak · Mendelson +3

This note demonstrates that TurboQuant, a recent quantization method, is a suboptimal special case of the earlier EDEN/DRIVE quantization schemes. EDEN consistently outperforms TurboQuant across all tested scenarios, often by more than one bit of precision.

Formal Sciences

260421.0070

Spectral bandits for smooth graph functions

Valko · Munos · Kveton +1

This paper introduces a bandit algorithm for learning smooth functions over graphs, where neighboring nodes have similar payoffs. The approach achieves regret bounds that scale with an effective dimension rather than the total number of nodes, enabling efficient online learning for applications like content recommendation with thousands of items from just tens of evaluations.

Formal Sciences

260421.0072

Bridge-Centered Metapath Classification Using R-GCN-VGAE for Disaster-Resilient Maintenance Decisions

Yasuno

This paper develops a graph neural network approach to classify bridges based on their disaster-resilience roles by analyzing metapaths connecting highways, bridges, and critical buildings. The method helps prioritize bridge maintenance budgets by identifying which bridges are essential for supply chains, medical access, or residential protection during disasters.

Formal Sciences

260421.0078

Balanced Co-Clustering of Users and Items for Embedding Table Compression in Recommender Systems

Jiang · Yang · Wu

BACO is a framework that compresses embedding tables in recommender systems by clustering similar users and items so that they share embeddings, achieving over 75% parameter reduction with minimal accuracy loss. It is up to 346× faster than existing methods while maintaining recommendation quality.

Formal Sciences

260421.0080

Predictive Modeling of Natural Medicinal Compounds for Alzheimer Disease Using Cheminformatics

Tirmizi · Hasnain · Faris +2

This study develops a machine learning model using molecular descriptors to screen over 7,000 natural compounds for potential anti-Alzheimer activity, identifying 73 promising candidates. The cheminformatics approach demonstrates moderate predictive performance and highlights key molecular features associated with therapeutic potential.

Formal Sciences

260421.0081

Scale-free adaptive planning for deterministic dynamics & discounted rewards

Bartlett · Gabillon · Healey +1

Platypoos is a planning algorithm for deterministic environments with stochastic rewards that automatically adapts to unknown reward scales and smoothness without requiring prior knowledge of discount factors or reward bounds. It achieves optimal sample complexity with matching upper and lower bounds.

Formal Sciences

260421.0085

Block-encodings as programming abstractions: The Eclipse Qrisp BlockEncoding Interface

Petrič · Zander

This paper introduces the BlockEncoding interface in the Eclipse Qrisp framework, which makes block-encoding techniques accessible as high-level programming abstractions for implementing advanced quantum algorithms. The interface simplifies the practical implementation and resource estimation of algorithms such as quantum singular value transformation (QSVT), quantum signal processing (QSP), and Hamiltonian simulation.

Formal Sciences