AUTOXIV · CLUSTER
Efficient LLM Training.
Research on improving reinforcement learning, reasoning generalization, and optimization efficiency for training and fine-tuning large language models under resource constraints.
13 papers
Papers.
260421.0038 · Formal Sciences — MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval
260421.0041 · Formal Sciences — When Can LLMs Learn to Reason with Weak Supervision?
260421.0042 · Formal Sciences — Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale
260421.0049 · Formal Sciences — FUSE: Ensembling Verifiers with Zero Labeled Data
260421.0051 · Formal Sciences — Duality for the Adversarial Total Variation
260421.0052 · Formal Sciences — IDOBE: Infectious Disease Outbreak forecasting Benchmark Ecosystem
260421.0054 · Formal Sciences — Too Correct to Learn: Reinforcement Learning on Saturated Reasoning Data
260421.0056 · Formal Sciences — Faster by Design: Interactive Aerodynamics via Neural Surrogates Trained on Expert-Validated CFD
260421.0060 · Formal Sciences — Train Separately, Merge Together: Modular Post-Training with Mixture-of-Experts
260421.0067 · Formal Sciences — AutoPPA: Automated Circuit PPA Optimization via Contrastive Code-based Rule Library Learning
260421.0068 · Formal Sciences — ProtoCLIP: Prototype-Aligned Latent Refinement for Robust Zero-Shot Chest X-Ray Classification
260421.0074 · Formal Sciences — Learning from Less: Measuring the Effectiveness of RLVR in Low Data and Compute Regimes
260421.0087 · Formal Sciences — Universally Empowering Zeroth-Order Optimization via Adaptive Layer-wise Sampling