✨ TL;DR
BACO is a framework that compresses embedding tables in recommender systems by clustering similar users and items so that they share embeddings, achieving over 75% parameter reduction with minimal accuracy loss. It is also up to 346× faster than existing compression methods while maintaining recommendation quality.
Modern recommender systems rely on dense embedding vectors for users and items, but at industrial scale these embedding tables require enormous numbers of parameters. This creates substantial computational and memory overhead during both training and inference, which makes deployment difficult under resource constraints. Existing compression approaches face a critical trade-off: they either severely degrade recommendation accuracy or incur prohibitive computational costs, so they are impractical for real-world applications.
BACO compresses embeddings by exploiting collaborative signals in user-item interactions to group similar users and items, which then share the same embeddings from a smaller codebook. The method formulates a balanced co-clustering objective that maximizes connectivity within clusters while keeping cluster sizes balanced to prevent codebook collapse. The framework unifies canonical graph clustering techniques and combines three components to produce effective groupings while avoiding degenerate solutions: a principled weighting scheme for users and items, an efficient label propagation solver, and secondary user clusters.
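To make the idea of clustering users and items into a shared codebook concrete, here is a minimal sketch in plain Python. It is not BACO's actual objective or solver: the function names (`balanced_co_cluster`, `shared_embedding`), the size penalty `gamma`, and the scoring rule "neighbor votes minus `gamma` times cluster size" are all assumptions chosen for illustration, and the sketch omits the user/item weighting scheme and the secondary user clusters described above.

```python
import random
from collections import Counter


def balanced_co_cluster(edges, n_users, n_items, n_clusters,
                        gamma=0.5, n_iters=10, seed=0):
    """Toy size-penalised label propagation on a bipartite user-item graph.

    Hypothetical illustration only. Users are nodes 0..n_users-1 and
    items are nodes n_users..n_users+n_items-1, so both sides share
    one label space (co-clustering).
    """
    rng = random.Random(seed)
    n = n_users + n_items

    # Build adjacency lists from (user, item) interaction pairs.
    neighbors = [[] for _ in range(n)]
    for u, i in edges:
        neighbors[u].append(n_users + i)
        neighbors[n_users + i].append(u)

    # Random initial assignment; track cluster sizes for the balance term.
    labels = [rng.randrange(n_clusters) for _ in range(n)]
    sizes = Counter(labels)

    for _ in range(n_iters):
        order = list(range(n))
        rng.shuffle(order)
        for v in order:
            sizes[labels[v]] -= 1  # take v out of its current cluster
            votes = Counter(labels[w] for w in neighbors[v])
            # Reward clusters that hold many neighbors (connectivity),
            # penalise clusters that are already large (balance).
            best = max(range(n_clusters),
                       key=lambda c: votes[c] - gamma * sizes[c])
            labels[v] = best
            sizes[best] += 1

    return labels[:n_users], labels[n_users:]


def shared_embedding(codebook, cluster_ids, idx):
    """Look up an entity's embedding through its cluster id."""
    return codebook[cluster_ids[idx]]


# Usage: 6 users and 6 items share a 2-row codebook instead of 12 rows.
edges = [(u, i) for u in range(3) for i in range(3)]
edges += [(u, i) for u in range(3, 6) for i in range(3, 6)]
user_clusters, item_clusters = balanced_co_cluster(edges, 6, 6, n_clusters=2)
codebook = [[0.1, 0.2], [0.3, 0.4]]  # one (normally learned) vector per cluster
vec = shared_embedding(codebook, user_clusters, 0)
```

The size penalty is what discourages every node from drifting into a single cluster, which is the degenerate outcome the balance term in the text is meant to prevent. The compression comes from storing one vector per cluster plus an integer cluster id per entity, rather than one vector per user and per item.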