Scalable In-context Ranking with Generative Models

BlockRank imposes blockwise sparse attention and leverages query-token attention signals for efficient in-context ranking

NeurIPS 2025 · 2 min · Nilesh Gupta, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Inderjit S. Dhillon, Felix Yu

ELIAS: End-to-end Learning to Search and Index in Large Output Spaces

Learnable graph-based search index for classification/retrieval in large output spaces; scalable to label space on a single A100 GPU, and achieves SOTA on multiple large-scale extreme classification benchmarks

NeurIPS 2022 · 2 min · Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon