Nilesh Gupta

I am a CS PhD student at UT Austin, advised by Prof. Inderjit Dhillon. I also work closely with Prateek Jain at Google DeepMind. My current research interests include End-to-end Information Retrieval and its applications to Efficient Large Language Models. Before starting my PhD, I spent two years at Microsoft Research India working with Dr. Manik Varma on algorithms and applications of Extreme Classification. I completed my undergraduate degree with Honours in CS at IIT Bombay.
  • Interning at Google Research New York with Felix Yu for Fall 2024 on In-context Information Retrieval with LLMs


Publications

Exploring Design Choices for Building Language-Specific LLMs

This paper examines how adapting LLMs with vocabulary extension and pretraining improves efficiency and performance across languages.

EMNLP 2024 · 1 min · Atula Tejaswi*, Nilesh Gupta*, Eunsol Choi

EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

Dense embedding-based retrieval is widely used for semantic search and ranking. However, conventional two-stage approaches, involving contrastive embedding learning followed by approximate nearest neighbor search (ANNS), can suffer from misalignment between these stages. This mismatch degrades retrieval performance. We propose End-to-end Hierarchical Indexing (EHI), a novel method that directly addresses this issue by jointly optimizing embedding generation and ANNS structure. EHI leverages a dual encoder for embedding queries and documents while simultaneously learning an inverted file index (IVF)-style tree structure....

TMLR 2024 · 1 min · Ramnath Kumar, Anshul Mittal, Nilesh Gupta, Aditya Kusupati, Inderjit S. Dhillon, Prateek Jain

Dual-encoders for Extreme Multi-label Classification

A parameter-efficient, encoder-only model for multi-shot retrieval (also known as extreme classification).

ICLR 2024 · 2 min · Nilesh Gupta, Devvrit Khatri, Srinadh Bhojanapalli, Ankit S. Rawat, Prateek Jain, Inderjit S. Dhillon

NGAME: Negative Mining aware Mini-batching for Extreme Classification

A lightweight mini-batch creation technique that offers provably accurate in-batch negative samples for training retrieval models. This allows training with larger mini-batches, giving significantly faster convergence and higher accuracy than existing negative sampling techniques.

WSDM 2023 · 2 min · Kunal Dahiya*, Nilesh Gupta, Deepak Saini, Akshay Soni, Yajun Wang, Kushal Dave, Jian Jiao, Gururaj K, Prasenjit Dey, Amit Singh, Deepesh Hada, Vidit Jain, Bhawna Paliwal, Anshul Mittal, Sonu Mehta, Ramachandran Ramjee, Sumeet Agarwal, Purushottam Kar, Manik Varma

ELIAS: End-to-end Learning to Search and Index in Large Output Spaces

A learnable graph-based search index for classification and retrieval in large output spaces. It scales to large label spaces on a single A100 GPU and achieves state-of-the-art results on multiple large-scale extreme classification benchmarks.

NeurIPS 2022 · 2 min · Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon

Generalized Zero-shot Extreme Multi-label Classification

This paper proposes Generalized Zero-shot XML (GZXML), a paradigm where the task is to tag a data point with the most relevant labels from a large universe of both seen and unseen labels.

KDD 2021 · 2 min · Nilesh Gupta, Sakina Bohra, Yashoteja Prabhu, Saurabh Purohit, Manik Varma

Extreme Regression for Dynamic Search Advertising

This paper introduces a new learning paradigm called eXtreme Regression (XR) whose objective is to accurately predict the numerical degrees of relevance of an extremely large number of labels to a data point. XR can provide elegant solutions to many large-scale ranking and recommendation applications including Dynamic Search Advertising (DSA).

WSDM 2020 · 2 min · Yashoteja Prabhu, Aditya Kusupati, Nilesh Gupta, Manik Varma