I am a CS PhD student at UT Austin, advised by Prof. Inderjit Dhillon. I also work closely with Prateek Jain at Google DeepMind. My current research interests include End-to-end Information Retrieval and its applications to Efficient Large Language Models. Before starting my PhD, I spent two years at Microsoft Research India working with Dr. Manik Varma on algorithms and applications of Extreme Classification. I completed my undergraduate degree with Honours in CS at IIT Bombay.

Interning at Google Research New York with Felix Yu in Fall 2024, working on In-context Information Retrieval with LLMs.

Publications

Exploring Design Choices for Building Language-Specific LLMs

This paper examines how adapting LLMs with vocabulary extension and pretraining improves efficiency and performance across languages.

EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

Dense embedding-based retrieval is widely used for semantic search and ranking. However, conventional two-stage approaches, involving contrastive embedding learning followed by approximate nearest neighbor search (ANNS), can suffer from misalignment between these stages. This mismatch degrades retrieval performance. We propose End-to-end Hierarchical Indexing (EHI), a novel method that directly addresses this issue by jointly optimizing embedding generation and ANNS structure. EHI leverages a dual encoder for embedding queries and documents while simultaneously learning an inverted file index (IVF)-style tree structure....

Paper

Bibtex

OpenReview

Dual-encoders for Extreme Multi-label Classification

A parameter-efficient, encoder-only model for multi-shot retrieval (aka extreme classification).

NGAME: Negative Mining aware Mini-batching for Extreme Classification

A lightweight mini-batch creation technique that offers provably accurate in-batch negative samples for training retrieval models. This allows training with larger mini-batches, offering significantly faster convergence and higher accuracy than existing negative sampling techniques.

Paper

Code

Bibtex

ELIAS: End-to-end Learning to Search and Index in Large Output Spaces

A learnable graph-based search index for classification/retrieval in large output spaces, scalable to label space on a single A100 GPU, that achieves SOTA on multiple large-scale extreme classification benchmarks.

Generalized Zero-shot Extreme Multi-label Classification

This paper proposes Generalized Zero-shot XML (GZXML), a paradigm where the task is to tag a data point with the most relevant labels from a large universe of both seen and unseen labels.

Paper

Code

Bibtex

Extreme Regression for Dynamic Search Advertising

This paper introduces a new learning paradigm called eXtreme Regression (XR) whose objective is to accurately predict the numerical degrees of relevance of an extremely large number of labels to a data point. XR can provide elegant solutions to many large-scale ranking and recommendation applications including Dynamic Search Advertising (DSA).

Paper

Code

Bibtex