LLM | Nilesh Gupta

LLM-guided Hierarchical Retrieval

LATTICE turns retrieval into an LLM-driven navigation problem over a semantic scaffold, giving it the computational tractability needed for large corpora.

BlockRank imposes blockwise sparse attention and leverages query-token attention signals for efficient in-context ranking.

This paper examines how adapting LLMs with vocabulary extension and pretraining improves efficiency and performance across languages.