xlite-dev/Awesome-LLM-Inference
A comprehensive, curated list of research papers and associated code for optimizing Large Language Model (LLM) and Vision Language Model (VLM) inference.
Quick Start
python3 download_pdfs.py
Detailed Introduction
Awesome-LLM-Inference is an open-source repository that curates research papers and their corresponding code implementations focused on improving the efficiency and performance of Large Language Model (LLM) and Vision Language Model (VLM) inference. It serves as a central hub for researchers and practitioners exploring topics such as attention mechanisms, quantization, parallelism, and KV cache optimization. The project aims to make cutting-edge advances in LLM inference easy to find, offering a structured overview of the field's most impactful contributions.
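Putting the Quick Start together, a minimal sketch of how one might use the repository locally. The clone URL is assumed from the owner/name above, and the assumption that `download_pdfs.py` fetches PDFs of the listed papers is inferred from its name; check the repository's own README for the authoritative steps.

```shell
# Clone the repository (URL assumed from the xlite-dev/Awesome-LLM-Inference name above)
git clone https://github.com/xlite-dev/Awesome-LLM-Inference.git
cd Awesome-LLM-Inference

# Run the Quick Start script, which presumably downloads
# the PDFs of the curated papers into the working tree
python3 download_pdfs.py
```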