LLaVA-Mini 👏 is an efficient large multimodal model (LMM) for image/video understanding using one vision token, offering: (1) ⏩ lower latency (as fast as 40 ms per image) and (2) 🖥️ lower VRAM usage (supports 3-hour video understanding on a 24 GB GPU). Paper: Code & Demo:
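
To give a rough sense of why one vision token matters for long videos, here is a back-of-the-envelope sketch in Python. It is illustrative only. The 1 frame-per-second sampling rate and the 576-tokens-per-frame baseline (a 24×24 patch grid, as in LLaVA-1.5-style encoders) are assumptions, not figures stated above, and `vision_tokens` is a hypothetical helper rather than part of the LLaVA-Mini code.

```python
def vision_tokens(duration_s: float, fps: float, tokens_per_frame: int) -> int:
    """Total vision tokens for a video sampled at `fps` frames per second."""
    frames = int(duration_s * fps)
    return frames * tokens_per_frame


three_hours = 3 * 60 * 60  # video length in seconds

# Assumption: frames are sampled at 1 fps (not stated in the text above).
one_token = vision_tokens(three_hours, fps=1.0, tokens_per_frame=1)

# Assumption: a baseline encoder that emits 576 tokens per frame.
baseline = vision_tokens(three_hours, fps=1.0, tokens_per_frame=576)

print(f"one token per frame:  {one_token:,} vision tokens")
print(f"576 tokens per frame: {baseline:,} vision tokens")
```

Under these assumptions the token count fed to the language model drops by a factor of 576, which is the kind of reduction that would make a multi-hour context plausible on a single 24 GB GPU. Actual memory use also depends on model size, precision, and KV-cache handling.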