LLaVA-Mini👏is an efficient LMM for image/video understanding using one vision token, offering: (1)⏩lower latency (fast as 40ms per image), (2)🖥️less VRAM usage (support 3-hour video understanding on 2 - x - news.news