Click to view multi-image results on Mantis Eval, BLINK Val, Mathverse mv, Sciverse mv, MIRB.
Click to view video results on Video-MME and Video-ChatGPT.
Click to view few-shot results on TextVQA, VizWiz, VQAv2, OK-VQA.
Examples
https://github.com/OpenBMB/MiniCPM-V/raw/main/assets/minicpmv2_6/multi_img-bike.png
https://github.com/OpenBMB/MiniCPM-V/raw/main/assets/minicpmv2_6/multi_img-code.png
https://github.com/OpenBMB/MiniCPM-V/raw/main/assets/minicpmv2_6/ICL-Mem.png
We deploy MiniCPM-V 2.6 on end devices. The demo videos are raw, unedited screen recordings on an iPad Pro.
https://github.com/OpenBMB/MiniCPM-V/raw/main/assets/gif_cases/ai.gif
https://github.com/OpenBMB/MiniCPM-V/raw/main/assets/gif_cases/ticket.gif
https://cdn-uploads.huggingface.co/production/uploads/64abc4aa6cadc7aca585dddf/mXAEFQFqNd4nnvPk7r5eX.mp4
Demo
Click here to try the Demo of MiniCPM-V 2.6.