New multimodal release: Idefics3!
Adding vision to Llama 3.1 8B 👀
Strong improvement over April's Idefics2: +14 points on DocVQA, +6 points on MathVista 🧠
Interleave up to 60 images with text! 🤯
Comparable performance to the unreleased Llama 3.1 8B multimodal 🦾
8B parameters: runs natively on a single A100 🤏
Open license: Apache 2.0 🤗
Transparent training data: ethically sourced datasets, built for the community 🥳
Use it today with our branch of transformers and our open weights: https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3
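For a rough sense of why an 8B model fits on one A100, here is a back-of-the-envelope sketch of weight memory. It assumes the "8B" figure at face value (the exact parameter count may differ) and counts weights only; activations, the KV cache, and long inputs with many interleaved images need extra headroom on top of this.

```python
# Back-of-the-envelope weight-memory estimate.
# Assumptions (not from the release notes): a nominal 8e9 parameters,
# and 4 bytes per float32 value / 2 bytes per bfloat16 value.
NOMINAL_PARAMS = 8e9

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the memory taken by the weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    gb = weight_memory_gb(NOMINAL_PARAMS, dtype)
    print(f"{dtype}: ~{gb:.0f} GB of weights")
```

Under these assumptions the weights take about 16 GB in bfloat16, which sits within both the 40 GB and 80 GB A100 variants.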