Share and discover more about AI with social posts from the community.
💪 Strong OCR Capability and Others. MiniCPM-V 2.6 can process images with any aspect ratio and up to 1.8 million pixels (e.g., 1344x1344). It achieves state-of-the-art performance on OCRBench, surpassing proprietary models such as GPT-4o, GPT-4V, and Gemini 1.5 Pro. Based on the latest RLAIF-V and VisCPM techniques, it features trustworthy behaviors, with significantly lower hallucination rates than GPT-4o and GPT-4V on Object HalBench, and supports multilingual capabilities in English, Chinese, German, French, Italian, Korean, etc.
🎬 Video Understanding. MiniCPM-V 2.6 can also accept video inputs, performing conversation and providing dense captions for spatial-temporal information. It outperforms GPT-4V, Claude 3.5 Sonnet and LLaVA-NeXT-Video-34B on Video-MME with/without subtitles.
🖼 Multi Image Understanding and In-context Learning. MiniCPM-V 2.6 can also perform conversation and reasoning over multiple images. It achieves state-of-the-art performance on popular multi-image benchmarks such as Mantis-Eval, BLINK, Mathverse mv and Sciverse mv, and also shows promising in-context learning capability.
🔥 Leading Performance. MiniCPM-V 2.6 achieves an average score of 65.2 on the latest version of OpenCompass, a comprehensive evaluation over 8 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models like GPT-4o mini, GPT-4V, Gemini 1.5 Pro, and Claude 3.5 Sonnet for single image understanding.
A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
GitHub | Demo
MiniCPM-V 2.6
MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-Llama3-V 2.5, and introduces new features for multi-image and video understanding. Notable features of MiniCPM-V 2.6 include:
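The post above quotes a pixel budget of up to 1.8 million pixels (e.g., 1344x1344) per image. A minimal sketch of a client-side helper that downscales oversized images to fit that budget before sending them to the model; the `fit_to_pixel_budget` helper is illustrative, not part of the MiniCPM-V API:

```python
import math

MAX_PIXELS = 1344 * 1344  # ~1.8 million pixels, the budget quoted for MiniCPM-V 2.6

def fit_to_pixel_budget(width: int, height: int, max_pixels: int = MAX_PIXELS) -> tuple[int, int]:
    """Scale (width, height) down so width * height <= max_pixels, keeping aspect ratio."""
    if width * height <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, int(width * scale)), max(1, int(height * scale))

# A 1344x1344 image is already within budget and passes through unchanged:
print(fit_to_pixel_budget(1344, 1344))  # → (1344, 1344)
# A 12-megapixel photo is downscaled while preserving its 4:3 aspect ratio:
print(fit_to_pixel_budget(4000, 3000))
```

In practice you would apply the returned size with your image library of choice (e.g., Pillow's `Image.resize`) before building the model input.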
New SmolLM-1.7B-Instruct
SmolLM is a series of small language models available in three sizes: 135M, 360M, and 1.7B parameters.
These models are pre-trained on SmolLM-Corpus, a curated collection of high-quality educational and synthetic data designed for training LLMs. For further details, see our blog post.
To build SmolLM-Instruct, we finetuned the base models on publicly available datasets.
https://huggingface.co/HuggingFaceTB/SmolLM-1.7B-Instruct-v0.2
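A sketch of how a prompt for an instruct model like SmolLM-1.7B-Instruct might be assembled, assuming a ChatML-style template (an assumption here; in practice, prefer `tokenizer.apply_chat_template` from transformers, which reads the template shipped with the checkpoint instead of hard-coding it):

```python
# Builds a ChatML-style prompt string from a list of chat messages.
# The <|im_start|>/<|im_end|> markers are assumed, not read from the checkpoint.

def build_prompt(messages: list[dict]) -> str:
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation continues from here
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "What is the capital of France?"}])
print(prompt)
```

The trailing `<|im_start|>assistant\n` is what cues the model to produce the assistant turn rather than continuing the user message.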
1/4 Reproducing research results in ML is hard: no code, vague descriptions, noisy results. A lot of effort at @huggingface goes into making new methods available for the community, so we wrote a blog post on the challenges and strategies, using @GoogleAI's Infini-Attention as an example.
2/4 We attempted to reproduce Infini-Attention and found it generates content related to earlier segments, but it isn’t good enough to recall the needle in the haystack. We also faced convergence issues and wanted to share how we debugged them.
Link: http://huggingface.co/blog/infini-attention
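The "needle in the haystack" recall test mentioned in 2/4 can be sketched as follows: plant a unique fact at a known depth in a long filler context, prompt the model with the context plus a question, and check whether the generation contains the fact. Here the model call is stubbed out; `make_haystack` and `recalled` are illustrative helpers, not part of any evaluation library:

```python
import random

def make_haystack(needle: str, depth: float, n_filler: int = 1000, seed: int = 0) -> str:
    """Insert `needle` at fractional `depth` (0.0 = start, 1.0 = end) of filler text."""
    rng = random.Random(seed)
    filler = [f"Fact {i}: the sky was clear on day {rng.randint(1, 365)}." for i in range(n_filler)]
    filler.insert(int(depth * len(filler)), needle)
    return " ".join(filler)

def recalled(answer: str, secret: str) -> bool:
    """Score a generation: did the model surface the planted fact?"""
    return secret.lower() in answer.lower()

haystack = make_haystack("The secret passphrase is MAGENTA-42.", depth=0.25)
# A real run would feed `haystack` plus "What is the secret passphrase?" to the
# model and score the generation; a stubbed answer shows the scoring:
print(recalled("The passphrase is magenta-42.", "MAGENTA-42"))  # → True
```

Sweeping `depth` and the context length then gives the familiar recall heatmap used in long-context evaluations.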
SmolLM Instruct v0.2 - 135M, 360M and 1.7B parameter instruction-tuned Small LMs, Apache 2.0 licensed. Closing the gap to bring intelligence closer to thought (<500 ms per generation)! 🔥
The models are optimised to run on-device with WebGPU support (from MLC and ONNX Runtime) and llama.cpp.
Run them on your Mac, browser, GPU, CPU - it works blazingly fast.
We provide already converted/quantised checkpoints - GGUF, MLC and ONNX 🐐
What's new?
We train SmolLM base models on a new synthetic dataset of 2,000 simple everyday conversations generated with llama3.1-70B -> everyday-conversations-llama3.1-2k, and existing datasets like Magpie-Pro-300K-Filtered by @argilla_io, self-oss-instruct-sc2-exec-filter-50k, and a small subset of OpenHermes-2.5 from @NousResearch.
Bonus: We release the fine-tuning scripts we used to train these checkpoints, so that you can fine-tune them for your own use-cases too. ⚡️
Enjoy! We're looking forward to what you build with these: https://huggingface.co/collections/HuggingFaceTB/local-smollms-66c0f3b2a15b4eed7fb198d0
How good are you at spotting AI-generated images?
Find out by playing Fake Insects 🐞, a game where you need to identify which insects are fake (AI-generated). Good luck & share your best score in the comments!
victor/fake-insects
Brief backstory: Before diving into AI, I spent over a decade working in ecological fields such as the conservation corps, biodynamic farming, and natural habitat restoration. This background instilled in me a deep concern about the environmental impact of scaling AI without sustainable practices.
Driven by this concern, I've spent months planning and experimenting to make my AI work more eco-friendly. I'm thrilled to announce that I've successfully transitioned my entire operation to run on 100% sustainable solar power!
My current setup includes multiple linked Mac Pro tower desktops and custom code built from open-source libraries. While it's a bit experimental, this configuration is working great for my needs. All my LLM research, development, and client services now run exclusively on solar energy.
I'm curious if anyone else here has experimented with renewable energy for their LLM work?
For those interested in more details, I've written a brief blog post about this journey here https://medium.com/@betalabsllm/powering-the-future-be-ta-labs-revolutionary-100-solar-powered-ai-operation-444433e61d43
Ghost 8B Beta (1608) is a top-performing language model with unmatched multilingual support and cost efficiency.
Key Highlights:
- Superior Performance: Outperforms Llama 3.1 8B Instruct, GPT-3.5 Turbo, Claude 3 Opus, GPT-4, and more in winrate scores.
- Expanded Language Support: Now supports 16 languages, including English, Vietnamese, Spanish, Chinese, and more.
- Enhanced Capabilities: Improved math, reasoning, and instruction-following for better task handling.
With two context options (8k and 128k), Ghost 8B Beta is perfect for complex, multilingual applications, balancing power and cost-effectiveness.
🔗 Learn More: https://ghost-x.org/docs/models/ghost-8b-beta
ghost-x/ghost-8b-beta-668ead6179f93be717db4542
Bensbites
AI won’t take your job,
when you know how to use it.
Simple, bite-sized education designed to boost your AI knowledge—from beginner to pro.
Join 100k+ subscribers staying up to date on AI with our daily digest.
https://bensbites.com/?via=aidevin
I made my first Flux Lora style on Civitai : r/StableDiffusion
New Makima flux lora | image created by WhiteZ | Tensor.Art
flux_makima, woman, collared shirt, white shirt, black necktie, black pants, red hair, single braid, in the office holding a sign with the text: "tensor art", smiling evilly , pixiv