Share and discover more about AI with social posts from the community.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
GitHub | Demo

MiniCPM-V 2.6
MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-Llama3-V 2.5, and introduces new features for multi-image and video understanding. Notable features of MiniCPM-V 2.6 include:
🔥 Leading Performance. MiniCPM-V 2.6 achieves an average score of 65.2 on the latest version of OpenCompass, a comprehensive evaluation of 8 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models such as GPT-4o mini, GPT-4V, Gemini 1.5 Pro, and Claude 3.5 Sonnet for single-image understanding.
🖼 Multi Image Understanding and In-context Learning. MiniCPM-V 2.6 can also perform conversation and reasoning over multiple images. It achieves state-of-the-art performance on popular multi-image benchmarks such as Mantis-Eval, BLINK, Mathverse mv and Sciverse mv, and also shows promising in-context learning capability.
🎬 Video Understanding. MiniCPM-V 2.6 can also accept video inputs, performing conversation and providing dense captions for spatio-temporal information. It outperforms GPT-4V, Claude 3.5 Sonnet and LLaVA-NeXT-Video-34B on Video-MME with and without subtitles.
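
To try it out: the model card exposes the weights through transformers with trust_remote_code and a custom chat() helper. The sketch below assumes that interface (the chat() signature and message format are assumptions based on the model card and may differ across revisions), shown for single-image chat:

```python
# Minimal single-image chat sketch for MiniCPM-V 2.6.
# The chat() helper and message format are assumptions -- check the
# openbmb/MiniCPM-V-2_6 model card for the exact trust_remote_code API.
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-V-2_6"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True,
                                  torch_dtype=torch.bfloat16).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": [image, "Describe this image in detail."]}]

# The custom chat() method handles image preprocessing and generation internally.
answer = model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
print(answer)
```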
New SmolLM-1.7B-Instruct
SmolLM is a series of small language models available in three sizes: 135M, 360M, and 1.7B parameters.

These models are pre-trained on SmolLM-Corpus, a curated collection of high-quality educational and synthetic data designed for training LLMs. For further details, see our blog post.

To build SmolLM-Instruct, we finetuned the base models on publicly available datasets.
https://huggingface.co/HuggingFaceTB/SmolLM-1.7B-Instruct-v0.2
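
As a quick way to try it, the instruct checkpoints work with the standard transformers chat-template flow; a minimal sketch (the model id is taken from the link above, the generation settings are illustrative):

```python
# Minimal sketch: chat with SmolLM-1.7B-Instruct-v0.2 via transformers.
# Generation settings are illustrative; adjust device/dtype for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-1.7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "List three uses of small on-device LLMs."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```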
1/4 Reproducing research results in ML is hard: no code, vague descriptions, noisy results. A lot of effort at @huggingface goes into making new methods available for the community, so we wrote a blog post about the challenges and strategies, using @GoogleAI's Infini-Attention as an example.
2/4 We attempted to reproduce Infini-Attention and found that it generates content related to earlier segments, but not reliably enough to recall the needle in a needle-in-a-haystack test. We also faced convergence issues and wanted to share how we debugged them.

Link: http://huggingface.co/blog/infini-attention
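
For readers unfamiliar with the method: Infini-Attention augments each attention layer with a compressive memory that is updated segment by segment and read with a linear-attention lookup. Below is a minimal single-head sketch of that memory (illustrative only, not the blog's training code; the ELU+1 feature map and update rule follow the paper):

```python
# Minimal single-head sketch of Infini-Attention's compressive memory.
# read:   A_mem = (sigma(Q) @ M) / (sigma(Q) @ z)
# update: M <- M + sigma(K)^T @ V,   z <- z + sum_t sigma(K_t)
import torch
import torch.nn.functional as F

def sigma(x):                        # ELU + 1 feature map from the paper
    return F.elu(x) + 1.0

def memory_read(q, M, z):
    # q: (seg_len, d_k), M: (d_k, d_v), z: (d_k,)
    q = sigma(q)
    return (q @ M) / (q @ z).clamp_min(1e-6).unsqueeze(-1)

def memory_update(k, v, M, z):
    k = sigma(k)                     # (seg_len, d_k)
    return M + k.T @ v, z + k.sum(dim=0)

d_k, d_v, seg_len = 64, 64, 128
M = torch.zeros(d_k, d_v)            # compressive memory
z = torch.zeros(d_k)                 # normalization term

for segment in torch.randn(4, seg_len, d_k):   # toy stream of 4 segments
    q = k = segment                  # in practice q/k/v come from separate projections
    v = torch.randn(seg_len, d_v)
    mem_out = memory_read(q, M, z)   # combined with local softmax attention in the paper
    M, z = memory_update(k, v, M, z)
```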
SmolLM Instruct v0.2 - 135M, 360M and 1.7B parameter instruction-tuned small LMs, Apache 2.0 licensed. Closing the gap to bring intelligence closer to thought (<500 ms per generation)! 🔥

The models are optimised to run on-device with WebGPU support (from MLC and ONNX Runtime) and llama.cpp.

Run them on your Mac, browser, GPU, CPU - it works blazingly fast.

We provide already-converted and quantised checkpoints: GGUF, MLC and ONNX 🐐
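
If you want to try one of the GGUF checkpoints locally, one option is llama-cpp-python; a minimal sketch (the GGUF filename below is a placeholder for whichever quantisation you downloaded):

```python
# Minimal sketch: run a SmolLM-Instruct GGUF locally with llama-cpp-python.
# The model_path filename is a placeholder -- point it at the quantised GGUF
# file you downloaded.
from llama_cpp import Llama

llm = Llama(model_path="smollm-1.7b-instruct-v0.2.Q4_K_M.gguf", n_ctx=2048)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line fun fact about bees."}],
    max_tokens=64,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```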

What's new?

We trained the SmolLM base models on a new synthetic dataset of 2,000 simple everyday conversations generated with llama3.1-70B -> everyday-conversations-llama3.1-2k

and existing datasets like Magpie-Pro-300K-Filtered by @argilla_io, self-oss-instruct-sc2-exec-filter-50k, and a small subset of OpenHermes-2.5 from @NousResearch.

Bonus: We release the fine-tuning scripts we used to train these checkpoints, so that you can fine-tune them for your own use-cases too. ⚡️

Enjoy! Looking forward to what you build with these: https://huggingface.co/collections/HuggingFaceTB/local-smollms-66c0f3b2a15b4eed7fb198d0
How good are you at spotting AI-generated images?

Find out by playing Fake Insects 🐞, a game where you need to identify which insects are fake (AI-generated). Good luck & share your best score in the comments!

victor/fake-insects
Brief backstory: Before diving into AI, I spent over a decade working in ecological fields such as the conservation corps, biodynamic farming, and natural habitat restoration. This background instilled in me a deep concern about the environmental impact of scaling AI without sustainable practices.

Driven by this concern, I've spent months planning and experimenting to make my AI work more eco-friendly. I'm thrilled to announce that I've successfully transitioned my entire operation to run on 100% sustainable solar power!

My current setup includes multiple linked Mac Pro tower desktops and custom code built from open-source libraries. While it's a bit experimental, this configuration is working great for my needs. All my LLM research, development, and client services now run exclusively on solar energy.

I'm curious if anyone else here has experimented with renewable energy for their LLM work?

For those interested in more details, I've written a brief blog post about this journey here: https://medium.com/@betalabsllm/powering-the-future-be-ta-labs-revolutionary-100-solar-powered-ai-operation-444433e61d43
Ghost 8B Beta (1608) is a top-performing language model with unmatched multilingual support and cost efficiency.

Key Highlights:
- Superior Performance: Outperforms Llama 3.1 8B Instruct, GPT-3.5 Turbo, Claude 3 Opus, GPT-4, and more in winrate scores.
- Expanded Language Support: Now supports 16 languages, including English, Vietnamese, Spanish, Chinese, and more.
- Enhanced Capabilities: Improved math, reasoning, and instruction-following for better task handling.

With two context options (8k and 128k), Ghost 8B Beta is perfect for complex, multilingual applications, balancing power and cost-effectiveness.

🔗 Learn More: https://ghost-x.org/docs/models/ghost-8b-beta
ghost-x/ghost-8b-beta-668ead6179f93be717db4542
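
For a quick test drive, the checkpoints in the collection above should load with the standard transformers causal-LM flow; a minimal sketch (the repo id below is an assumption inferred from the collection name; check the linked docs for the exact model id and chat template):

```python
# Minimal sketch: load a Ghost 8B Beta checkpoint with transformers.
# NOTE: the repo id below is an assumption -- use the exact model id listed
# in the collection/docs linked above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ghost-x/ghost-8b-beta-1608"  # assumed id -- verify before use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16,
                                             device_map="auto")

messages = [{"role": "user", "content": "Summarise the benefits of multilingual LLMs."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```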
Flux.1 + LoRA Tutorial: The duo that could replace Midjourney (Prompting guide included) | by
I made my first Flux Lora style on Civitai : r/StableDiffusion
New Makima flux lora | image created by WhiteZ | Tensor.Art

flux_makima, woman, collared shirt, white shirt, black necktie, black pants, red hair, single braid, in the office holding a sign with the text: "tensor art", smiling evilly, pixiv
[FLUX LORA INCLUDED] Art Nouveau
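
If you want to reproduce this kind of image locally, Flux LoRAs load into diffusers with the standard LoRA API; a minimal sketch (the LoRA repo id and weight filename below are placeholders for whichever LoRA you downloaded, and FLUX.1-dev is gated and needs a lot of VRAM or CPU offload):

```python
# Minimal sketch: run Flux.1 with a style LoRA in diffusers.
# The LoRA repo id / weight file below are placeholders -- substitute the LoRA
# you downloaded (e.g. from Civitai or Tensor.Art).
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev",
                                    torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # helps if the model does not fit in VRAM

# Placeholder LoRA -- point this at the actual LoRA repo/file you want to use.
pipe.load_lora_weights("your-username/your-flux-lora", weight_name="lora.safetensors")

prompt = ('flux_makima, woman, collared shirt, white shirt, black necktie, '
          'black pants, red hair, single braid, in the office holding a sign '
          'with the text: "tensor art", smiling evilly, pixiv')

image = pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("makima_flux_lora.png")
```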