Share and discover more about AI with social posts from the community. huggingface/OpenAi
Almost ready: search for a Hugging Face dataset on the Hub from information in the datasets viewer preview!

Soon, you can find deep-cut datasets even if they don't have a full dataset card (you should still document your datasets!)

You can help improve this project by rating synthetic user search queries for hub datasets.

If you have a Hub login, you can start annotating in Argilla in < 5 seconds here: https://davanstrien-my-argilla.hf.space/dataset/1100a091-7f3f-4a6e-ad51-4e859abab58f/annotation-mode

I need to do some tidying, but I'll share all the code and in-progress datasets for this soon!
Reflection Llama 3.1 70B (Correct Weights) on ZeroGPU thanks to llama.cpp and unsloth (for quantization)

- ZeroGPU space
gokaygokay/Reflection-70B-llamacpp https://huggingface.co/spaces/gokaygokay/Reflection-70B-llamacpp


- Working Model
mattshumer/ref_70_e3


- Quantized Models
unsloth/Reflection-Llama-3.1-70B-GGUF
Interested in learning about everything Image?

With the recent rise of interest in Vision Language Models (VLMs), we decided to make a push to include an ImageField within Argilla! This means any open source developer can now work on better models for vision ML tasks too, and we would like to show you how.

We would love to introduce this new feature to you, so we've prepared a set of notebooks to go over some common image scenarios:
- fine-tune a CLIP retrieval model with sentence transformers
- use ColPali + Qwen VL for RAG and log the results to Argilla
- image-generation preference: creating multi-modal preference datasets for free using Hugging Face inference endpoints

See you on Thursday!

https://lu.ma/x7id1jqu Everything image: from fine-tuning CLIP models to synthetic image datasets · Luma
The New York Times did a fun quiz to test your ability to detect whether a video is AI-generated or real. They put Runway, Kling, and Sora to the test.

I got 10/10 🤓 —how about you?

https://www.nytimes.com/interactive/2024/09/09/technology/ai-video-deepfake-runway-kling-quiz.html

A.I. Can Now Create Lifelike Videos. Can You Tell What’s Real?
⚖️ 𝐀𝐈 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐢𝐬 𝐂𝐨𝐩𝐲𝐫𝐢𝐠𝐡𝐭 𝐈𝐧𝐟𝐫𝐢𝐧𝐠𝐞𝐦𝐞𝐧𝐭

This bold claim is not my opinion. It was made in a recent "report" by a group whose stance is recognizable from its name, which roughly translates as "Authors' Rights Initiative". According to the LinkedIn post below, the report was also presented before the EU Parliament.

I am not really interested in politics, but as an EU citizen I am of course somewhat interested in a reasonable and practical version of the EU AI Act. Not saying there should not be rules around data and AI, but this report is obviously very biased towards one side.

While I think the report itself does not deserve attention, I am posting it in the hope that you will find more examples where the authors did not address the issue adequately. Feel free to add them to my LinkedIn post (where the original authors will see them) or here.

[en] Executive summary: https://urheber.info/media/pages/diskurs/ai-training-is-copyright-infringement/3b900058e6-1725460935/executive-summary_engl_final_29-08-2024.pdf
[de] Full report: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4946214

LinkedIn: https://www.linkedin.com/posts/activity-7238912869268959232-6cFx
Remember when @Google launched MediaPipe in an effort to create efficient on-device pipelines?

They've just unlocked the ability to run 7B+ parameter language models directly in your browser. This is a game-changer for on-device AI!

Yes, they are streaming 8.6 GB model files!

Currently, they have Gemma 2B/7B running, but imagine Dynamic LoRA, multimodal support, quantization, and never having to leave Chrome!

This is a significant technical advancement, especially in Memory Optimization:

- Redesigned the model-loading code to work around WebAssembly's 4 GB memory limit.
- Implemented asynchronous loading of transformer stack layers (28 for Gemma 1.1 7B).
- Reduced peak WebAssembly memory usage to less than 1% of previous requirements.
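To make the memory point above concrete, here is a minimal sketch of why loading layer by layer keeps peak host memory low. It is not MediaPipe's code: the function names, the `upload` callback and the toy layer sizes are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List


def stream_layers(layer_sizes: List[int]) -> Iterator[bytearray]:
    """Yield one layer's weights at a time (stand-in for fetching a file slice)."""
    for size in layer_sizes:
        yield bytearray(size)


def load_streaming(layer_sizes: List[int], upload: Callable[[bytearray], None]) -> int:
    """Hand layers to `upload` one by one; return the peak bytes held at once."""
    peak = 0
    for buf in stream_layers(layer_sizes):
        peak = max(peak, len(buf))
        upload(buf)  # hypothetical hand-off to GPU memory
        del buf      # drop the host-side copy before fetching the next layer
    return peak


def load_all_at_once(layer_sizes: List[int]) -> int:
    """Peak bytes if the whole file is resident before anything is uploaded."""
    return sum(layer_sizes)


# Toy numbers: 28 equally sized layers of 1 KiB each (not real Gemma sizes).
sizes = [1024] * 28
uploaded: List[int] = []
peak_streaming = load_streaming(sizes, lambda b: uploaded.append(len(b)))
peak_naive = load_all_at_once(sizes)
print(f"streaming peak: {peak_streaming} B, all-at-once peak: {peak_naive} B")
```

In this toy setup the streaming peak is one layer rather than the whole file, which is the general idea behind staying under a fixed memory ceiling.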

Cross-Platform Compatibility
- Compiled the C++ codebase to WebAssembly for broad browser support.
- Utilized the WebGPU API for native GPU acceleration in browsers.

Here's why this matters:

1. Privacy: No need to send data to remote servers.
2. Cost-Efficiency: Eliminates server expenses.
3. Offline Capabilities: Use powerful AI without an internet connection.

Blog: https://research.google/blog/unlocking-7b-language-models-in-your-browser-a-deep-dive-with-google-ai-edges-mediapipe/
The Hugging Face Semantic Dataset Search Space is back in action! You can find similar datasets by ID or perform a semantic search of dataset cards.

Give it a try:
librarian-bots/huggingface-datasets-semantic-search
https://huggingface.co/spaces/librarian-bots/huggingface-datasets-semantic-search Semantic Dataset Search - a Hugging Face Space by librarian-bots
Ultimate FLUX LoRA Training Tutorial: Windows and Cloud Deployment

I have done a total of 104 different LoRA trainings and compared them all to find the best hyperparameters and workflow for FLUX LoRA training with the Kohya GUI training script.

You can see the checkpoint names and repo links of all completed experiments in the following public post: https://www.patreon.com/posts/110838414

After completing all these FLUX LoRA trainings with Adafactor, the most VRAM-efficient and performant optimizer, I came up with the following ranked, ready-to-use configurations.

You can download all the configurations, all research data, installers and instructions at the following link: https://www.patreon.com/posts/110879657


Tutorials
I have also prepared two full tutorials. The first covers how to train and use the best FLUX LoRA locally on your Windows computer: https://youtu.be/nySGu12Y05k

This is the main tutorial; watch it without skipping to learn everything. It has 74 chapters in total and manually written English captions. It is a perfect resource for going from zero to hero in FLUX LoRA training.

The second tutorial covers how to train FLUX LoRA in the cloud. It is important for several reasons. If you don’t have a powerful GPU, you can rent a very powerful and very cheap one on Massed Compute or RunPod. I prefer Massed Compute since it is faster and cheaper with our special coupon SECourses. The video also shows in full detail how to train on a multi-GPU setup to scale your training speed, and how to upload your checkpoints and files very quickly to Hugging Face for free storage and transfer. Watch the Windows tutorial above first to be able to follow the cloud tutorial: https://youtu.be/-uhL2nW7Ddw

SUPIR was used for upscaling: https://youtu.be/OYxVEvDf284
NEW RELEASE!

- MOTH is a generalist chat model, using high quality synthetic data to improve general performance.
- Currently available for Llama 3.1 and Gemma 2, more models to follow in the future.

get the models:
sequelbox/Llama3.1-8B-MOTH https://huggingface.co/sequelbox/Llama3.1-8B-MOTH

sequelbox/gemma-2-9B-MOTH https://huggingface.co/sequelbox/gemma-2-9B-MOTH


get the dataset:
sequelbox/Supernova


<3 for everyone to use <3
The world’s first multilingual ColBERT: Jina ColBERT V2 and its “Russian Doll” technology
In the field of RAG, the multi-vector model ColBERT improves retrieval accuracy by generating independent vectors for each token of the document. But it also brings a sharp increase in storage requirements, and it only supports English, which limits its application scope. To solve these problems, we improved the architecture and training process of ColBERT, making breakthroughs in multi-language processing in particular. The latest Jina-ColBERT-v2 supports 89 languages and introduces custom output dimension options, significantly reducing storage requirements and improving the efficiency and accuracy of multi-language retrieval.

The core highlights of the new version:
- Performance enhancements: compared with the original ColBERT-v2, English retrieval performance has improved by 6.5%; compared with the previous generation jina-colbert-v1-en, it has improved by 5.4%.
- Multi-language support: the new version supports up to 89 languages, covering Arabic, Chinese, English, Japanese, Russian and other languages, and also supports programming languages.
- Customizable output dimensions: the new version adopts "Russian doll" representation learning (Matryoshka Representation Learning, MRL) and provides 128-, 96- and 64-dimensional output vector options, allowing users to choose the appropriate dimensions according to actual needs.

The full technical report can be found on arXiv: https://arxiv.org/abs/2408.16672
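For readers new to multi-vector retrieval, here is a rough sketch of the two ideas in play: ColBERT-style late interaction scoring (each query token is matched to its best document token, and the matches are summed) and how the per-token vector size drives index storage. The toy vectors, the function names and the 2-bytes-per-value figure are assumptions for illustration, not details of Jina-ColBERT-v2.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late interaction: sum over query tokens of the best-matching doc token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def index_bytes(num_tokens: int, dim: int, bytes_per_value: int = 2) -> int:
    """Raw storage for one vector per token (2 bytes per value is an assumption)."""
    return num_tokens * dim * bytes_per_value


# Toy 2-dimensional token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]  # covers only the first query token

print(maxsim_score(query, doc_a), maxsim_score(query, doc_b))

# Storage for one million document tokens at the three output sizes.
for dim in (128, 96, 64):
    print(dim, index_bytes(1_000_000, dim))
```

Because one vector is stored per token, halving the output dimension halves the raw index size, which is why selectable dimensions matter for this kind of model.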
SemanticFinder now supports WebGPU thanks to @Xenova's efforts with transformers.js v3!
Expect massive performance gains. I ran inference on a whole book with 46k chunks in under 5 minutes. If your device doesn't support #WebGPU, use the classic Wasm-based version:
- WebGPU: https://do-me.github.io/SemanticFinder/webgpu/
- Wasm: https://do-me.github.io/SemanticFinder/

WebGPU harnesses the full power of your hardware, no longer being restricted to just the CPU. The speedup is significant (4-60x) for all kinds of devices: consumer-grade laptops, heavy Nvidia GPU setups or Apple Silicon. Measure the difference for your device here:
Xenova/webgpu-embedding-benchmark

Chrome currently works out of the box, Firefox requires some tweaking.

WebGPU + transformers.js makes it possible to build amazing applications and make them accessible to everyone. For example, SemanticFinder could become a simple GUI for populating your (vector) DB of choice. See the pre-indexed community texts here:
do-me/SemanticFinder

Happy to hear your ideas!
This is an absolutely mind-boggling experiment!

@GuangyuRobert (Twitter Handle) from MIT has created Project Sid, which simulates over 1,000 autonomous AI agents collaborating in a Minecraft environment, operating for extended periods without human intervention. This simulation demonstrates unprecedented levels of agent interaction, decision-making, and societal development.

Agents operate independently for hours or days, showcasing advanced decision-making algorithms and goal-oriented behavior.

The simulation produced complex, emergent phenomena, including:
- Economic systems with currency (gems) and trading
- Cultural development and religious practices
- Agents even understood bribing. Priests were moving the most gems to bribe people into following them!
- Governmental structures and democratic processes

Project Sid addresses fundamental challenges in AI research:
- Coherence: Maintaining consistent agent behavior over extended periods.
- Multi-agent Collaboration: Enabling effective communication and coordination among numerous AI entities.
- Long-term Progression: Developing agents capable of learning and evolving over time.

While Minecraft serves as the initial testbed, the underlying AI architecture is designed to be game-agnostic, suggesting potential applications in various digital environments and real-world simulations.

Imagine a policy being debated by the government and how it might affect society; Sid can simulate its impact!

Even if this remains just a game experiment, the project successfully manages 1,000+ agents simultaneously, a feat that requires robust distributed computing and efficient agent architecture.

🌐 Introducing Edupres.ru Presentations Dataset -
nyuuzyou/edupres


Dataset highlights:
- Metadata for 44,210 presentations from edupres.ru
- 21,941 presentations available in original format
- Multilingual content: Primarily Russian, with some Ukrainian, Belarusian, and English
- Each entry includes: URL, title, description, author, publication date, file size, and download link
- Data reflects educational presentations accessible through the Edupres.ru platform
- Licensed under Creative Commons Zero (CC0) for unrestricted use

This dataset offers a unique window into online educational resources, particularly in Russian-language contexts. It provides opportunities for analyzing presentation trends, topic distributions, and language patterns in educational materials. The dataset is particularly well-suited for tasks such as text classification and text retrieval in multilingual educational settings.
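As a rough illustration of the per-entry structure listed above, here is one way a record could be modeled and a quick check of how many entries come with the original file. The field names are hypothetical (check the dataset card for the real column names), and treating a missing download link as "no original file" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Presentation:
    # Hypothetical field names mirroring the list in the post.
    url: str
    title: str
    description: str
    author: str
    publication_date: str
    file_size: int
    download_link: Optional[str]  # assumed empty when the original file is unavailable


def share_with_file(items: List[Presentation]) -> float:
    """Fraction of entries that have a download link."""
    if not items:
        return 0.0
    return sum(1 for p in items if p.download_link) / len(items)


# Two made-up records, one with and one without a file.
sample = [
    Presentation("https://example.org/1", "Title 1", "", "Author A", "2020-01-01", 1024, "https://example.org/1.pptx"),
    Presentation("https://example.org/2", "Title 2", "", "Author B", "2021-05-10", 0, None),
]
print(share_with_file(sample))

# The counts from the post: 21,941 of 44,210 presentations in original format.
print(round(21_941 / 44_210, 3))
```

Going by the numbers in the post, roughly half of the metadata entries come with the original presentation file.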
One application of LegalKit is the production of knowledge graphs; here is a demo Space 🔗

With the update of the French legal code data model uploaded to 🤗 and the introduction of a column dedicated to HTML text, it's now easy to extract links between different articles and produce complex graphs with just a few lines of Python.

This simplified demo highlights the ease of implementation and the creative potential, and it enables the generation of complete datasets, although displaying them requires a powerful graphics card. The framework currently used is D3.js, but other solutions may be possible. I'd be delighted to hear your suggestions, and look forward to hearing from the community.
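The link-extraction step mentioned above could look roughly like this. It is a sketch using only the Python standard library, not the code behind the Space: the article ids, the toy HTML and the assumption that the last path segment of an `href` identifies the target article are all made up for illustration.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_edges(articles: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (source, target) pairs for links between known articles."""
    edges = []
    for source_id, html in articles.items():
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Assumption: the last path segment is the target article id.
            target_id = href.rstrip("/").rsplit("/", 1)[-1]
            if target_id in articles and target_id != source_id:
                edges.append((source_id, target_id))
    return edges


# Toy data standing in for the HTML text column (ids and URLs are made up).
articles = {
    "A1": '<p>See <a href="/articles/A2">article 2</a>.</p>',
    "A2": '<p>Refers to <a href="/articles/A1">A1</a> and <a href="https://example.org/x">an external page</a>.</p>',
    "A3": "<p>No links here.</p>",
}
edges = build_edges(articles)
print(edges)
```

The resulting edge list can be handed to whichever graph or visualization library you prefer; links to pages outside the dataset are simply dropped here.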

Link to the 🤗 Space:
louisbrulenaudet/legalkit-knowledge-graph
I'm excited to share my article introducing AISAK's new flagship model, AISAK-O (Artificially Intelligent Swiss Army Knife OPTIMUM). You can read the full details here:

https://huggingface.co/blog/mandelakori/aisak-o

Key highlights of AISAK-O include:

- 8 billion parameters and a 32k token context length
- Multimodal capabilities for processing both text and visual data
- Impressive benchmark scores, surpassing GPT-4V in some areas
- Specialized in tasks like image captioning, visual reasoning, and cohesive content generation
- Efficient architecture competing with larger models

We're also offering a unique beta testing opportunity with access to inference code.

For more information or partnership inquiries, please contact us at [email protected].

I hope you find this advancement in multimodal AI as exciting as we do!
aisak-ai/O Introducing AISAK-O
FLUX Gif Generator
Create GIFs with Flux-dev. Based on @fofr's tweet.

For better results include a description of the motion in your prompt
Reflection Llama-3.1 70B
IMPORTANT UPDATE: There was an issue with the model when we first uploaded it. If you tried it and didn't get good results, please try again. We think we've fixed the issue.

Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches an LLM to detect mistakes in its reasoning and correct course.

The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them.

You can try the model here.

Benchmarks
🌟 Argilla v2.1.0 goes multi-modal: Image Field, Dark Mode, Enhanced Hugging Face Hub imports and more!

🖼 Image Field: Seamlessly work with multimodal datasets
🌓 Dark Mode: Reduce eye strain with our sleek new look
🤗 Enhanced Hugging Face Hub import with the SDK
🇪🇸 Spanish UI: Breaking language barriers

Plus more improvements to supercharge your model curation workflow!

Check out the full announcement for details and code examples: https://github.com/argilla-io/argilla/compare/v2.0.1...v2.1.0