HF-hub - Share and discover more about AI with social posts from the community.
This is an absolutely mind-boggling experiment!

@GuangyuRobert (on Twitter) from MIT has created Project Sid, which simulates over 1,000 autonomous AI agents collaborating in a Minecraft environment, operating for extended periods without human intervention. This simulation demonstrates unprecedented levels of agent interaction, decision-making, and societal development.

Agents operate independently for hours or days, showcasing advanced decision-making algorithms and goal-oriented behavior.

The simulation produced complex, emergent phenomena, including:
- Economic systems with currency (gems) and trading
- Cultural development and religious practices
- Bribery: priests were moving the most gems, using them to bribe other agents into following them!
- Governmental structures and democratic processes

Project Sid addresses fundamental challenges in AI research:
- Coherence: Maintaining consistent agent behavior over extended periods.
- Multi-agent Collaboration: Enabling effective communication and coordination among numerous AI entities.
- Long-term Progression: Developing agents capable of learning and evolving over time.

While Minecraft serves as the initial testbed, the underlying AI architecture is designed to be game-agnostic, suggesting potential applications in various digital environments and real-world simulations.

Imagine a policy being debated by the government and how it might affect society; Sid can simulate its impact!

Even if this remains just a game experiment, the project successfully manages 1,000+ agents simultaneously, a feat that requires robust distributed computing and efficient agent architecture.

🌐 Introducing Edupres.ru Presentations Dataset -
nyuuzyou/edupres


Dataset highlights:
- Metadata for 44,210 presentations from edupres.ru
- 21,941 presentations available in original format
- Multilingual content: Primarily Russian, with some Ukrainian, Belarusian, and English
- Each entry includes: URL, title, description, author, publication date, file size, and download link
- Data reflects educational presentations accessible through the Edupres.ru platform
- Licensed under Creative Commons Zero (CC0) for unrestricted use

This dataset offers a unique window into online educational resources, particularly in Russian-language contexts. It provides opportunities for analyzing presentation trends, topic distributions, and language patterns in educational materials. The dataset is particularly well-suited for tasks such as text classification and text retrieval in multilingual educational settings.
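A quick way to start exploring the metadata is with the 🤗 datasets library. This is a minimal sketch; the split and column names are assumptions, so check the dataset card for the exact schema:

from datasets import load_dataset

# Load the presentation metadata (split name assumed; see the dataset card)
ds = load_dataset("nyuuzyou/edupres", split="train")

print(ds)      # number of rows and column names
print(ds[0])   # one record: URL, title, description, author, date, file size, download link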
An example of the application of LegalKit is the production of knowledge graphs; here is a demo Space 🔗

With the update of the French legal code data model uploaded to 🤗 and the introduction of a column dedicated to HTML text, it's now easy to extract links between different articles and produce complex graphs with just a few lines of Python.

This simplified demo highlights the ease of implementation and the creative potential, and it can generate complete datasets, although displaying them requires a powerful graphics card. The framework used for now is D3.js, but other solutions are certainly possible. I'd be delighted to hear your suggestions and look forward to feedback from the community.
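To illustrate the idea (this is not the code behind the Space), here is a rough sketch of extracting inter-article links from the HTML column and building a graph. The repository id and the column names ("title", "html") are assumptions, so adapt them to the LegalKit dataset you actually load:

from datasets import load_dataset
from bs4 import BeautifulSoup
import networkx as nx

# Hypothetical LegalKit dataset and column names -- adapt to the real repo/schema
ds = load_dataset("louisbrulenaudet/code-civil", split="train")

graph = nx.DiGraph()
for row in ds:
    source = row["title"]                              # assumed column holding the article identifier
    soup = BeautifulSoup(row["html"], "html.parser")   # assumed HTML text column
    for link in soup.find_all("a", href=True):
        graph.add_edge(source, link["href"])           # one edge per reference to another article

print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
# nx.write_gexf(graph, "legal_graph.gexf")             # export for visualization (D3.js, Gephi, ...)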

Link to the 🤗 Space:
louisbrulenaudet/legalkit-knowledge-graph
I'm excited to share my article introducing AISAK's new flagship model, AISAK-O (Artificially Intelligent Swiss Army Knife OPTIMUM). You can read the full details here:

https://huggingface.co/blog/mandelakori/aisak-o

Key highlights of AISAK-O include:

- 8 billion parameters and a 32k token context length
- Multimodal capabilities for processing both text and visual data
- Impressive benchmark scores, surpassing GPT-4V in some areas
- Specialized in tasks like image captioning, visual reasoning, and cohesive content generation
- Efficient architecture competing with larger models

We're also offering a unique beta testing opportunity with access to inference code.

For more information or partnership inquiries, please contact us at [email protected].

I hope you find this advancement in multimodal AI as exciting as we do!
aisak-ai/O
Reflection Llama 3.1 70B (Correct Weights) on ZeroGPU thanks to llama.cpp and unsloth (for quantization)

- ZeroGPU Space
gokaygokay/Reflection-70B-llamacpp https://huggingface.co/spaces/gokaygokay/Reflection-70B-llamacpp


- Working Model
mattshumer/ref_70_e3


- Quantized Models
unsloth/Reflection-Llama-3.1-70B-GGUF
FLUX Gif Generator
Create GIFs with Flux-dev. Based on @fofr's tweet.

For better results, include a description of the motion in your prompt.
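The Space's exact implementation isn't shown here, but as I understand @fofr's trick, you ask FLUX for a grid of sequential frames in a single image and then slice it into a GIF. A rough sketch with diffusers and PIL, under those assumptions:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

# Ask for a grid of sequential frames in one image and describe the motion explicitly
prompt = "a 2x2 grid of sequential frames of a cat jumping onto a table, smooth motion between frames"
image = pipe(prompt, height=1024, width=1024, num_inference_steps=28, guidance_scale=3.5).images[0]

# Slice the 2x2 grid into 4 frames and assemble them into a GIF
w, h = image.size
frames = [image.crop((x, y, x + w // 2, y + h // 2)) for y in (0, h // 2) for x in (0, w // 2)]
frames[0].save("flux.gif", save_all=True, append_images=frames[1:], duration=250, loop=0)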
Reflection Llama-3.1 70B
IMPORTANT UPDATE – There was an issue with the model when we first uploaded it. If you tried it and didn't get good results, please try again; we think we've fixed the issue.

Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches an LLM to detect mistakes in its reasoning and correct course.

The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them.

You can try the model here.

🌟 Argilla v2.1.0 goes multi-modal: Image Field, Dark Mode, Enhanced Hugging Face Hub imports and more!

🖼 Image Field: Seamlessly work with multimodal datasets
🌓 Dark Mode: Reduce eye strain with our sleek new look
🤗 Enhanced Hugging Face Hub import with the SDK
🇪🇸 Spanish UI: Breaking language barriers

Plus more improvements to supercharge your model curation workflow!

Check out the full announcement for details and code examples: https://github.com/argilla-io/argilla/compare/v2.0.1...v2.1.0
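For the enhanced Hub import mentioned above, the SDK call looks roughly like this (a sketch based on the 2.1 release notes; the server URL, API key, and repo id are placeholders, and argument names should be double-checked against the Argilla docs):

import argilla as rg

# Connect to your Argilla server (URL and key are placeholders)
client = rg.Argilla(api_url="https://your-argilla-server", api_key="your-api-key")

# Import a dataset straight from the Hugging Face Hub into Argilla
dataset = rg.Dataset.from_hub(repo_id="your-org/your-annotation-dataset")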
Wanted to train a FLUX model using out-of-copyright images, so I curated concept art images from NASA.

Model: https://huggingface.co/davanstrien/nasa_concept_art-flux-lora
Dataset:
davanstrien/nasa_concept_art


So far, training was done without captions, but I'm experimenting with using VLMs to generate captions to see if that improves the model.
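A minimal sketch of trying the LoRA with diffusers; the repo id is taken from the post's model link, and since training was captionless I'm assuming no trigger word is needed:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("davanstrien/nasa_concept_art-flux-lora")  # LoRA trained on the NASA concept art dataset

# No trigger word was mentioned, so a plain descriptive prompt is assumed here
image = pipe("retro NASA concept art of a lunar habitat", num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("nasa_concept_art.png")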
💾🧠How much VRAM will you need for training your AI model? 💾🧠
Check out this app where you convert:
PyTorch/TensorFlow summary -> required VRAM
or
Parameter count -> required VRAM
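The app's exact formula isn't reproduced here, but a common rule-of-thumb conversion from parameter count to VRAM looks roughly like this (a rough sketch; real usage also depends on batch size, sequence length, and activation memory):

def estimate_vram_gb(num_params: float, training: bool = False, bytes_per_param: float = 2) -> float:
    """Very rough VRAM estimate from a parameter count.

    bytes_per_param: 4 for fp32, 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit.
    Training with Adam in mixed precision needs roughly weights + gradients
    + optimizer states (~16 bytes/param); inference needs mostly the weights.
    """
    bytes_total = num_params * (16 if training else bytes_per_param)
    return bytes_total * 1.2 / 1e9  # ~20% overhead for activations, buffers, CUDA context

print(f"7B model, bf16 inference: ~{estimate_vram_gb(7e9):.0f} GB")
print(f"7B model, mixed-precision training: ~{estimate_vram_gb(7e9, training=True):.0f} GB")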

Use it in: http://howmuchvram.com

And everything is open source! Ask for new functionalities or contribute at:
https://github.com/AlexBodner/How_Much_VRAM
If it's useful to you, leave a star 🌟 and share it with someone who will find the tool useful!
More discussion in: https://x.com/AlexBodner_/status/1832054850294812679
Yesterday @mattshumer released
mattshumer/Reflection-Llama-3.1-70B
, an impressive model that achieved incredible results in benchmarks like MMLU. The model was fine-tuned using Reflection-Tuning and the dataset used wasn't released, but I created a small recipe with distilabel that allows generating a dataset with a similar output format:

1. We use MagPie 🐦 in combination with
meta-llama/Meta-Llama-3.1-70B-Instruct
to generate reasoning instructions.
2. We generate a response again using
meta-llama/Meta-Llama-3.1-70B-Instruct
, but we steer the LLM to generate a specific output format using a custom system prompt. In the system prompt, we instruct the LLM that it will first have to think 💭 and produce reflections that will help resolve ambiguities. After that, we instruct the LLM to generate an output based on the previous thinking
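To make step 2 concrete, here is a rough standalone sketch of that kind of steering with a custom system prompt. It uses huggingface_hub's InferenceClient rather than the actual distilabel pipeline, and the prompt wording is my own; see reflection.py in the dataset repo for the real recipe:

from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Meta-Llama-3.1-70B-Instruct")

# System prompt asking for thinking + reflection before the final answer (wording is illustrative)
system_prompt = (
    "First reason about the problem inside <thinking> tags, using <reflection> tags whenever "
    "you need to double-check or correct a step. Then give the final answer inside <output> tags."
)

response = client.chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)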

In this dataset
gabrielmbmb/distilabel-reflection-tuning
you can find 5 rows that I generated with this recipe. You can also find the code of the pipeline in the file called reflection.py.
FLUX Prompt Generator Updates

-
gokaygokay/FLUX-Prompt-Generator


- There are now hundreds of new selections across diverse categories, each offering a lot of choices:

Architecture, Art, Artist, Brands, Character, Cinematic, Fashion, Feelings, Geography, Human, Interaction, Keywords, Objects, People, Photography, Plots, Poses, Scene, Science, Stuff, Time, Typography, Vehicle, Video Game

- In addition to Hugging Face, I've integrated new LLM providers: Groq, OpenAI, and Claude.

- Upgraded Vision Language Models (VLMs): We now feature Qwen2-VL and Florence-2-large.

- New specialized system prompts for various styles and themes, including Happy, Simple, Poster, Only Objects, No Figure, Landscape, Fantasy.

https://cdn-uploads.huggingface.co/production/uploads/630899601dd1e3075d975785/u_IZ43q0247UaH2_LK07W.png
Reposting from twitter:

Just so you all know, I'll be on vacation for the following two weeks and away from home! I'm hoping to get on at least once a day to load up some quants, but I won't be as bleeding edge and on the ball :) feel free to shoot me a message if you see one I should make!

In the meantime if you need something bleeding edge make sure to check out @MaziyarPanahi or @bullerwins who both put out great work!
Flux actually has deforum (a "classical" method of generating videos with a text-to-image model)!? It feels like a renaissance, and I'm dreaming back to 2022 🥹 (By the way, the flux ecosystem is developing really fast! 🤔)

GitHub - XLabs-AI/deforum-x-flux: Deforum based on flux-dev by XLabs-AI

🧐 Deforum-x-flux is a project based on flux-dev, mainly used for high-quality animation and video generation, especially in combination with Stable Diffusion techniques for image-to-video conversion. It provides two running modes, CLI and Jupyter Notebook, and supports complex 3D animation modes and interpolation functions.

➡️ Link: https://github.com/XLabs-AI/deforum-x-flux
Did you see the new coding model from @01-ai?

collection :
01-ai/yi-coder-66bdb00f5bdd611f9a008f30

demo :
Tonic/Yi-Coder-9B https://huggingface.co/spaces/Tonic/Yi-Coder-9B


achieves SOTA on benchmarks, 125K context window, 55 languages including Docker, JS and many more 🚀
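Not the Space's code, but a minimal transformers sketch for trying the chat variant locally (model id and the standard chat-template workflow assumed):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-Coder-9B-Chat"  # chat-tuned variant; the Space may use different settings
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Write a Dockerfile for a small FastAPI app."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))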
🌐 Introducing PPT Online Dataset -
nyuuzyou/pptonline


Dataset highlights:
- Metadata for 1,418,349 PowerPoint (.ppt) files from ppt-online.org
- Multilingual content: Russian, Ukrainian, Belarusian, Kazakh, English, and others
- Each entry includes: Unique ID, title, category, download link, file size, and content snippet
- Data reflects presentations accessible through the PPT Online platform
- Licensed under Creative Commons Zero (CC0) for unrestricted use

This dataset offers a unique window into online educational resources, particularly in Eastern European and Central Asian contexts. It provides opportunities for analyzing presentation trends, topic distributions, and language patterns in educational materials.
🚀 𝗪𝗵𝗲𝗿𝗲 𝘀𝗰𝗮𝗹𝗶𝗻𝗴 𝗹𝗮𝘄𝘀 𝗮𝗿𝗲 𝘁𝗮𝗸𝗶𝗻𝗴 𝘂𝘀 : 𝗯𝘆 𝟮𝟬𝟮𝟴, 𝗔𝗜 𝗖𝗹𝘂𝘀𝘁𝗲𝗿𝘀 𝘄𝗶𝗹𝗹 𝗿𝗲𝗮𝗰𝗵 𝘁𝗵𝗲 𝗽𝗼𝘄𝗲𝗿 𝗰𝗼𝗻𝘀𝘂𝗺𝗽𝘁𝗶𝗼𝗻 𝗼𝗳 𝗲𝗻𝘁𝗶𝗿𝗲 𝗰𝗼𝘂𝗻𝘁𝗿𝗶𝗲𝘀

Reminder: "scaling laws" are empirical laws saying that if you keep multiplying your compute by x10, your models will mechanically keep getting better and better.

To give you an idea, GPT-3 can barely write sentences, while GPT-4, trained with only about x15 its compute, already sounds much smarter than some of my friends (although it's not really - or at least I haven't tested them side-by-side). So you can imagine how far a x100 over GPT-4 can take us.

🏎 As a result, tech titans are racing to build the biggest models, and for this they need gigantic training clusters.

The picture below shows the growth of training compute: it is increasing at a steady exponential rate of x10 every 2 years. So let's take this progression a bit further:
- 2022: training starts for GPT-4: 10^26 FLOPs, cost of $100M
- 2024: today, companies start training on much larger clusters like the "super AI cluster" of Elon Musk's xAI: 10^27 FLOPs, $1B
- 2026: by then, clusters will require 1 GW, i.e. around the full power generated by a nuclear reactor
- 2028: we reach cluster prices in the $100 billion range, using 10 GW, more than the most powerful power stations currently in use in the US. This last size seems crazy, but Microsoft and OpenAI are already planning one.
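To see how the numbers above follow from that x10-every-2-years trend, here is a tiny extrapolation sketch (the 2022 baselines and the assumption that cost and power scale roughly with compute are simplifications):

# Extrapolation: compute (and, roughly, cost and power) multiplied by 10 every 2 years
for year in (2022, 2024, 2026, 2028):
    f = 10 ** ((year - 2022) / 2)
    flop = 1e26 * f          # GPT-4-scale run as the 2022 baseline (per the post)
    cost_musd = 100 * f      # $100M baseline in 2022
    power_gw = 0.01 * f      # assumed ~10 MW baseline cluster in 2022 (illustrative)
    print(f"{year}: ~{flop:.0e} FLOP, ~${cost_musd:,.0f}M, ~{power_gw:g} GW")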

Will AI clusters actually reach these crazy sizes, where they consume as much as entire countries?
➡️ Three key ingredients of training might be roadblocks to scaling up:
💸 Money: but it's very unlikely, given the potential market size for AGI, that investors will lose interest.
⚡️ Energy supply at a specific location
📚 Training data: we're already using 15 trillion tokens for Llama-3.1, when the Internet has something like 60 trillion.

🤔 I’d be curious to hear your thoughts: do you think we’ll race all the way there?
How do I access Llama 3.1 70B in my Space?

This doesn't seem to work; can someone help me with working code?


from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3.1-70B", revision="main")
config.rope_scaling = {"type": "llama3", "factor": 8.0}

# AutoModelForCausalLM must be imported too; `use_auth_token` is deprecated in favor of `token`
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-70B", config=config, token=True)
I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
-
vidore/colpali
for retrieval 📖 it doesn't need indexing with image-text pairs but just images!
-
Qwen/Qwen2-VL-2B-Instruct
for generation 💬 directly feed images as-is to a vision language model, with no conversion to text!
I used the ColPali implementation from the new 🐭 Byaldi library by @bclavie 🤗
https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
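A rough sketch of the retrieval side with Byaldi (based on my reading of its README, so check it for exact arguments); the retrieved page image is then passed to Qwen2-VL as in the notebook:

from byaldi import RAGMultiModalModel

# Load ColPali through Byaldi and index a folder of PDFs/images directly -- no text extraction needed
retriever = RAGMultiModalModel.from_pretrained("vidore/colpali")
retriever.index(input_path="docs/", index_name="my_docs", store_collection_with_index=True, overwrite=True)

# Retrieve the most relevant page for a question
results = retriever.search("What does the revenue chart show for Q2?", k=1)
print(results[0])

# The matching page image is then fed straight to Qwen/Qwen2-VL-2B-Instruct for answer generation.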