Share and discover more about AI with social posts from the community.
Build an agent with tool-calling superpowers 🦸 using Transformers Agents
Authored by: Aymeric Roucher
This notebook demonstrates how you can use Transformers Agents to build awesome agents!
What are agents? Agents are systems that are powered by an LLM and enable the LLM (with careful prompting and output parsing) to use specific tools to solve problems.
These tools are basically functions that the LLM couldn’t perform well by itself: for instance for a text-generation LLM like Llama-3-70B, this could be an image generation tool, a web search tool, a calculator…
What is Transformers Agents? It's an extension of our transformers library that provides building blocks to build your own agents! Learn more about it in the documentation: https://huggingface.co/learn/cookbook/agents
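The loop that Transformers Agents automates can be sketched in a few lines. Everything below is an illustrative toy, not the real library API: the `fake_llm` stands in for a real model, and the tool registry is a plain dict.

```python
# Minimal sketch of the agent loop: the LLM picks a tool by name,
# we parse its reply, call the tool, and return the observation.
import json

def calculator(expression: str) -> str:
    """A tool for something text LLMs do poorly: exact arithmetic."""
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM: always answers with a tool call."""
    return json.dumps({"tool": "calculator", "input": "6 * 7"})

def run_agent(task: str) -> str:
    # 1. Careful prompting: tell the model which tools exist.
    prompt = f"Task: {task}\nTools: {list(TOOLS)}\nReply with JSON."
    # 2. Output parsing: extract the tool call from the LLM reply.
    call = json.loads(fake_llm(prompt))
    # 3. Execute the chosen tool and return its observation.
    return TOOLS[call["tool"]](call["input"])

print(run_agent("What is 6 times 7?"))  # -> 42
```

In the real library the prompting, parsing, and tool execution are all handled for you; this just shows the shape of the loop.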
RT-DETR - probably the best combination of speed, accuracy, and license for real-time object detection.
I just released a blog post tutorial on fine-tuning RT-DETR on a custom dataset.
shoutout to
@NielsRogge
for all the help!
link: https://blog.roboflow.com/train-rt-detr-custom-dataset-transformers/
↓ key takeaways
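One preprocessing step you will typically need when fine-tuning a DETR-family model like RT-DETR on a custom dataset is converting COCO-style boxes into the normalized center format the model trains on. This is a sketch of that one step, not the blog post's full pipeline:

```python
# Convert a COCO-style [x_min, y_min, width, height] pixel box into
# the normalized [center_x, center_y, width, height] format that
# DETR-family detection models expect as training targets.

def coco_to_cxcywh(box, img_w, img_h):
    x, y, w, h = box
    return [
        (x + w / 2) / img_w,  # normalized center x
        (y + h / 2) / img_h,  # normalized center y
        w / img_w,            # normalized width
        h / img_h,            # normalized height
    ]

# A 100x50 box with top-left corner (10, 20) in a 640x480 image:
print(coco_to_cxcywh([10, 20, 100, 50], 640, 480))
```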
Can we train a VLM to 𝐩𝐫𝐞𝐟𝐞𝐫?
This is now possible, thanks to the new TRL/DPO support for VLMs! 🎉
As an example, we've trained a model to reduce hallucinations.
Check out:
📰 Blog post: https://huggingface.co/blog/dpo_vlm
🐙 TRL: https://github.com/huggingface/trl
Thanks to
@mervenoyann
,
@vwxyzjn
and
@krasul
who helped me with this work!
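For intuition on what DPO optimizes, here is the per-pair objective sketched with plain floats. The log-probabilities are made-up placeholders; in TRL they come from the policy model and a frozen reference model over real (chosen, rejected) pairs:

```python
# Sketch of the DPO loss for one (chosen, rejected) preference pair.
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards: how much more the policy likes each answer
    # than the reference model does.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(margin)): small when "chosen" is clearly preferred.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy learns to prefer the chosen answer:
weak = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # no preference yet
strong = dpo_loss(-5.0, -15.0, -10.0, -10.0)  # clear preference
print(weak > strong)  # -> True
```

For the VLM case, the only change is that the log-probabilities are conditioned on the image as well as the text prompt.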
Agent for self-correcting Text-to-SQL 🧑💻
What if the query generated by your Text-to-SQL pipeline is correct SQL but returns wrong results?
👉 We need to add a critique step
✅ That's very simple with an agent!
Check out the notebook! https://huggingface.co/learn/cookbook/agent_text_to_sql
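The critique step boils down to: run the query, inspect the result, retry if it looks wrong. Here is a toy sketch of that loop; the hard-coded candidate queries stand in for repeated LLM attempts, and the `critique` check is a deliberately simple placeholder for the agent's real critic:

```python
# A query can be valid SQL and still give a wrong answer - e.g. an
# aggregate that filters everything out and returns NULL. The critique
# step catches this and asks for another attempt.
import sqlite3

def critique(rows):
    """Toy critic: reject empty results and NULL aggregates."""
    return bool(rows) and rows[0][0] is not None

def self_correcting_sql(conn, candidate_sqls):
    for sql in candidate_sqls:  # stand-in for repeated LLM attempts
        rows = conn.execute(sql).fetchall()
        if critique(rows):
            return rows
    return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (price REAL)")
conn.execute("INSERT INTO receipts VALUES (10.0), (20.0)")
# The first query is correct SQL but filters out every row, so SUM
# returns NULL; the critic rejects it and the second attempt succeeds.
attempts = ["SELECT SUM(price) FROM receipts WHERE price > 99",
            "SELECT SUM(price) FROM receipts"]
print(self_correcting_sql(conn, attempts))  # -> [(30.0,)]
```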
The vision language model in this video is 0.5B and can take in image, video and 3D! 🤯
Llava-NeXT-Interleave is a new vision language model trained on interleaved image, video and 3D data
keep reading ⥥⥥
Mistral 7B running on Mac, powered by CoreML! ⚡️
Heavily optimised with the latest updates from WWDC like stateful buffers, ML Tensors and 4-bit palettisation!
Try it today with swift-transformers and chat-ui! 🔥
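What does 4-bit palettisation actually do? Each weight is replaced by the nearest entry in a small "palette" (lookup table) of at most 2^4 = 16 values, so only 4-bit indices need to be stored. Core ML chooses the palette with k-means; this toy sketch just uses evenly spaced centroids to show the idea:

```python
# Toy 4-bit palettisation: snap each weight to the nearest of 16
# evenly spaced palette values and store only the palette indices.

def palettize(weights, bits=4):
    n = 2 ** bits
    lo, hi = min(weights), max(weights)
    palette = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    indices = [min(range(n), key=lambda i: abs(w - palette[i]))
               for w in weights]
    return palette, indices

def dequantize(palette, indices):
    return [palette[i] for i in indices]

weights = [0.0, 0.13, 0.52, 1.0]
palette, idx = palettize(weights)
print(max(idx) < 16)  # every index fits in 4 bits -> True
print(dequantize(palette, idx))  # close to the original weights
```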
Llama 405B is here, and it comes with more than expected! 🚨
@AIatMeta
Llama 3.1 comes in 3 sizes, 8B, 70B, and 405B, and speaks 8 languages! 🌍 Llama 3.1 405B matches or beats OpenAI's GPT-4o across many text benchmarks.
What's new and improved in 3.1 ✨:
🧮 8B, 70B & 405B
Meta Llama 3.1 405B, 70B & 8B are here - multilingual, with 128K context and tool use + agents! Competitive with (or beating) GPT-4o & Claude 3.5 Sonnet - unequivocally the best open LLM out there! 🐐
Bonus: It comes with a more permissive license, which allows one to train other LLMs on its high-quality outputs 🔥
Llama 405B runs on CPU.
Getting 1.67 tokens/s output,
10 tokens/s input without a GPU.
Slow but usable - summarizing a 2-hour-long medtech discussion with it. Will upload 2-bit optimized versions etc. here:
https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/tree/main
Really nice development by
@nvidia
and
@HuggingFace
Launch of Hugging Face Inference-as-a-Service powered by NVIDIA NIM, a new service on the Hugging Face Hub
So we can use open models with the accelerated compute platform of NVIDIA DGX Cloud for inference serving.
The service is fully compatible with the OpenAI API, allowing you to use the openai SDK for inference.
Note: You need access to an Organization with a Hugging Face Enterprise subscription to run Inference.
------
📌 NVIDIA NIM is a set of inference microservices that provides models as optimized containers - to deploy on clouds, data centers, or workstations, giving teams the ability to easily build generative AI applications for copilots, chatbots, and more in minutes rather than weeks.
📌 Maximizes infrastructure investments and compute efficiency. For example, running Meta Llama 3-8B in a NIM produces up to 3x more generative AI tokens on accelerated infrastructure than without NIM.
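Because the service speaks the OpenAI chat-completions protocol, any OpenAI-compatible client works: point it at the Hugging Face base URL and authenticate with your HF token. The base URL and model name below are illustrative placeholders (take the real ones from the announcement/docs); the sketch just builds the request an OpenAI-compatible client would send:

```python
# Build an OpenAI-compatible chat-completions request. The base_url
# and model here are placeholders, not the real endpoint values.
import json

def build_chat_request(base_url, hf_token, model, user_message):
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {"Authorization": f"Bearer {hf_token}",
                    "Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

req = build_chat_request("https://example-hf-nim-endpoint/v1",
                         "hf_xxx",
                         "meta-llama/Meta-Llama-3-8B-Instruct",
                         "Hello!")
print(req["url"])
```

With the openai SDK you would instead pass the base URL and HF token to the client constructor and call the chat-completions method as usual.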
Google just dropped Gemma 2 2B! 🔥
> Scores higher than GPT-3.5 and Mixtral 8x7B on the LMSYS arena
> MMLU: 56.1 & MBPP: 36.6
> Beats its predecessor (Gemma 1 2B) by more than 10% on benchmarks
> 2.6B parameters, Multilingual
> 2 Trillion tokens (training set)
> Distilled from Gemma 2 27B (?)
> Trained on 512 TPU v5e
Smaller models beating models orders of magnitude bigger! 🤗
Very cool direction and so many cool ablations for distillation, too!
Kudos to Google & Deepmind for continuing their belief in open source and science! ⚡️
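The distillation idea behind Gemma 2 2B, in one sketch: instead of one-hot next-token targets, the small student is trained to match the big teacher's full output distribution, typically via KL divergence. The toy 4-token "vocabularies" below stand in for real model logits:

```python
# Distillation objective sketch: KL(teacher || student) over the
# next-token distribution. A student that only copies the argmax is
# penalized relative to one matching the whole distribution.
import math

def kl_divergence(teacher_probs, student_probs):
    # Penalizes the student for assigning low probability where the
    # teacher assigns high probability.
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)

teacher = [0.7, 0.2, 0.05, 0.05]
matching = [0.7, 0.2, 0.05, 0.05]       # mimics teacher exactly
argmax_only = [0.97, 0.01, 0.01, 0.01]  # only learned the top token
print(kl_divergence(teacher, matching))          # -> 0.0
print(kl_divergence(teacher, argmax_only) > 0)   # -> True
```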
Gemma 2 2B running in a browser, powered by WebLLM & WebGPU! 🔥
100% local & on-device
In less than 24 hours, we've already got the model to the edge! ⚡️
Try it out on an HF space below:
NEW ARENA: Text to Speech Arena for Japanese by
@kotoba_tech
🔥
🔉Sound on
Outside of English, TTS evaluation is quite scarce. The Arena allows one to test open-source models against the closed-source giants.
In the leaderboard you can compare open models like Bark, MOE-VITS, and Kotoba Speech with closed-source models like Google TTS, OpenAI TTS, and so on.
If you're a Japanese speaker then go check it out and help us find the best Japanese TTS model out there! 👀
Hugging News #114 :ilovepython: @everyone
Gemma
:google: Google releases Gemma 2 2B , ShieldGemma and Gemma Scope
:google: Gemma 2 2B in your browser thanks to MLC, 100% local and super fast with WebLLM + WebGPU!
:google: Gemma 2 2B running in a free Google Colab , powered by Transformers !
:google: Simple instructions to get started with the latest Gemma 2 models + llama.cpp !
Llama 3.1 405B released. 🎏 MagPie-Ultra is the first open dataset using Llama 3.1 405B-Instruct FP8 to generate 50,000 synthetic instruction pairs using the MagPie recipe and
@argilla_io
distilabel. It includes challenging instructions for coding, math, data analysis, creative writing, advice seeking, and brainstorming. ⚗️
MagPie datasets are created by prompting LLMs with "empty" prompts that consist only of starting special tokens, allowing the model to auto-regressively generate user queries and corresponding responses, which are then filtered to select high-quality data. 👨🎓
Note: The dataset is unfiltered but includes quality & difficulty scores, embeddings, topics, and safety scores from ArmoRM and LlamaGuard. 🛡
⚗️ Pipeline: https://huggingface.co/datasets/argilla/magpie-ultra-v0.1/blob/main/pipeline.py
🤗 Dataset: https://huggingface.co/datasets/argilla/magpie-ultra-v0.1
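The "empty prompt" trick above can be sketched concretely: build a prompt that contains only the chat template's special tokens up to where a user turn would begin, so the instruct model free-generates a plausible user query itself. The special tokens below follow the Llama 3 chat format; the helper is illustrative, not part of any library:

```python
# MagPie-style prompt: open a user turn but leave its content empty,
# so generation from here produces a synthetic user query.

def magpie_prompt(system_message=""):
    parts = ["<|begin_of_text|>"]
    if system_message:
        parts.append("<|start_header_id|>system<|end_header_id|>\n\n"
                     f"{system_message}<|eot_id|>")
    # The user header with no content - the model continues from here.
    parts.append("<|start_header_id|>user<|end_header_id|>\n\n")
    return "".join(parts)

prompt = magpie_prompt()
print(prompt.endswith("<|end_header_id|>\n\n"))  # -> True
```

Feeding this prompt to the instruct model yields a user query; append it plus the assistant header and generate again to get the paired response.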
Dropping magpie-ultra-v0.1, the first open synthetic dataset built with Llama 3.1 405B.
Created with distilabel, it's our most advanced and compute-intensive pipeline to date.
https://huggingface.co/datasets/argilla/magpie-ultra-v0.1
Let's dig into the details!
Today we release our first foundation model. OCRonos-Vintage is a 124-million-parameter model pretrained end-to-end by
@pleiasfr
on 18 billion tokens of cultural heritage archives, with nearly SOTA results for OCR correction in English
SF3D from Stability claims state-of-the-art mesh reconstruction.
let's see if it's true
⚔️ Added to 3D Arena https://huggingface.co/spaces/dylanebert/3d-arena