RT-DETR - probably the best combination of speed, accuracy, and license for real-time object detection.

I just released a blog post tutorial on fine-tuning RT-DETR on a custom dataset.

shoutout to
@NielsRogge
for all the help!

link: https://blog.roboflow.com/train-rt-detr-custom-dataset-transformers/

↓ key takeaways
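Fine-tuning a detector like RT-DETR on a custom dataset starts with getting annotations into the COCO detection format that the Transformers image processors consume. A minimal sketch of one such record — the image ID, boxes, and labels here are made up for illustration:

```python
# Minimal sketch: one COCO-style annotation record, the shape the
# object-detection image processors in 🤗 Transformers expect when
# preprocessing a custom dataset. Boxes are [x, y, width, height].

def make_coco_annotation(image_id, boxes, labels):
    """Build a COCO-style annotation dict for a single image."""
    return {
        "image_id": image_id,
        "annotations": [
            {
                "image_id": image_id,
                "category_id": label,
                "bbox": box,              # [x, y, w, h] in pixels
                "area": box[2] * box[3],  # w * h
                "iscrowd": 0,
            }
            for box, label in zip(boxes, labels)
        ],
    }

ann = make_coco_annotation(0, boxes=[[10, 20, 100, 50]], labels=[3])
print(ann["annotations"][0]["area"])  # 5000
```

The blog post covers the full pipeline (processor, model, trainer); this only shows the data shape feeding into it.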
Can we train a VLM to 𝐩𝐫𝐞𝐟𝐞𝐫?

This is now possible, thanks to the new TRL/DPO support for VLMs! 🎉

As an example, we've trained a model to reduce hallucinations.

Check out:
📰 Blog post: https://huggingface.co/blog/dpo_vlm
🐙 TRL: https://github.com/huggingface/trl

Thanks to
@mervenoyann
,
@vwxyzjn
and
@krasul
who helped me with this work!
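DPO trains on preference pairs: the same prompt with a "chosen" and a "rejected" response. For a VLM, each example additionally carries the image(s) the prompt refers to. A hedged sketch of one record — the field names follow the common TRL convention, but check the blog post for the exact schema:

```python
# One preference example for VLM DPO, as an illustration (paths and
# texts are made up). A DPO trainer raises the likelihood of "chosen"
# relative to "rejected", which is how hallucinations get penalized.

example = {
    "images": ["photo_of_a_cat.jpg"],  # placeholder path
    "prompt": "How many animals are in the image?",
    "chosen": "There is one cat in the image.",         # grounded
    "rejected": "There are three dogs playing fetch.",  # hallucinated
}

assert {"prompt", "chosen", "rejected"} <= set(example)
```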
Agent for self-correcting Text-to-SQL 🧑‍💻

What if the query generated by your Text-to-SQL pipeline is correct SQL but returns wrong results?
👉 We need to add a critique step

That's very simple with an agent!
Check out the notebook: https://huggingface.co/learn/cookbook/agent_text_to_sql
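The critique loop is simple to sketch: execute the generated SQL, inspect the result, and ask for a corrected query when it looks wrong. A runnable toy version (not the cookbook's actual code) with a hard-coded stand-in for the LLM:

```python
import sqlite3

# Toy self-correction loop: the first "generated" query is valid SQL
# but returns nothing (SQLite's = on TEXT is case-sensitive), so the
# critique step triggers a retry with a corrected query.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (name TEXT, price REAL)")
conn.executemany("INSERT INTO receipts VALUES (?, ?)",
                 [("Alice", 12.0), ("Bob", 8.5)])

attempts = [
    "SELECT price FROM receipts WHERE name = 'alice'",          # wrong case
    "SELECT price FROM receipts WHERE lower(name) = 'alice'",   # corrected
]

result = None
for sql in attempts:
    rows = conn.execute(sql).fetchall()
    if rows:  # critique: an empty result means the query needs fixing
        result = rows
        break
    print(f"Critique: {sql!r} returned no rows, retrying...")

print(result)  # [(12.0,)]
```

In the real agent the "attempts" come from the LLM, and the critique prompt includes the failing query plus its (empty or erroneous) result.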
The vision language model in this video is 0.5B and can take in images, video and 3D! 🤯
Llava-NeXT-Interleave is a new vision language model trained on interleaved image, video and 3D data

keep reading ⥥⥥
Mistral 7B running on Mac, powered by CoreML! ⚡️

Heavily optimised with the latest updates from WWDC like stateful buffers, ML Tensors and 4-bit palettization!

Try it today with swift-transformers and chat-ui! 🔥
Llama 405B is here, and it comes with more than expected! 🚨
@AIatMeta
Llama 3.1 comes in 3 sizes, 8B, 70B, and 405B, and speaks 8 languages! 🌍 Llama 3.1 405B matches or beats OpenAI's GPT-4o across many text benchmarks.

What's new and improved in 3.1:
🧮 8B, 70B & 405B
Meta Llama 3.1 405B, 70B & 8B are here - multilingual, with 128K context and tool-use + agents! Competitive with (and often beating) GPT-4o & Claude 3.5 Sonnet - unequivocally the best open LLM out there! 🐐

Bonus: It comes with a more permissive license, which allows one to train other LLMs on its high-quality outputs 🔥
Llama-405B runs on CPU.
Getting 1.67 tokens/s output,
about 10 tokens/words per second input, without a GPU.
Slow but usable - summarizing a 2-hour-long medtech discussion with it. Will upload 2-bit optimized quants etc. here:
https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/tree/main
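A quick back-of-the-envelope check on "slow but usable": at the quoted 1.67 tokens/s, a summary of a few hundred tokens takes on the order of minutes (the 500-token length below is an assumption, not from the post):

```python
# How long does a ~500-token summary take at 1.67 tokens/s on CPU?
output_tokens = 500       # assumed summary length
tokens_per_second = 1.67  # figure quoted in the post
minutes = output_tokens / tokens_per_second / 60
print(round(minutes, 1))  # ~5.0 minutes
```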
With larger and larger diffusion transformers coming up, it's becoming increasingly important to have some good quantization tools for them.

We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.

We demonstrate excellent memory savings with a small sacrifice in inference latency, which is expected to improve in the coming days.
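The memory savings are easy to reason about: weight memory scales linearly with bits per parameter. A worked example for a hypothetical 8B-parameter diffusion transformer (the parameter count is illustrative, not from the experiments):

```python
# Weight memory vs. quantization width for a hypothetical 8B-parameter
# transformer: memory = params * bits / 8 bytes.
params = 8e9

def weight_gb(bits):
    return params * bits / 8 / 1e9  # bytes -> GB (decimal)

print(weight_gb(16))  # fp16/bf16: 16.0 GB
print(weight_gb(8))   # int8/fp8:   8.0 GB
print(weight_gb(4))   # 4-bit:      4.0 GB
```

Activations, the text encoder, and the VAE add overhead on top, which is why measured savings differ from this ideal ratio.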
Really nice development by
@nvidia
and
@HuggingFace


Launch of Hugging Face Inference-as-a-Service powered by NVIDIA NIM, a new service on the Hugging Face Hub

So we can use open models with the accelerated compute platform of NVIDIA DGX Cloud for inference serving.

Code is fully compatible with the OpenAI API, allowing you to use the openai SDK for inference.

Note: You need access to an Organization with a Hugging Face Enterprise subscription to run Inference.

------

📌 NVIDIA NIM is a set of inference microservices that provide models as optimized containers to deploy on clouds, data centers or workstations, letting developers build generative AI applications for copilots, chatbots and more in minutes rather than weeks.

📌 Maximizes infrastructure investments and compute efficiency. For example, running Meta Llama 3-8B in a NIM produces up to 3x more generative AI tokens on accelerated infrastructure than without NIM.
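Since the service speaks the OpenAI chat-completions protocol, the request body is the familiar one. A sketch of the payload — the base URL below is a placeholder (use the endpoint from your Hugging Face organization settings), and the model ID is an assumption:

```python
import json

# OpenAI-compatible chat-completions request body. BASE_URL is a
# placeholder, not the real endpoint; an Enterprise subscription is
# required per the note above.
BASE_URL = "https://example-nim-endpoint/v1"  # placeholder

payload = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed model id
    "messages": [{"role": "user", "content": "Say hello!"}],
    "max_tokens": 64,
}

# The same dict plugs into the openai SDK:
#   client = OpenAI(base_url=BASE_URL, api_key=HF_TOKEN)
#   client.chat.completions.create(**payload)
body = json.dumps(payload)
print(json.loads(body)["messages"][0]["role"])  # user
```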
Google just dropped Gemma 2 2B! 🔥

> Scores higher than GPT-3.5 and Mixtral 8x7B on the LMSYS arena
> MMLU: 56.1 & MBPP: 36.6
> Beats previous (Gemma 1 2B) by more than 10% in benchmarks
> 2.6B parameters, Multilingual
> 2 Trillion tokens (training set)
> Distilled from Gemma 2 27B (?)
> Trained on 512 TPU v5e

Smaller models beat orders of magnitude bigger models! 🤗
Very cool direction and so many cool ablations for distillation, too!

Kudos to Google & Deepmind for continuing their belief in open source and science! ⚡️
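If the 2B model was indeed distilled from the 27B, the core idea is that the student is trained to match the teacher's next-token distribution rather than just the hard labels. A toy sketch of that signal with a 3-token vocabulary (pure illustration, not Google's training code):

```python
import math

# Distillation signal sketch: minimize KL(teacher || student) over the
# next-token distribution. Toy 3-token vocabulary, made-up logits.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = softmax([2.0, 1.0, 0.1])
student_bad = softmax([0.1, 1.0, 2.0])   # disagrees with the teacher
student_good = softmax([1.9, 1.1, 0.2])  # close to the teacher

# The closer the student matches the teacher, the smaller the loss.
assert kl(teacher, student_good) < kl(teacher, student_bad)
```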
Gemma 2 2B running in a browser, powered by WebLLM & WebGPU! 🔥

100% local & on-device

In less than 24 hours, we've already got the model to the edge! ⚡️

Try it out on an HF space below:
NEW ARENA: Text to Speech Arena for Japanese by
@kotoba_tech
🔥

🔉Sound on

Outside of English, TTS evaluation is quite scarce. The Arena allows one to test open-source models against the closed-source giants.

On the leaderboard you can compare open models like Bark, MOE-VITS and Kotoba Speech with closed-source models like Google TTS, OpenAI TTS and so on.

If you're a Japanese speaker then go check it out and help us find the best Japanese TTS model out there! 👀
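Arena leaderboards are typically built from pairwise votes with an Elo-style rating (the exact scheme this Arena uses may differ). A minimal sketch of one rating update:

```python
# Elo-style update from a single pairwise vote: the winner gains and
# the loser loses the same amount, scaled by how surprising the win was.

def elo_update(r_winner, r_loser, k=32):
    expected_win = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1 - expected_win)
    return r_winner + delta, r_loser - delta

a, b = elo_update(1000, 1000)  # equal ratings: a coin-flip matchup
print(round(a), round(b))  # 1016 984
```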
Hugging News #114 :ilovepython: @everyone
Gemma
:google: Google releases Gemma 2 2B, ShieldGemma and Gemma Scope
:google: Gemma 2 2B in your browser thanks to MLC, 100% local and super fast with WebLLM + WebGPU!
:google: Gemma 2 2B running in a free Google Colab, powered by Transformers!
:google: Simple instructions to get started with the latest Gemma 2 models + llama.cpp!
Llama 3.1 405B released. 🎏 MagPie-Ultra is the first open dataset using Llama 3.1 405B-Instruct FP8 to generate 50,000 synthetic instruction pairs using the MagPie recipe and
@argilla_io
distilabel. It includes challenging instructions for coding, math, data analysis, creative writing, advice-seeking, and brainstorming. ⚗️

MagPie datasets are created by prompting LLMs with "empty" prompts that consist only of starting special tokens, allowing the model to auto-regressively generate user queries and corresponding responses, which are then filtered to select high-quality data. 👨‍🎓

Note: The dataset is unfiltered but includes quality & difficulty scores, embeddings, topics, and safety scores from ArmoRM and LlamaGuard. 🛡

⚗️ Pipeline: https://huggingface.co/datasets/argilla/magpie-ultra-v0.1/blob/main/pipeline.py
🤗 Dataset: https://huggingface.co/datasets/argilla/magpie-ultra-v0.1
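The "empty" prompt trick is concrete: you send only the chat template's pre-query tokens, everything up to where the user message would normally start, so the instruct model autocompletes a plausible user query itself. A sketch using the Llama 3 chat-template token strings:

```python
# MagPie "empty" prompt: only the special tokens that precede a user
# message in the Llama 3 chat template. Given this prefix, the instruct
# model auto-regressively generates a user query; appending the
# assistant header afterwards elicits the matching response.

PRE_QUERY = (
    "<|begin_of_text|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
)

assert PRE_QUERY.endswith("\n\n")  # the model continues from here
print(PRE_QUERY.count("header_id"))  # 2
```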
Dropping magpie-ultra-v0.1, the first open synthetic dataset built with Llama 3.1 405B.

Created with distilabel, it's our most advanced and compute-intensive pipeline to date.

https://huggingface.co/datasets/argilla/magpie-ultra-v0.1

Let's dig into the details!
Today we release our first foundation model. OCRonos-Vintage is a 124-million-parameter model pretrained end-to-end by
@pleiasfr
on 18 billion tokens of cultural heritage archives, with nearly SOTA results for OCR correction in English
SF3D from Stability claims state-of-the-art mesh reconstruction.

let's see if it's true

⚔️ Added to 3D Arena https://huggingface.co/spaces/dylanebert/3d-arena
ROAM Challenge 2: LATAM Out-of-Distribution Few-shot Challenge
Develop models that can classify unusual or specific vehicle types from minimal training data, a crucial skill in environments with unique vehicular regulations.
https://huggingface.co/spaces/Artificio/ROAM2FewShotChallenge
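A classic few-shot baseline for a challenge like this is nearest-centroid classification in an embedding space. A toy sketch — the 2-D vectors and class names below are made up, standing in for real image embeddings:

```python
# Few-shot baseline sketch: average the few support embeddings per
# class into a centroid, then assign a query to the nearest centroid.

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(x, support):
    # support: {label: [embedding, ...]} with only a few shots per class
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    cents = {label: centroid(vs) for label, vs in support.items()}
    return min(cents, key=lambda label: dist2(x, cents[label]))

support = {
    "mototaxi": [[0.0, 0.1], [0.2, 0.0]],  # hypothetical vehicle class
    "truck":    [[1.0, 1.1], [0.9, 1.0]],
}
print(classify([0.1, 0.0], support))  # mototaxi
```

With a strong pretrained image encoder producing the embeddings, this simple rule is a surprisingly competitive starting point for out-of-distribution few-shot tasks.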