HF Hub - Share and discover more about AI with social posts from the community.
Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker
In case you missed it: on March 25th we announced a collaboration with Amazon SageMaker to make it easier to create State-of-the-Art Machine Learning models, and ship cutting-edge NLP features faster.

Together with the SageMaker team, we built 🤗 Transformers optimized Deep Learning Containers to accelerate training of Transformers-based models. Thanks AWS friends!🤗 🚀

With the new HuggingFace estimator in the SageMaker Python SDK, you can start training with a single line of code.
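For reference, a minimal sketch of what that looks like (the versions, role ARN, and hyperparameters below are illustrative placeholders):

```python
# a minimal sketch, assuming a train.py fine-tuning script in the current directory
from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(
    entry_point="train.py",            # your 🤗 Transformers training script
    instance_type="ml.p3.2xlarge",
    instance_count=1,                  # increase for distributed training
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role ARN
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
    hyperparameters={"epochs": 3, "model_name_or_path": "facebook/bart-large-cnn"},
)

# the single line that launches training on SageMaker
huggingface_estimator.fit()  # pass S3 data channels here, e.g. fit({"train": s3_uri})
```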
Introducing the Hugging Face Embedding Container for Amazon SageMaker
We are excited to announce that the new Hugging Face Embedding Container for Amazon SageMaker is now generally available (GA). AWS customers can now efficiently deploy embedding models on SageMaker to build Generative AI applications, including Retrieval-Augmented Generation (RAG) applications.

In this blog post, we will show you how to deploy open embedding models, like Snowflake/snowflake-arctic-embed-l, BAAI/bge-large-en-v1.5, or sentence-transformers/all-MiniLM-L6-v2, to Amazon SageMaker for inference using the new Hugging Face Embedding Container. We will deploy Snowflake/snowflake-arctic-embed-m-v1.5, one of the best open embedding models for retrieval - you can check its rankings on the MTEB Leaderboard.

The example covers:

1. Setup development environment
2. Retrieve the new Hugging Face Embedding Container
3. Deploy Snowflake Arctic to Amazon SageMaker
4. Run and evaluate Inference performance
5. Delete model and endpoint
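Condensed, steps 2-4 look roughly like this (a sketch assuming a SageMaker session and role; the container version and instance type are illustrative):

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# retrieve the Hugging Face Embedding Container (TEI) image URI
image_uri = get_huggingface_llm_image_uri("huggingface-tei", version="1.2.3")

# point the container at the embedding model to serve
emb_model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={"HF_MODEL_ID": "Snowflake/snowflake-arctic-embed-m-v1.5"},
)

predictor = emb_model.deploy(initial_instance_count=1, instance_type="ml.g5.xlarge")
print(predictor.predict({"inputs": "What is Amazon SageMaker?"}))
```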
https://github.com/huggingface/blog/blob/main/sagemaker-huggingface-embedding.md
Introducing the Hugging Face LLM Inference Container for Amazon SageMaker
This is an example of how to deploy open-source LLMs, like BLOOM, to Amazon SageMaker for inference using the new Hugging Face LLM Inference Container. We will deploy Open Assistant 12B, an open-source chat LLM based on Pythia and trained on the Open Assistant dataset.

The example covers:

Setup development environment
Retrieve the new Hugging Face LLM DLC
Deploy Open Assistant 12B to Amazon SageMaker
Run inference and chat with our model
Create Gradio Chatbot backed by Amazon SageMaker
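Condensed, the deployment flow looks roughly like this (a sketch; the instance type, GPU count, and model id are our assumptions for illustration):

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# retrieve the Hugging Face LLM DLC image URI
llm_image = get_huggingface_llm_image_uri("huggingface")

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "OpenAssistant/pythia-12b-sft-v8-7k-steps",  # assumed model id
        "SM_NUM_GPUS": "4",  # shard the 12B model across 4 GPUs
    },
)

llm = llm_model.deploy(initial_instance_count=1, instance_type="ml.g5.12xlarge")

# Open Assistant models use a chat-style prompt format
print(llm.predict({"inputs": "<|prompter|>What is Amazon SageMaker?<|endoftext|><|assistant|>"}))
```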
You can also find the code for this example in the notebooks repository.
https://github.com/huggingface/blog/blob/main/sagemaker-huggingface-llm.md
Machine Learning Experts - Sasha Luccioni
🤗 Welcome to Machine Learning Experts - Sasha Luccioni
🚀 If you're interested in learning how ML Experts, like Sasha, can help accelerate your ML roadmap, visit hf.co/support.

Hey friends! Welcome to Machine Learning Experts. I'm your host, Britney Muller, and today's guest is Sasha Luccioni. Sasha is a Research Scientist at Hugging Face, where she works on the ethical and societal impacts of machine learning models and datasets.

Sasha is also a co-chair of the Carbon Footprint WG of the Big Science Workshop, on the Board of WiML, and a founding member of the Climate Change AI (CCAI) organization, which catalyzes impactful work applying machine learning to the climate crisis.
https://github.com/huggingface/blog/blob/main/sasha-luccioni-interview.md
Welcome Stable-baselines3 to the Hugging Face Hub 🤗
At Hugging Face, we are contributing to the ecosystem for Deep Reinforcement Learning researchers and enthusiasts. That's why we're happy to announce that we have integrated Stable-Baselines3 with the Hugging Face Hub.

Stable-Baselines3 is one of the most popular PyTorch Deep Reinforcement Learning libraries, making it easy to train and test your agents in a variety of environments (Gym, Atari, MuJoCo, Procgen...). With this integration, you can now host your saved models 💾 and load powerful models from the community.

In this article, we’re going to show how you can do it.
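As a taste, loading a trained agent from the Hub takes just a few lines (a sketch using the huggingface_sb3 helpers; the repo id is one of the demo repositories):

```python
# pip install huggingface_sb3 stable-baselines3
from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO

# download a saved agent checkpoint from the Hub...
checkpoint = load_from_hub(
    repo_id="sb3/demo-hf-CartPole-v1",
    filename="ppo-CartPole-v1.zip",
)

# ...and load it like any local Stable-Baselines3 model
model = PPO.load(checkpoint)
```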
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation
Instruction tuning is a fine-tuning approach that gives large language models (LLMs) the capability to follow natural, human-written instructions. However, for programming tasks, most models are tuned either on human-written instructions (which are very expensive) or on instructions generated by huge, proprietary LLMs (which may not be permitted). We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code LLM trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself without any human annotations or distilled data from huge, proprietary LLMs.
Interactively explore your Hugging Face dataset with one line of code
The Hugging Face datasets library not only provides access to more than 70k publicly available datasets, but also offers very convenient data preparation pipelines for custom datasets.

Renumics Spotlight allows you to create interactive visualizations to identify critical clusters in your data. Because Spotlight understands the data semantics within Hugging Face datasets, you can get started with just one line of code:
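```python
# a sketch of that one-liner (pip install renumics-spotlight datasets);
# the dataset choice here is ours, purely for illustration
import datasets
from renumics import spotlight

ds = datasets.load_dataset("cifar10", split="train")
spotlight.show(ds)  # opens an interactive exploration view in the browser
```

(Spotlight's `show` also accepts pandas DataFrames; the dataset above is just an example.)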
Diffusers welcomes Stable Diffusion 3
Stable Diffusion 3 (SD3), Stability AI’s latest iteration of the Stable Diffusion family of models, is now available on the Hugging Face Hub and can be used with 🧨 Diffusers.

The model released today is Stable Diffusion 3 Medium, with 2B parameters.

As part of this release, we have provided:

Models on the Hub
Diffusers Integration
SD3 Dreambooth and LoRA training scripts
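Using the Diffusers integration looks roughly like this (a sketch assuming a CUDA GPU and access to the gated weights; the sampler settings are illustrative):

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe(
    "a photo of a cat holding a sign that says hello world",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3.png")
```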
Open-sourcing Knowledge Distillation Code and Weights of SD-Small and SD-Tiny
In recent times, the AI community has witnessed a remarkable surge in the development of larger and more performant language models, such as Falcon 40B, LLaMa-2 70B, and MPT 30B, and in the imaging domain with models like SD2.1 and SDXL. These advancements have undoubtedly pushed the boundaries of what AI can achieve, enabling highly versatile and state-of-the-art image generation and language understanding capabilities. However, as we marvel at the power and complexity of these models, it is essential to recognize a growing need to make AI models smaller, more efficient, and more accessible, particularly by open-sourcing them.
Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e
Generative AI models, such as Stable Diffusion XL (SDXL), enable the creation of high-quality, realistic content with wide-ranging applications. However, harnessing the power of such models presents significant challenges and computational costs. SDXL is a large image generation model whose UNet component is about three times as large as the one in the previous version of the model. Deploying a model like this in production is challenging due to the increased memory requirements, as well as increased inference times. Today, we are thrilled to announce that Hugging Face Diffusers now supports serving SDXL using JAX on Cloud TPUs, enabling high-performance, cost-efficient inference.

Google Cloud TPUs are custom-designed AI accelerators, which are optimized for training and inference of large AI models, including state-of-the-art LLMs and generative AI models such as SDXL. The new Cloud TPU v5e is purpose-built to bring the cost-efficiency and performance required for large-scale AI training and inference. At less than half the cost of TPU v4, TPU v5e makes it possible for more organizations to train and deploy AI models.

🧨 Diffusers JAX integration offers a convenient way to run SDXL on TPU via XLA, and we built a demo to showcase it. You can try it out in this Space.

Under the hood, this demo runs on several TPU v5e-4 instances (each instance has 4 TPU chips) and takes advantage of parallelization to serve four large 1024×1024 images in about 4 seconds. This time includes format conversions, communication time, and frontend processing; the actual generation time is about 2.3s, as we'll see below!
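The parallelization follows the standard Flax pattern: replicate the weights across devices, shard the prompts, and let the pipeline pmap the generation. A minimal sketch, shown here with the Flax Stable Diffusion pipeline that diffusers ships (the SDXL demo follows the same structure):

```python
import jax
import jax.numpy as jnp
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16
)

num_devices = jax.device_count()  # e.g. 4 chips on a TPU v5e-4 instance
prompt_ids = pipeline.prepare_inputs(["a serene mountain lake at dawn"] * num_devices)

params = replicate(params)                 # copy the weights to every device
rng = jax.random.split(jax.random.PRNGKey(0), num_devices)
prompt_ids = shard(prompt_ids)             # one prompt per device

# jit=True pmaps the generation, producing one image per device in parallel
images = pipeline(prompt_ids, params, rng, jit=True).images
```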
LoRA training scripts of the world, unite!
A community-derived guide to some of the SOTA practices for SDXL Dreambooth LoRA fine-tuning

TL;DR

We combined the Pivotal Tuning technique used on Replicate's SDXL Cog trainer with the Prodigy optimizer used in the Kohya trainer (plus a bunch of other optimizations) to achieve very good results on training Dreambooth LoRAs for SDXL. Check out the training script on diffusers🧨. Try it out on Colab.

If you want to skip the technical talk, you can use all the techniques in this blog and train on Hugging Face Spaces with a simple UI and curated parameters (that you can meddle with).
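Once trained, loading such a LoRA in diffusers takes a couple of lines. A sketch, where the repo id is a hypothetical placeholder and "TOK" stands for the learned instance token (with Pivotal Tuning you would additionally load the trained text embeddings):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# load the Dreambooth LoRA weights on top of the base model
pipe.load_lora_weights("your-user/your-sdxl-lora")  # hypothetical repo id

image = pipe("a photo of TOK dog in a bucket", num_inference_steps=25).images[0]
```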
Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive
Introduction
SD Turbo and SDXL Turbo are two fast generative text-to-image models capable of generating viable images in as little as one step, a significant improvement over the 30+ steps often required with previous Stable Diffusion models. SD Turbo is a distilled version of Stable Diffusion 2.1, and SDXL Turbo is a distilled version of SDXL 1.0. We’ve previously shown how to accelerate Stable Diffusion inference with ONNX Runtime. Not only does ONNX Runtime provide performance benefits when used with SD Turbo and SDXL Turbo, but it also makes the models accessible in languages other than Python, like C# and Java.
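For a flavor of the single-step workflow, here is a sketch using Optimum's ONNX Runtime pipelines (an alternative to the Olive toolchain the post walks through):

```python
# pip install optimum[onnxruntime]
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

# export=True converts the PyTorch checkpoint to ONNX on the fly
pipe = ORTStableDiffusionXLPipeline.from_pretrained("stabilityai/sdxl-turbo", export=True)

# SDXL Turbo generates in a single step, with guidance disabled
image = pipe("a raccoon reading a book", num_inference_steps=1, guidance_scale=0.0).images[0]
image.save("turbo.png")
```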
Supercharged Searching on the Hugging Face Hub
The huggingface_hub library is a lightweight interface that provides a programmatic approach to exploring the hosting endpoints Hugging Face provides: models, datasets, and Spaces.

Up until now, searching on the Hub through this interface was tricky to pull off, and there were many aspects of it a user had to "just know" and get accustomed to.

In this article, we will be looking at a few exciting new features added to huggingface_hub to help lower that bar and provide users with a friendly API to search for the models and datasets they want to use without leaving their Jupyter or Python interfaces.

Before we begin, if you do not have the latest version of the huggingface_hub library on your system, please run the following cell:

!pip install huggingface_hub -U
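As a preview of the kind of query this enables, a sketch (helper names have evolved across huggingface_hub releases):

```python
from huggingface_hub import HfApi

api = HfApi()

# the five most-downloaded PyTorch text-classification models on the Hub
models = api.list_models(
    task="text-classification",
    library="pytorch",
    sort="downloads",
    direction=-1,
    limit=5,
)
for model in models:
    print(model.id)
```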
SegMoE: Segmind Mixture of Diffusion Experts
SegMoE is an exciting framework for creating Mixture-of-Experts Diffusion models from scratch! SegMoE is comprehensively integrated within the Hugging Face ecosystem and is supported in diffusers 🔥!

Among the features and integrations being released today:

Models on the Hub, with their model cards and licenses (Apache 2.0)
GitHub repository to create your own MoE-style models.
https://github.com/huggingface/blog/blob/main/segmoe.md
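Loading one of the released checkpoints looks roughly like this (a sketch assuming the segmoe package; prompt and settings are illustrative):

```python
# pip install segmoe
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

image = pipeline(
    "cosmic canvas, orange city background, painting of a chubby cat",
    negative_prompt="nsfw, bad quality, worse quality",
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
image.save("segmoe.png")
```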
How Sempre Health is leveraging the Expert Acceleration Program to accelerate their ML roadmap
👋 Hello, friends! We recently sat down with Swaraj Banerjee and Larry Zhang from Sempre Health, a startup that brings behavior-based, dynamic pricing to Healthcare. They are doing some exciting work with machine learning and are leveraging our Expert Acceleration Program to accelerate their ML roadmap.

An example of our collaboration is their new NLP pipeline to automatically classify and respond to inbound messages. Since deploying it to production, they have seen more than 20% of incoming messages handled automatically by this new system 🤯, which has had a massive impact on their business scalability and team workflow.

In this short video, Swaraj and Larry walk us through some of their machine learning work and share their experience collaborating with our team via the Expert Acceleration Program. Check it out:

Watch the video on YouTube: https://www.youtube.com/watch?v=QBOTlNJUtdk
If you'd like to accelerate your machine learning roadmap with the help of our experts, as Swaraj and Larry did, visit hf.co/support to learn more about our Expert Acceleration Program and request a quote.
Sentence Transformers in the Hugging Face Hub
Over the past few weeks, we've built collaborations with many Open Source frameworks in the machine learning ecosystem. One that gets us particularly excited is Sentence Transformers.

Sentence Transformers is a framework for sentence, paragraph, and image embeddings. It makes it possible to derive semantically meaningful embeddings, which are useful for applications such as semantic search or multilingual zero-shot classification. The Sentence Transformers v2 release comes with a lot of cool new features:

Sharing your models in the Hub easily.
Widgets and Inference API for sentence embeddings and sentence similarity.
Better sentence-embeddings models available (benchmark and models in the Hub).
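Day-to-day usage stays a few lines. A minimal sketch, where the model id is one popular Hub checkpoint:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# load a model straight from the Hub
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = ["A man is eating food.", "Someone is having a meal."]
embeddings = model.encode(sentences)

# cosine similarity between the two sentence embeddings
print(util.cos_sim(embeddings[0], embeddings[1]))
```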
Sentiment Analysis on Encrypted Data with Homomorphic Encryption
It is well-known that a sentiment analysis model determines whether a text is positive, negative, or neutral. However, this process typically requires access to unencrypted text, which can pose privacy concerns.

Homomorphic encryption is a type of encryption that allows computation on encrypted data without needing to decrypt it first. This makes it well suited for applications where users' personal and potentially sensitive data are at risk (e.g. sentiment analysis of private messages).

This blog post uses the Concrete-ML library, allowing data scientists to use machine learning models in fully homomorphic encryption (FHE) settings without any prior knowledge of cryptography. We provide a practical tutorial on how to use the library to build a sentiment analysis model on encrypted data.

The post covers:

transformers
how to use transformers with XGBoost to perform sentiment analysis
how to do the training
how to use Concrete-ML to turn predictions into predictions over encrypted data
how to deploy to the cloud using a client/server protocol
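For a flavor of what that looks like, here is a minimal sketch assuming Concrete-ML's scikit-learn-style API (argument names have varied across releases; the data below is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from concrete.ml.sklearn import XGBClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=50, n_bits=3)
model.fit(X_train, y_train)   # training happens in the clear
model.compile(X_train)        # compile the trained model to an FHE circuit

# run inference in fully homomorphic encryption
y_pred = model.predict(X_test, fhe="execute")
```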
Last but not least, we'll finish with a complete demo on Hugging Face Spaces to show this functionality in action.
https://github.com/huggingface/blog/blob/main/sentiment-analysis-fhe.md
Getting Started with Sentiment Analysis using Python
Sentiment analysis is the automated process of tagging data according to their sentiment, such as positive, negative and neutral. Sentiment analysis allows companies to analyze data at scale, detect insights and automate processes.

In the past, sentiment analysis used to be limited to researchers, machine learning engineers or data scientists with experience in natural language processing. However, the AI community has built awesome tools to democratize access to machine learning in recent years. Nowadays, you can use sentiment analysis with a few lines of code and no machine learning experience at all! 🤯

In this guide, you'll learn everything to get started with sentiment analysis using Python, including:

What is sentiment analysis?
How to use pre-trained sentiment analysis models with Python
How to build your own sentiment analysis model
How to analyze tweets with sentiment analysis
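As a taste of the pre-trained route, a minimal sketch (the default model is whatever transformers currently ships for the task):

```python
# pip install transformers
from transformers import pipeline

sentiment_pipeline = pipeline("sentiment-analysis")  # downloads a default model
print(sentiment_pipeline(["I love this!", "I hate this!"]))
# e.g. [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]
```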
Let's get started! 🚀
https://github.com/huggingface/blog/blob/main/sentiment-analysis-python.md
Getting Started with Sentiment Analysis on Twitter
Sentiment analysis is the automatic process of classifying text data according to their polarity, such as positive, negative and neutral. Companies leverage sentiment analysis of tweets to get a sense of how customers are talking about their products and services, get insights to drive business decisions, and identify product issues and potential PR crises early on.

In this guide, we will cover everything you need to know to get started with sentiment analysis on Twitter. We'll share a step-by-step process for doing sentiment analysis, for both coders and non-coders. If you are a coder, you'll learn how to use the Inference API, a plug & play machine learning API for doing sentiment analysis of tweets at scale in just a few lines of code. If you don't know how to code, don't worry! We'll also cover how to do sentiment analysis with Zapier, a no-code tool that will enable you to gather tweets, analyze them with the Inference API, and finally send the results to Google Sheets ⚡️

Read along or jump to the section that sparks 🌟 your interest:

What is sentiment analysis?
How to do Twitter sentiment analysis with code?
How to do Twitter sentiment analysis without coding?
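For a taste of the coder route, a sketch of calling the Inference API directly (the model id is one popular choice for tweets, and the token is a placeholder):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/cardiffnlp/twitter-roberta-base-sentiment-latest"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder token

response = requests.post(API_URL, headers=headers, json={"inputs": "I love the new update!"})
print(response.json())  # label/score pairs for each sentiment class
```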
Buckle up and enjoy the ride! 🤗
https://github.com/huggingface/blog/blob/main/sentiment-analysis-twitter.md
We Raised $100 Million for Open & Collaborative Machine Learning 🚀
Today we have some exciting news to share! Hugging Face has raised $100 Million in Series C funding 🔥🔥🔥 led by Lux Capital, with major participation from Sequoia and Coatue, and the support of existing investors Addition, a_capital, SV Angel, Betaworks, AIX Ventures, Kevin Durant, Rich Kleiman from Thirty Five Ventures, Olivier Pomel (co-founder & CEO at Datadog), and more.

Series C

We've come a long way since we first open sourced PyTorch BERT in 2018 and are just getting started! 🙌

Machine learning is becoming the default way to build technology. When you think about your average day, machine learning is everywhere: from your Zoom background, to searching on Google, to ordering an Uber, to writing an email with auto-complete -- it's all machine learning.

Hugging Face is now the fastest growing community & most used platform for machine learning! With 100,000 pre-trained models & 10,000 datasets hosted on the platform for NLP,