Share and discover more about AI with social posts from the community.
Introducing Parler TTS v1 🔉 - 885M (Mini) & 2.2B (Large) - fully open-source Text-to-Speech models! 🤙

> Trained on 45,000 hours of open speech (datasets released as well)
> Up to 4x faster generation thanks to torch.compile & a static KV cache (compared to the previous v0.1 release)
> Mini trained with a larger text encoder; Large trained with both a larger text encoder & decoder
> Also supports SDPA & Flash Attention 2 for an added speed boost
> Built-in streaming: we provide a dedicated streaming class optimised for time to first audio

> Better speaker consistency: more than a dozen speakers to choose from, or create a speaker description prompt and use that

> Not happy with a speaker? You can fine-tune the model on your own dataset (a couple of hours of audio will do)

Apache 2.0 licensed codebase, weights and datasets! 🤗

Can't wait to see what y'all build with this! 🫡
Introducing Qwen2-Math, based on Qwen2. The flagship model, Qwen2-Math-72B-Instruct, outperforms proprietary models, including GPT-4o and Claude 3.5, on math-related downstream tasks!

Feel free to check our blog for more information:
https://qwenlm.github.io/blog/qwen2-math

🤗 HF Collections: https://huggingface.co/collections/Qwen/qwen2-math-66b4c9e072eda65b5ec7534d

🤖 Github: https://github.com/QwenLM/Qwen2-Math
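A quick usage sketch with transformers, swapping in the smaller Qwen2-Math-1.5B-Instruct checkpoint for local experimentation (the prompt and generation length are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-Math-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Pose a math problem through the chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Find the value of x such that 2x + 3 = 11."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Decode only the newly generated tokens
generated = model.generate(**inputs, max_new_tokens=256)
answer = tokenizer.decode(generated[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(answer)
```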
Hugging Face went from 2k repos to 1M repos (models/datasets/demos) in 3 years🔥

🤯It then went to 2M repos in 1 year

As we prepare for dozens or hundreds of millions of open repos, I'm more than excited to welcome the XetHub team to the HF family!🤗

https://forbes.com/sites/richardnieva/2024/08/08/hugging-face-xethub-acquisition/
Qwen2 Audio - 8.5B, Apache 2.0 licensed Audio Language Models 🔥

> SoTA on ASR, S2TT & AIR-Bench ⚡️
> Used 370K hours of speech, 140K hours of music and 10K hours of sounds for pre-training
> Model excels at voice chat and audio analysis
> Base + Instruct model checkpoints released
> Uses Whisper Encoder paired with Qwen 2 7B LLM backbone
> Trained with Multi-task fine-tuning
> Followed by SFT & DPO
> Model weights on the Hub
> Integrated with Transformers 🤗

Audio LMs are a direction I'm quite bullish on; it just makes sense to have something that works right out of the box! Kudos, Qwen team! ❤️
How to use free models on huggingface.co? 3 easy steps
Using free models from Hugging Face is a great way to leverage state-of-the-art machine learning models for various tasks without the need for extensive setup or training. Here are three easy steps to get you started:

Step 1: Install the Hugging Face transformers Library
First, you need to install the transformers library, which provides easy access to thousands of pre-trained models. You can install it using pip:

```bash
pip install transformers
```
Step 2: Choose a Model from Hugging Face Model Hub
Navigate to the Hugging Face Model Hub and choose a model that suits your needs. For example, you might choose a model like bert-base-uncased for text classification or facebook/wav2vec2-base for speech recognition.

Step 3: Use the Model in Your Code
Once you have chosen a model, you can use it in your Python code. Here’s a simple example of how to use a text classification model:

```python
from transformers import pipeline

# Step 1: Choose a model and create a pipeline
classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

# Step 2: Use the model to make predictions
result = classifier("I love using Hugging Face models!")

# Step 3: Print the result
print(result)
```
Flux 1 is the best AI image generator right now. From realistic images to perfect text, it can do everything. It has definitely dethroned Midjourney version 6.1.

Here are six places where you can try it, with Replicate being the best 🧵
Want one of these badass World Tour tees? 🌐

Details below 👇
DistilBERT base uncased distilled SQuAD
Table of Contents
- Model Details
- How To Get Started With the Model
- Uses
- Risks, Limitations and Biases
- Training
- Evaluation
- Environmental Impact
- Technical Specifications
- Citation Information
- Model Card Authors
https://huggingface.co/FacebookAI/roberta-large-mnli FacebookAI/roberta-large-mnli · Hugging Face
Text generation with Mistral
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.

Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1:

32k context window (vs 8k context in v0.1)
Rope-theta = 1e6
No Sliding-Window Attention
For full details of this model please read our paper and release blog post.
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 mistralai/Mistral-7B-Instruct-v0.2 · Hugging Face
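The three changes above map directly onto fields in the model's `config.json`; a sketch of the relevant entries (not the full file):

```python
# Relevant entries from Mistral-7B-v0.2's config.json
mistral_v02_config = {
    "max_position_embeddings": 32768,  # 32k context window (v0.1 used 8k with sliding window)
    "rope_theta": 1e6,                 # RoPE base frequency raised to 1,000,000
    "sliding_window": None,            # sliding-window attention disabled
}
```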
Masked word completion with BERT
BERT base model (uncased)
Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. This model is uncased: it does not make a difference between english and English.

Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by the Hugging Face team.
https://huggingface.co/google-bert/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France google-bert/bert-base-uncased · Hugging Face
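The URL above encodes a fill-mask query; the same prediction can be reproduced with the `fill-mask` pipeline (a minimal sketch):

```python
from transformers import pipeline

# Load the masked language modeling head of bert-base-uncased
unmasker = pipeline("fill-mask", model="google-bert/bert-base-uncased")

# Ask the model to complete the [MASK] token
predictions = unmasker("Paris is the [MASK] of France.")
for p in predictions[:3]:
    print(f"{p['token_str']}: {p['score']:.3f}")
```

The top prediction here should be "capital".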
Hugging Face Diffusion Models Course
In this free course, you will:

👩‍🎓 Study the theory behind diffusion models
🧨 Learn how to generate images and audio with the popular 🤗 Diffusers library
🏋️‍♂️ Train your own diffusion models from scratch
📻 Fine-tune existing diffusion models on new datasets
🗺 Explore conditional generation and guidance
🧑‍🔬 Create your own custom diffusion model pipelines
Register via the signup form and then join us on Discord to get the conversations started. Instructions on how to join specific categories/channels are here.
https://github.com/huggingface/diffusion-models-class GitHub - huggingface/diffusion-models-class: Materials for the Hugging Face Diffusion Models Course
Text-to-Image
Generates images from input text. These models can be used to generate and modify images based on text prompts.
About Text-to-Image
Use Cases
Data Generation
Businesses can generate data for their use cases by inputting text and getting image outputs.

Immersive Conversational Chatbots
Chatbots can be made more immersive if they provide contextual images based on the input provided by the user.

Creative Ideas for Fashion Industry
Different patterns can be generated to obtain unique pieces of fashion. Text-to-image models make it easier for designers to conceptualize a design before actually implementing it.
https://huggingface.co/tasks/text-to-image What is Text-to-Image? - Hugging Face
PublicPrompts/All-In-One-Pixel-Model

Stable Diffusion model trained using DreamBooth to create pixel art in two styles: sprite art (trigger word "pixelsprite") and scene art (trigger word "16bitscene").

The art is not pixel-perfect, but it can be fixed with pixelating tools like https://pinetools.com/pixelate-effect-image (they also have bulk pixelation).

Some example generations:
https://huggingface.co/PublicPrompts/All-In-One-Pixel-Model
Fine-tuned DistilRoBERTa-base for Emotion Classification
Model Description
DistilRoBERTa-base is a transformer model that performs sentiment analysis. I fine-tuned the model on transcripts from the Friends show with the goal of classifying emotions in text data, specifically dialogue from Netflix shows or movies. The model predicts Ekman's 6 basic emotions plus a neutral class: anger, disgust, fear, joy, sadness, surprise, and neutrality.

The model is a fine-tuned version of Emotion English DistilRoBERTa-base (itself based on DistilRoBERTa-base), which was initially trained on the datasets listed in the Emotion English DistilRoBERTa-base model card:
https://huggingface.co/michellejieli/emotion_text_classifier
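The classifier is a drop-in for the text-classification pipeline (a minimal sketch; the example sentence is illustrative):

```python
from transformers import pipeline

# Load the fine-tuned emotion classifier from the Hub
classifier = pipeline("text-classification", model="michellejieli/emotion_text_classifier")

# The returned label is one of the 7 classes listed above
result = classifier("I can't believe you did that to me!")
print(result)
```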
HeBERT: Pre-trained BERT for Polarity Analysis and Emotion Recognition
HeBERT is a Hebrew pre-trained language model. It is based on Google's BERT architecture and uses the BERT-Base configuration (Devlin et al. 2018).

HeBERT was trained on three datasets:

A Hebrew version of OSCAR (Ortiz, 2019): ~9.8 GB of data, including 1 billion words and over 20.8 million sentences.
A Hebrew dump of Wikipedia: ~650 MB of data, including over 63 million words and 3.8 million sentences
Emotion UGC data collected for the purpose of this study (described below). We evaluated the model on emotion recognition and sentiment analysis downstream tasks.
https://huggingface.co/avichr/heBERT_sentiment_analysis avichr/heBERT_sentiment_analysis · Hugging Face
DistilBERT base uncased finetuned SST-2
Model Description: This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2. It reaches an accuracy of 91.3 on the dev set (for comparison, the BERT bert-base-uncased version reaches an accuracy of 92.7).

Developed by: Hugging Face
Model Type: Text Classification
Language(s): English
License: Apache-2.0
Parent Model: For more details about DistilBERT, we encourage users to check out this model card.
Resources for more information:
Model Documentation
DistilBERT paper
https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english distilbert/distilbert-base-uncased-finetuned-sst-2-english · Hugging Face
facebook/fasttext-language-identification · HF Hub
fastText (Language Identification)
fastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices. It was introduced in this paper. The official website can be found here.

This LID (Language IDentification) model is used to predict the language of the input text. The hosted version (lid218e) was released as part of the NLLB project and can detect 217 languages. You can find older versions (ones that can identify 157 languages) on the official fastText website.
https://huggingface.co/facebook/fasttext-language-identification facebook/fasttext-language-identification · Hugging Face
Rynmurdock/searchsearch
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

Developed by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]
https://huggingface.co/rynmurdock/searchsearch rynmurdock/searchsearch · Hugging Face
Whisper Medusa
Whisper is an advanced encoder-decoder model for speech transcription and translation, processing audio through encoding and decoding stages. Given its large size and slow inference speed, various optimization strategies, such as Faster-Whisper and speculative decoding, have been proposed to enhance performance. Our Medusa model builds on Whisper by predicting multiple tokens per iteration, which significantly improves speed with only a small degradation in WER. We train and evaluate our model on the LibriSpeech dataset, demonstrating speed improvements.
https://huggingface.co/datasets/openslr/librispeech_asr openslr/librispeech_asr · Datasets at Hugging Face
THUDM/CogVideoX-2b
CogVideoX is an open-source video generation model that shares the same origins as Qingying (清影). The table below lists the video generation models we currently offer, along with their basic information.
https://huggingface.co/THUDM/CogVideoX-2b THUDM/CogVideoX-2b · Hugging Face