HF Hub - Share and discover more about AI with social posts from the community.
DistilBERT base uncased finetuned SST-2
Model Description: This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2. It reaches an accuracy of 91.3 on the dev set (for comparison, the bert-base-uncased version of BERT reaches an accuracy of 92.7).
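A minimal sketch of querying this checkpoint with the transformers pipeline API (the model id matches the repository linked at the end of this card):

```python
from transformers import pipeline

# Load the fine-tuned SST-2 checkpoint for sentiment classification.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("This film was a pleasant surprise."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```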

Developed by: Hugging Face
Model Type: Text Classification
Language(s): English
License: Apache-2.0
Parent Model: For more details about DistilBERT, we encourage users to check out this model card.
Resources for more information:
Model Documentation
DistilBERT paper
https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english
facebook/fasttext-language-identification
fastText (Language Identification)
fastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices. It was introduced in this paper. The official website can be found here.

This LID (Language IDentification) model predicts the language of input text. The hosted version (lid218e) was released as part of the NLLB project and can detect 217 languages. Older versions, which identify 157 languages, are available on the official fastText website.
https://huggingface.co/facebook/fasttext-language-identification
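A short sketch of downloading and querying the hosted model with huggingface_hub and the fasttext package; the model.bin filename is an assumption about the repository layout:

```python
import fasttext
from huggingface_hub import hf_hub_download

# Download the LID weights from the Hub (filename assumed to be "model.bin").
model_path = hf_hub_download(
    repo_id="facebook/fasttext-language-identification",
    filename="model.bin",
)
model = fasttext.load_model(model_path)

# predict() returns (labels, probabilities); k controls how many guesses to return.
labels, probs = model.predict("Bonjour tout le monde", k=2)
print(labels, probs)
```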
rynmurdock/searchsearch
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

Developed by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]
https://huggingface.co/rynmurdock/searchsearch
Whisper Medusa
Whisper is an advanced encoder-decoder model for speech transcription and translation, processing audio through encoding and decoding stages. Given its large size and slow inference speed, various optimization strategies such as Faster-Whisper and Speculative Decoding have been proposed to enhance performance. Our Medusa model builds on Whisper by predicting multiple tokens per iteration, which significantly improves speed with only a small degradation in WER. We train and evaluate our model on the LibriSpeech dataset, demonstrating speed improvements.
https://huggingface.co/datasets/openslr/librispeech_asr
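For context, here is a baseline transcription sketch with a standard Whisper checkpoint in transformers; Whisper Medusa adds extra decoding heads on top of this to emit several tokens per step (the snippet shows only the base model, and the audio path is a placeholder):

```python
from transformers import pipeline

# Baseline Whisper inference; Medusa speeds this up by predicting
# multiple tokens per decoder iteration instead of one.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-medium")

# Any local audio file path works here; "sample.flac" is a placeholder.
print(asr("sample.flac")["text"])
```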
THUDM/CogVideoX-2b
CogVideoX is an open-source video generation model that shares the same origins as QingYing (清影). The table below lists the video generation models we currently offer, along with their basic information.
https://huggingface.co/THUDM/CogVideoX-2b
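A minimal text-to-video sketch, assuming the CogVideoXPipeline integration in diffusers (the prompt and step count are arbitrary):

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load the 2B checkpoint in half precision to fit on a consumer GPU.
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)
pipe.to("cuda")

video = pipe(
    prompt="A panda playing guitar in a bamboo forest",
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```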
multimodalart/FLUX.1-merged
https://huggingface.co/spaces/multimodalart/FLUX.1-merged

The Space's requirements file lists:
accelerate
git+https://github.com/huggingface/diffusers.git
torch
transformers==4.42.4
xformers
sentencepiece
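Given those dependencies, here is a minimal sketch of driving a merged FLUX checkpoint with diffusers' FluxPipeline; the repo id and step count below are assumptions rather than details taken from the Space:

```python
import torch
from diffusers import FluxPipeline

# Repo id is an assumption; substitute the checkpoint the Space actually serves.
pipe = FluxPipeline.from_pretrained("multimodalart/FLUX.1-merged", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # trade speed for lower VRAM usage

# Merged dev/schnell weights are meant to work with few steps; 8 is illustrative.
image = pipe("a photo of a corgi wearing sunglasses", num_inference_steps=8).images[0]
image.save("corgi.png")
```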
Model Cards
Introduction
Model cards are an important documentation framework for understanding, sharing, and improving machine learning models. When done well, a model card can serve as a boundary object, a single artefact that is accessible to people with different backgrounds and goals in understanding models - including developers, students, policymakers, ethicists, and those impacted by machine learning models.

Today, we launch a model card creation tool and a model card Guide Book, which details how to fill out model cards and surveys user studies and the state of the art in ML documentation. This work, which builds on that of many other people and organizations, focuses on the inclusion of people with different backgrounds and roles. We hope it serves as a stepping stone on the path toward improved ML documentation.

In sum, today we announce the release of:

A Model Card Creator Tool, to ease card creation without needing to program, and to help teams share the work of different sections.

An updated model card template, released in the huggingface_hub library, drawing together model card work in academia and throughout the industry (see the sketch after this list).

An Annotated Model Card Template, which details how to fill the card out.

A User Study on model card usage at Hugging Face.
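As a sketch of how the updated template can be used programmatically, the huggingface_hub library exposes a ModelCard class that renders a card from structured metadata (the field values below are placeholders):

```python
from huggingface_hub import ModelCard, ModelCardData

# Structured metadata that becomes the card's YAML header.
card_data = ModelCardData(
    language="en",
    license="apache-2.0",
    library_name="transformers",
)

# Render the default template; keyword arguments fill named slots such as model_id.
card = ModelCard.from_template(card_data, model_id="my-org/my-model")
print(card.content[:300])

# card.push_to_hub("my-org/my-model")  # publish the card alongside the model
```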
MTEB: Massive Text Embedding Benchmark
MTEB is a massive benchmark for measuring the performance of text embedding models on diverse embedding tasks.

The 🥇 leaderboard provides a holistic view of the best text embedding models out there on a variety of tasks.

The 📝 paper gives background on the tasks and datasets in MTEB and analyzes leaderboard results!

The 💻 GitHub repo contains the code for benchmarking and submitting any model of your choice to the leaderboard.
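A minimal sketch of benchmarking an embedding model with the mteb package, mirroring the repository's quickstart (the task and model chosen here are arbitrary):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Any model exposing an encode() method works; SentenceTransformer is the common choice.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Evaluate on a single classification task; pass more task names to broaden coverage.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results")
```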
TGI Multi-LoRA: Deploy Once, Serve 30 models
Are you tired of the complexity and expense of managing multiple AI models? What if you could deploy once and serve 30 models? In today's ML world, organizations looking to leverage the value of their data will likely end up in a fine-tuned world, building a multitude of models, each one highly specialized for a specific task. But how can you keep up with the hassle and cost of deploying a model for each use case? The answer is Multi-LoRA serving.

Motivation
As an organization, building a multitude of models via fine-tuning makes sense for multiple reasons.

Performance - There is compelling evidence that smaller, specialized models outperform their larger, general-purpose counterparts on the tasks that they were trained on. Predibase [5] showed that you can get better performance than GPT-4 using task-specific LoRAs with a base like mistralai/Mistral-7B-v0.1.

Adaptability - Models like Mistral or Llama are extremely versatile. You can pick one of them as your base model and build many specialized models on top, even when the downstream tasks are very different. Also, note that you aren't locked in: you can easily swap the base model out and fine-tune another one on your data (more on this later).

Independence - For each task that your organization cares about, different teams can work on different fine-tunes, allowing for independence in data preparation, configuration, evaluation criteria, and cadence of model updates.

Privacy - Specialized models offer flexibility with training data segregation and access restrictions to different users based on data privacy requirements. Additionally, in cases where running models locally is important, a small model can be made highly capable for a specific task while keeping its size small enough to run on device.
https://github.com/huggingface/blog/blob/main/multi-lora-serving.md
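To make this concrete, here is a client-side sketch of multi-LoRA serving, assuming a TGI server was launched with several adapters preloaded (e.g. via the LORA_ADAPTERS environment variable at deploy time); the endpoint and payload shape follow TGI's generate API, but the specific adapter name is illustrative:

```python
import requests

# One deployed base model; each request routes to a LoRA via adapter_id.
response = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Classify this support ticket: my order never arrived.",
        "parameters": {
            "adapter_id": "predibase/customer_support",  # illustrative adapter name
            "max_new_tokens": 64,
        },
    },
)
print(response.json()["generated_text"])
```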
Total noob’s intro to Hugging Face Transformers
Welcome to "A Total Noob’s Introduction to Hugging Face Transformers," a guide designed specifically for those looking to understand the bare basics of using open-source ML. Our goal is to demystify what Hugging Face Transformers is and how it works, not to turn you into a machine learning practitioner, but to enable better understanding of and collaboration with those who are. That being said, the best way to learn is by doing, so we'll walk through a simple worked example of running Microsoft’s Phi-2 LLM in a notebook on a Hugging Face space.

You might wonder, with the abundance of tutorials on Hugging Face already available, why create another? The answer lies in accessibility: most existing resources assume some technical background, including Python proficiency, which can prevent non-technical individuals from grasping ML fundamentals. As someone who came from the business side of AI, I recognize that the learning curve presents a barrier and wanted to offer a more approachable path for like-minded learners.

Therefore, this guide is tailored for a non-technical audience keen to better understand open-source machine learning without having to learn Python from scratch. We assume no prior knowledge and will explain concepts from the ground up to ensure clarity. If you're an engineer, you’ll find this guide a bit basic, but for beginners, it's an ideal starting point.

Let’s get stuck in… but first, some context.
https://github.com/huggingface/blog/blob/main/noob_intro_transformers.md
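To preview where the worked example lands, running Phi-2 in a notebook boils down to a few lines with the transformers pipeline (the generation settings here are arbitrary):

```python
import torch
from transformers import pipeline

# Phi-2 is small enough (~2.7B parameters) to run in a modest notebook with float16.
generator = pipeline(
    "text-generation",
    model="microsoft/phi-2",
    torch_dtype=torch.float16,
    device_map="auto",
)

out = generator("Explain what a language model is, in one sentence:", max_new_tokens=60)
print(out[0]["generated_text"])
```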
Jupyter X Hugging Face
We’re excited to announce improved support for Jupyter notebooks hosted on the Hugging Face Hub!

From serving as an essential learning resource to being a central tool for model development, Jupyter notebooks have become a key component across many areas of machine learning. Their interactive and visual nature lets you get feedback quickly as you develop models, datasets, and demos. For many, their first exposure to training machine learning models is via a Jupyter notebook, and many practitioners use notebooks as a critical tool for developing and communicating their work.

Hugging Face is a collaborative Machine Learning platform in which the community has shared over 150,000 models, 25,000 datasets, and 30,000 ML apps. The Hub has model and dataset versioning tools, including model cards and client-side libraries to automate the versioning process. However, only including a model card with hyperparameters is not enough to provide the best reproducibility; this is where notebooks can help. Alongside these models, datasets, and demos, the Hub hosts over 7,000 notebooks. These notebooks often document the development process of a model or a dataset and can provide guidance and tutorials showing how others can use these resources. We’re therefore excited about our improved support for notebook hosting on the Hub.
https://github.com/huggingface/blog/blob/main/notebooks-hub.md
Introducing NPC-Playground, a 3D playground to interact with LLM-powered NPCs

AI-powered NPCs (Non-Playable Characters) are one of the most important breakthroughs brought about by the use of LLMs in games.

LLMs, or Large Language Models, make it possible to design "intelligent" in-game characters that can engage in realistic conversations with the player, perform complex actions, and follow instructions, dramatically enhancing the player's experience. AI-powered NPCs represent a huge advancement over rule-based and heuristic systems.

Today, we are excited to introduce NPC-Playground, a demo created by Cubzh and Gigax where you can interact with LLM-powered NPCs and see for yourself what the future holds!

You can play with the demo directly in your browser 👉 here

In this 3D demo, you can interact with the NPCs and teach them new skills with just a few lines of Lua scripting!
Nyströmformer: Approximating self-attention in linear time and memory via the Nyström method
Introduction
Transformers have exhibited remarkable performance on various Natural Language Processing and Computer Vision tasks. Their success can be attributed to the self-attention mechanism, which captures the pairwise interactions between all the tokens in an input. However, the standard self-attention mechanism has a time and memory complexity of \(O(n^2)\) (where \(n\) is the length of the input sequence), making it expensive to train on long input sequences.

The Nyströmformer is one of many efficient Transformer models that approximate standard self-attention with \(O(n)\) complexity. Nyströmformer exhibits competitive performance on various downstream NLP and CV tasks while improving upon the efficiency of standard self-attention. The aim of this blog post is to give readers an overview of the Nyström method and how it can be adapted to approximate self-attention.
https://github.com/huggingface/blog/blob/main/nystromformer.md
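Concretely, the Nyström approximation replaces the full \(n \times n\) softmax attention matrix with a product of three smaller matrices built from \(m \ll n\) landmark queries \(\tilde{Q}\) and keys \(\tilde{K}\) (a sketch of the paper's core formula):

$$
\mathrm{softmax}\left(\frac{QK^T}{\sqrt{d}}\right) \approx \mathrm{softmax}\left(\frac{Q\tilde{K}^T}{\sqrt{d}}\right) \left[\mathrm{softmax}\left(\frac{\tilde{Q}\tilde{K}^T}{\sqrt{d}}\right)\right]^{+} \mathrm{softmax}\left(\frac{\tilde{Q}K^T}{\sqrt{d}}\right)
$$

where \(+\) denotes the Moore-Penrose pseudoinverse; each factor costs \(O(nm)\) rather than \(O(n^2)\).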
Open LLM Leaderboard: DROP deep dive
Recently, three new benchmarks were added to the Open LLM Leaderboard: Winogrande, GSM8k, and DROP, using the original implementations reproduced in the EleutherAI Harness. A cursory look at the scores for DROP revealed that something strange was going on, with the overwhelming majority of models scoring less than 10 out of 100 on their f1-score! We did a deep dive to understand what was happening; come with us to see what we found out!

Initial observations
DROP (Discrete Reasoning Over Paragraphs) is an evaluation where models must extract relevant information from English-text paragraphs before executing discrete reasoning steps on them (for example, sorting or counting items to arrive at the correct answer, see the table below for examples). The metrics used are custom f1 and exact match scores.
What's going on with the Open LLM Leaderboard?
Recently, an interesting discussion arose on Twitter following the release of Falcon 🦅 and its addition to the Open LLM Leaderboard, a public leaderboard comparing open-access large language models.

The discussion centered around one of the four evaluations displayed on the leaderboard: a benchmark for measuring Massive Multitask Language Understanding (shortname: MMLU).

The community was surprised that the MMLU evaluation numbers of the current top model on the leaderboard, the LLaMA model 🦙, were significantly lower than the numbers reported in the published LLaMA paper.

So we decided to dive into a rabbit hole to understand what was going on and how to fix it 🕳🐇

In our quest, we discussed with both the great @javier-m, who collaborated on the evaluations of LLaMA, and the amazing @slippylolo from the Falcon team. That being said, all the errors below should of course be attributed to us rather than them!

Along this journey with us, you’ll learn a lot about the different ways a model can be scored on a single evaluation, and whether or not to believe the numbers you see online and in papers.

Ready? Then buckle up, we’re taking off 🚀.
Can foundation models label data like humans?
Since the advent of ChatGPT, we have seen unprecedented growth in the development of Large Language Models (LLMs), and particularly chatty models that are fine-tuned to follow instructions given in the form of prompts. However, how these models compare is unclear due to the lack of benchmarks designed to test their performance rigorously. Evaluating instruction-following and chatty models is intrinsically difficult because a large part of user preference centers on qualitative style, whereas past NLP evaluation was far more clearly defined.

Along these lines, it’s a common story that a new large language model (LLM) is released to the tune of “our model is preferred to ChatGPT N% of the time,” and what is omitted from that sentence is that the model is preferred in some type of GPT-4-based evaluation scheme. What such claims are really trying to show is a proxy for a different measurement: scores provided by human labelers.
Accelerate your models with 🤗 Optimum Intel and OpenVINO

Last July, we announced that Intel and Hugging Face would collaborate on building state-of-the-art yet simple hardware acceleration tools for Transformer models. Today, we are very happy to announce that we have added Intel OpenVINO to Optimum Intel. You can now easily perform inference with OpenVINO Runtime on a variety of Intel processors (see the full list of supported devices) using Transformers models hosted either on the Hugging Face hub or locally. You can also quantize your model with the OpenVINO Neural Network Compression Framework (NNCF) and reduce its size and prediction latency in a matter of minutes.
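A minimal sketch of swapping a transformers checkpoint for its OpenVINO counterpart with Optimum Intel (the model id here is illustrative):

```python
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The OVModel drops into the familiar pipeline API.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("OpenVINO makes inference on Intel hardware fast."))
```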
Opinion Classification with Kili and HuggingFace AutoTrain
Introduction
Understanding your users’ needs is crucial in any user-related business. But it also requires a lot of hard work and analysis, which is quite expensive. So why not leverage Machine Learning, with far less coding, by using AutoML?

In this article, we will leverage HuggingFace AutoTrain and Kili to build an active learning pipeline for text classification. Kili is a platform that empowers a data-centric approach to Machine Learning through quality training data creation. It provides collaborative data annotation tools and APIs that enable quick iterations between reliable dataset building and model training. Active learning is a process in which you iteratively add labeled data to the dataset and retrain the model; the loop is open-ended and requires humans to keep labeling data.

As a concrete example use case for this article, we will build our pipeline using user reviews of Medium from the Google Play Store. After that, we will categorize the reviews with the pipeline we built. Finally, we will apply sentiment analysis to the classified reviews; once we analyze the results, understanding users’ needs and satisfaction will be much easier.
https://github.com/huggingface/blog/blob/main/opinion-classification-with-kili.md