HF-hub - Share and discover more about AI with social posts from the community.huggingface/OpenAi
Text-to-3D models take in text input and produce 3D output.
This task is similar to the image-to-3d task, but takes text input instead of image input. In practice, this is often equivalent to a combination of text-to-image and image-to-3d. That is, the text is first converted to an image, then the image is converted to 3D.

Generating Meshes
Meshes are the standard representation of 3D in industry.
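As a rough illustration of what a mesh holds, the sketch below stores a tetrahedron as a list of vertices plus a list of triangular faces that index into it. The `Mesh` class and its `edges` method are names invented for this example and do not come from any Hugging Face library.

```python
from dataclasses import dataclass


@dataclass
class Mesh:
    """A triangle mesh: 3D points plus faces that index into them."""
    vertices: list  # list of (x, y, z) tuples
    faces: list     # list of (i, j, k) vertex-index tuples

    def edges(self):
        """Collect each undirected edge once, as a sorted index pair."""
        found = set()
        for a, b, c in self.faces:
            for u, v in ((a, b), (b, c), (c, a)):
                found.add((min(u, v), max(u, v)))
        return found


# A tetrahedron: 4 vertices and 4 triangular faces.
tetra = Mesh(
    vertices=[(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)],
    faces=[(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)],
)
print(len(tetra.vertices), len(tetra.edges()), len(tetra.faces))
```

Faces share vertices by index rather than repeating coordinates, which is one reason this layout is common in 3D tooling.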

Generating Gaussian Splats
Gaussian Splatting is a rendering technique that represents scenes as fuzzy points.
https://huggingface.co/tasks/text-to-3d
Zero-shot image classification is the task of classifying images into classes that the model did not see during training.
About the Task
Zero-shot image classification is a computer vision task to classify images into one of several classes, without any prior training or knowledge of the classes.

Zero-shot image classification works by transferring knowledge learnt during the training of one model to classify novel classes that were not present in the training data. It is therefore a variation of transfer learning. For instance, a model trained to differentiate cars from airplanes can be used to classify images of ships.

The data in this learning paradigm consists of

Seen data - images and their corresponding labels
Unseen data - only labels and no images
Auxiliary information - additional information given to the model during training connecting the unseen and seen data. This can be in the form of textual description or word embeddings.
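One way to picture how auxiliary information connects seen and unseen classes is a shared embedding space: the image and every label are turned into vectors, and the closest label wins. The sketch below assumes such vectors already exist; the three-dimensional embeddings are hand-made toy values, and `zero_shot_classify` is a hypothetical helper rather than a library function.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose embedding is closest to the image embedding."""
    return max(
        label_embeddings,
        key=lambda name: cosine(image_embedding, label_embeddings[name]),
    )


# Toy embeddings, invented for illustration only.
# Axes loosely mean: (has wheels, has wings, floats on water).
label_embeddings = {
    "car": [1.0, 0.0, 0.0],
    "airplane": [0.6, 1.0, 0.0],
    "ship": [0.0, 0.0, 1.0],  # unseen class: known only through its embedding
}
image_embedding = [0.1, 0.0, 0.9]  # pretend encoder output for a photo of a boat

print(zero_shot_classify(image_embedding, label_embeddings))
```

No image of a ship is needed to add the "ship" class here; only its label embedding is.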
Unconditional image generation is the task of generating images without any conditioning input, such as a text prompt or another image. Once trained, the model creates images that resemble its training data distribution.
About Unconditional Image Generation
About the Task
Unconditional image generation is the task of generating new images without any specific input. The main goal of this is to create novel, original images that are not based on existing images. This can be used for a variety of applications, such as creating new artistic images, improving image recognition algorithms, or generating photorealistic images for virtual reality environments.

Unconditional image generation models usually start with a seed that generates a random noise vector. The model will then use this vector to create an output image similar to the images used for training the model.
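The seed-to-noise-to-image flow can be sketched with the standard library alone. In the example below, `toy_generator` is a stand-in invented for illustration: a real model would be a trained network, whereas this fixed rule only demonstrates the interface of noise in, image out.

```python
import random


def noise_vector(seed, size=8):
    """Draw a reproducible Gaussian noise vector from an integer seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def toy_generator(z, width=4, height=4):
    """Map a noise vector to a grayscale pixel grid (not a learned model)."""
    pixels = []
    for row in range(height):
        line = []
        for col in range(width):
            value = z[(row + col) % len(z)] * 64 + 128  # scale noise around mid-gray
            line.append(max(0, min(255, int(value))))   # clamp to the 0..255 range
        pixels.append(line)
    return pixels


image_a = toy_generator(noise_vector(seed=42))
image_b = toy_generator(noise_vector(seed=42))
print(image_a == image_b)  # compares two images generated from the same seed
```

Fixing the seed makes the output reproducible, while changing it yields a different noise vector and therefore a different image.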

https://huggingface.co/tasks/unconditional-image-generation
Text-to-video models can be used in any application that requires generating a consistent sequence of images from text.

Use Cases
Script-based Video Generation
Text-to-video models can be used to create short-form video content from a provided text script. These models can be used to create engaging and informative marketing videos. For example, a company could use a text-to-video model to create a video that explains how their product works.

Content format conversion
Text-to-video models can be used to generate videos from long-form text, including blog posts, articles, and text files. Text-to-video models can be used to create educational videos that are more engaging and interactive. An example of this is creating a video that explains a complex concept from an article.

Voice-overs and Speech
Text-to-video models can be used to create an AI newscaster to deliver daily news, or for a film-maker to create a short film or a music video.
https://huggingface.co/tasks/text-to-video
Text-to-Image
Generates images from input text. These models can be used to generate and modify images based on text prompts.
Use Cases
Data Generation
Businesses can generate data for their use cases by inputting text and getting image outputs.

Immersive Conversational Chatbots
Chatbots can be made more immersive if they provide contextual images based on the input provided by the user.

Creative Ideas for Fashion Industry
Different patterns can be generated to obtain unique pieces of fashion. Text-to-image models make it easier for designers to conceptualize a design before actually producing it.
https://huggingface.co/tasks/text-to-image
Video classification is the task of assigning a label or class to an entire video. Videos are expected to have only one class for each video. Video classification models take a video as input and return a prediction about which class the video belongs to.

https://huggingface.co/tasks/video-classification
Mask generation is the task of generating masks that identify a specific object or region of interest in a given image. Masks are often used in segmentation tasks, where they provide a precise way to isolate the object of interest for further processing or analysis.

About Mask Generation
Use Cases
Filtering an Image
When filtering an image, the generated masks can serve as an initial filter to eliminate irrelevant information. For instance, when monitoring vegetation in satellite imaging, mask generation models identify green spots, highlighting the relevant region of the image.
Image-to-text models output text from a given image. Image captioning and optical character recognition are the most common applications of image-to-text.
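To make the filtering step concrete, the sketch below builds a binary mask and uses it to blank out irrelevant pixels. The `green_mask` rule is a hand-written color threshold standing in for a learned mask generation model, and the tiny image is made-up data.

```python
def green_mask(image):
    """Return a binary mask: 1 where green is the dominant channel, else 0."""
    return [[1 if g > r and g > b else 0 for (r, g, b) in row] for row in image]


def apply_mask(image, mask, fill=(0, 0, 0)):
    """Keep pixels where the mask is 1 and replace the rest with `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A 2x3 toy "satellite" image as rows of (R, G, B) pixels.
image = [
    [(30, 120, 40), (200, 200, 210), (25, 110, 35)],
    [(90, 80, 70), (20, 130, 30), (180, 170, 160)],
]
mask = green_mask(image)
filtered = apply_mask(image, mask)
print(mask)
```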

About Image-to-Text
Use Cases
Image Captioning
Image captioning is the process of generating a textual description of an image. This can help visually impaired people understand what is happening in their surroundings.

Optical Character Recognition (OCR)
OCR models convert the text present in an image, e.g. a scanned document, to text.
https://huggingface.co/tasks/image-to-text
Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain. Any image manipulation and enhancement is possible with image to image models.
About Image-to-Image
Use Cases
Style transfer
One of the most popular use cases of image-to-image is style transfer. Style transfer models can convert an ordinary photograph into a painting in the style of a famous painter.
https://huggingface.co/tasks/image-to-image
Image feature extraction is the task of extracting features learnt in a computer vision model.
About Image Feature Extraction
Use Cases
Transfer Learning
Models trained on a specific dataset learn features of that data. For instance, a model trained on a car classification dataset learns general features such as edges and curves as well as more specific features that are particular to cars. This information can be transferred to a new model that is going to be trained to classify trucks. This process of extracting features and transferring them to another model is called transfer learning.
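A minimal sketch of the idea, assuming a frozen feature extractor and a small new classifier trained on top of it. Here `extract_features` is a toy stand-in for a pretrained backbone, the nearest-centroid head is one simple choice among many, and the "truck" data is invented.

```python
def extract_features(image):
    """Stand-in for a frozen, pretrained backbone.

    An "image" is a flat list of pixel intensities and the features are its
    mean and its range; a real backbone would be a trained network.
    """
    return (sum(image) / len(image), max(image) - min(image))


def fit_centroids(images, labels):
    """Train only the new head: one mean feature vector per class."""
    sums, counts = {}, {}
    for image, label in zip(images, labels):
        f = extract_features(image)  # the backbone is reused, not retrained
        s = sums.setdefault(label, [0.0, 0.0])
        s[0] += f[0]
        s[1] += f[1]
        counts[label] = counts.get(label, 0) + 1
    return {k: (s[0] / counts[k], s[1] / counts[k]) for k, s in sums.items()}


def predict(image, centroids):
    """Assign the class whose centroid is nearest in feature space."""
    f = extract_features(image)
    return min(
        centroids,
        key=lambda c: (f[0] - centroids[c][0]) ** 2 + (f[1] - centroids[c][1]) ** 2,
    )


# Tiny invented dataset for the new task.
train_images = [[10, 12, 11, 13], [9, 11, 10, 12], [200, 40, 220, 30], [210, 35, 190, 45]]
train_labels = ["pickup", "pickup", "dump truck", "dump truck"]

centroids = fit_centroids(train_images, train_labels)
print(predict([11, 12, 10, 13], centroids))
```

Only `fit_centroids` touches the new labels; the feature extractor stays fixed, which is what lets a small dataset suffice.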
https://huggingface.co/tasks/image-feature-extraction
Image classification is the task of assigning a label or class to an entire image. Images are expected to have only one class for each image. Image classification models take an image as input and return a prediction about which class the image belongs to.
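The final step of a classifier, turning raw scores into a single predicted class, can be sketched as follows. The logits and class names are hypothetical; a real model would produce the scores from the image.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, class_names):
    """Return the single most likely class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits, as if produced by a classifier for one image.
label, confidence = classify([2.0, 0.5, -1.0], ["cat", "dog", "bird"])
print(label, round(confidence, 3))
```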
Use Cases
Image classification models can be used when we are not interested in specific instances of objects with location information or their shape.

Keyword Classification
Image classification models are used widely in stock photography to assign each image a keyword.

Image Search
Models trained in image classification can improve user experience by organizing and categorizing photo galleries on the phone or in the cloud, on multiple keywords or tags.
About Image Classification
https://youtu.be/tjAIM7BOYhw
https://huggingface.co/tasks/image-classification
Depth estimation is the task of predicting depth of the objects present in an image.

About Depth Estimation
Use Cases
Depth estimation models can be used to estimate the depth of different objects present in an image.

Estimation of Volumetric Information
Depth estimation models are widely used to study volumetric formation of objects present inside an image. This is an important use case in the domain of computer graphics.

3D Representation
Depth estimation models can also be used to develop a 3D representation from a 2D image.
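One common way to go from a depth map to 3D is to back-project each pixel through a pinhole camera model. The sketch below assumes that model and known camera intrinsics; the parameter names `fx`, `fy`, `cx`, `cy` and the sample values are illustrative assumptions, not part of any specific depth estimation model.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into 3D points with a pinhole camera model.

    depth[v][u] is the distance along the camera's z axis for pixel (u, v).
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels with no valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A 2x2 depth map in metres and made-up camera intrinsics.
depth = [[2.0, 2.0], [4.0, 0.0]]
points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points)
```

The resulting list is a point cloud; pixels with zero depth are treated as invalid and dropped.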
https://huggingface.co/tasks/depth-estimation
Zero-shot text classification is a task in natural language processing where a model is trained on a set of labeled examples but is then able to classify new examples from previously unseen classes.
About Zero-Shot Classification
About the Task
Zero-shot classification is the task of predicting a class that wasn't seen by the model during training. This method, which leverages a pre-trained language model, can be thought of as an instance of transfer learning, which generally refers to using a model trained for one task in a different application from the one it was originally trained for. This is particularly useful in situations where the amount of labeled data is small.

In zero shot classification, we provide the model with a prompt and a sequence of text that describes what we want our model to do, in natural language. Zero-shot classification excludes any examples of the desired task being completed. This differs from single or few-shot classification, as these tasks include a single or a few examples of the selected task.
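The difference between zero-shot, single-shot and few-shot prompts comes down to how many worked examples the prompt contains. The helper below is a hypothetical sketch of that idea; the prompt wording is an assumption and not a format required by any particular model.

```python
def build_prompt(text, labels, examples=()):
    """Build a classification prompt; `examples` holds (text, label) pairs.

    With no examples this is a zero-shot prompt, with one it is single-shot,
    and with several it is few-shot.
    """
    lines = ["Classify the text into one of: " + ", ".join(labels) + "."]
    for example_text, example_label in examples:
        lines.append("Text: " + example_text)
        lines.append("Label: " + example_label)
    lines.append("Text: " + text)
    lines.append("Label:")  # left open for the model to complete
    return "\n".join(lines)


labels = ["positive", "negative"]
zero_shot = build_prompt("The battery died after a day.", labels)
few_shot = build_prompt(
    "The battery died after a day.",
    labels,
    examples=[("Great screen!", "positive"), ("Arrived broken.", "negative")],
)
print(zero_shot)
```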

Zero, single and few-shot classification seem to be an emergent feature of large language models. This feature seems to appear at model sizes above roughly 100M parameters. The effectiveness of a model at a zero, single or few-shot task seems to scale with model size, meaning that larger models (models with more trainable parameters or layers) generally do better at this task.
https://huggingface.co/tasks/zero-shot-classification
Translation is the task of converting text from one language to another.

About Translation

https://youtu.be/1JvfrvZgi6c

Use Cases
You can find over a thousand Translation models on the Hub, but sometimes you might not find a model for the language pair you are interested in. When this happens, you can use a pretrained multilingual Translation model like mBART and further train it on your own data in a process called fine-tuning.
https://huggingface.co/tasks/translation
Token Classification

Token classification is a natural language understanding task in which a label is assigned to some tokens in a text. Some popular token classification subtasks are Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. NER models could be trained to identify specific entities in a text, such as dates, individuals and places; and PoS tagging would identify, for example, which words in a text are verbs, nouns, and punctuation marks.
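Token classifiers emit one label per token, so a post-processing step is usually needed to merge tokens into whole entities. The sketch below assumes the widely used BIO tagging scheme, which the text above does not specify; the tags are hypothetical model outputs and `group_entities` is a helper written for this example.

```python
def group_entities(tokens, tags):
    """Merge BIO-tagged tokens into (entity_text, entity_type) spans.

    "B-X" starts an entity of type X, "I-X" continues it, "O" is outside.
    """
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current_type):
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-"):
            current.append(token)
        else:  # "O": close any open entity
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities


tokens = ["Ada", "Lovelace", "visited", "London", "in", "1842", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-DATE", "O"]
print(group_entities(tokens, tags))
```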
https://huggingface.co/tasks/token-classification
Text-Generation
Generating text is the task of generating new text given another text. These models can, for example, fill in incomplete text or paraphrase.
Text Classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness.

About Text Classification
https://youtu.be/leNG9fN9FQU

Use Cases
Sentiment Analysis on Customer Reviews
You can track the sentiments of your customers from product reviews using sentiment analysis models. Grouping reviews by sentiment can help you understand churn and retention, so that you can later analyze the text and make strategic decisions based on this knowledge.
https://huggingface.co/tasks/text-classification
Table Question Answering (Table QA) is the task of answering a question about information in a given table.
About Table Question Answering
Use Cases
SQL execution
You can use the Table Question Answering models to simulate SQL execution by inputting a table.
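For reference, this is what real SQL execution over a small table looks like, using Python's built-in `sqlite3` module. It is not a Table QA model: it shows the kind of result such a model would be expected to reproduce from the table and a natural-language question. The table name, columns and rows are invented for illustration.

```python
import sqlite3

# An in-memory table; names and values are made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (name TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO repos VALUES (?, ?)",
    [("transformers", 36542), ("datasets", 4512), ("tokenizers", 3934)],
)

# "How many stars does the transformers repository have?" written as SQL.
row = conn.execute(
    "SELECT stars FROM repos WHERE name = ?", ("transformers",)
).fetchone()

# "How many stars do all repositories have in total?" written as SQL.
total = conn.execute("SELECT SUM(stars) FROM repos").fetchone()[0]

print(row[0], total)
conn.close()
```

A Table QA model receives the table and the question in plain language and has to arrive at the same cell lookup or aggregation without an explicit query.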

Table Question Answering
Table Question Answering models are capable of answering questions based on a table.

Inference
You can infer with TableQA models using the 🤗 Transformers library.

https://huggingface.co/tasks/table-question-answering
Summarization is the task of producing a shorter version of a document while preserving its important information. Some models can extract text from the original input, while other models can generate entirely new text.
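The extractive approach can be illustrated with a simple frequency heuristic: score each sentence by how often its words occur in the document and keep the top ones verbatim. This is a toy sketch under that assumption, not how the models on the Hub work, and it tends to favor longer sentences.

```python
import re
from collections import Counter


def extractive_summary(text, max_sentences=1):
    """Pick the highest-scoring sentences, keeping their original order.

    A sentence's score is the summed document frequency of its words.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    scores = [
        sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())) for s in sentences
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


document = (
    "Transformers are neural networks built on attention. "
    "Attention lets transformers weigh every token against every other token. "
    "They were introduced in 2017."
)
print(extractive_summary(document))
```

Every sentence in the output is copied from the input, which is the defining property of extractive summarization; an abstractive model would instead generate new wording.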

About Summarization
https://youtu.be/yHnr5Dk2zCI

Use Cases
Research Paper Summarization 🧐
Research papers can be summarized to allow researchers to spend less time selecting which articles to read. There are several approaches you can take for a task like this:

Use an existing extractive summarization model on the Hub to do inference.
Pick an existing language model trained for academic papers. This model can then be trained in a process called fine-tuning so it can solve the summarization task.
Use a sequence-to-sequence model like T5 for abstractive text summarization.
https://huggingface.co/tasks/summarization