Very Large Language Models and How to Evaluate Them
Large language models can now be evaluated on zero-shot classification tasks with Evaluation on the Hub!

Zero-shot evaluation is a popular way for researchers to measure the performance of large language models, since such models have been shown to acquire capabilities during training without being explicitly shown labeled examples. The Inverse Scaling Prize is an example of a recent community effort to conduct large-scale zero-shot evaluation across model sizes and families, in order to discover tasks on which larger models may perform worse than their smaller counterparts.
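As a concrete illustration, here is a minimal sketch of how zero-shot classification with a causal LM is typically scored: each candidate label is ranked by the log-likelihood the model assigns to it as a continuation of the prompt. The model and labels below are illustrative assumptions, not the exact Evaluation on the Hub setup.

```python
# Minimal zero-shot classification sketch: rank candidate labels by the
# log-likelihood a causal LM assigns to them as continuations of the prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM on the Hub works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

prompt = "Review: 'A waste of two hours.'\nSentiment:"
labels = [" positive", " negative"]

def label_logprob(prompt: str, label: str) -> float:
    """Sum of log-probabilities of the label tokens, given the prompt."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + label, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = model(full_ids).logits.log_softmax(dim=-1)
    # The token at position p is predicted by the logits at position p - 1.
    return sum(
        log_probs[0, p - 1, full_ids[0, p]].item()
        for p in range(prompt_len, full_ids.shape[1])
    )

print(max(labels, key=lambda l: label_logprob(prompt, l)))
```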

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?
While developing Docmatix, we noticed that fine-tuning Florence-2 on it yielded strong qualitative performance on DocVQA, yet low scores on the benchmark's metric. To raise the scores, we had to fine-tune the model further on DocVQA so it learned the answer syntax the benchmark expects. Interestingly, human evaluators judged this additionally fine-tuned model to perform worse, which is why we primarily used it for ablation studies and released the model trained only on Docmatix for broader use.

Although the generated answers semantically align with the reference answers, as illustrated in Figure 1, they still receive low scores. This raises the question: should we fine-tune models to improve these metrics, or should we develop new metrics that better align with human perception?
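LAVE-style evaluation, referenced in the title above, takes the second route: it replaces string matching with an LLM judge. Below is a hedged sketch of the idea, assuming an instruction-tuned judge model; the exact LAVE prompt and rating scale may differ.

```python
# LLM-as-judge VQA scoring sketch: ask an instruction-tuned model to rate a
# candidate answer against the references, then map the rating to [0, 1].
from transformers import pipeline

# Assumption: any instruction-tuned chat model can play the judge role.
judge = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct")

def lave_style_score(question: str, references: list[str], candidate: str) -> float:
    prompt = (
        "Given a question and reference answers, rate the candidate answer as "
        "1 (incorrect), 2 (ambiguous), or 3 (correct). Reply with the number only.\n"
        f"Question: {question}\nReferences: {'; '.join(references)}\n"
        f"Candidate: {candidate}\nRating:"
    )
    out = judge(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
    rating = int(out[len(prompt):].strip()[0])  # naive parse; a sketch only
    return (rating - 1) / 2  # map 1..3 onto 0..1

print(lave_style_score("What is written on the sign?", ["stop"], "a stop sign"))
```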
KoMT-Bench is a benchmark designed to evaluate the capability of language models to follow instructions in Korean. It is an in-house dataset created by translating the MT-Bench [1] dataset into Korean and modifying some questions to reflect the characteristics and cultural nuances of the Korean language. After the initial translation and modification, we asked expert linguists to conduct a thorough review of our benchmark dataset.

To conduct evaluations on KoMT-Bench, please visit the official KoMT-Bench GitHub repository in which the evaluation scripts are provided.
WikiRAG-TR is a dataset of 6K (5,999) question-answer pairs synthetically created from the introduction sections of Turkish Wikipedia articles. The dataset is intended for Turkish Retrieval-Augmented Generation (RAG) tasks.

Dataset Information
Number of Instances: 5999 (5725 synthetically generated question-answer pairs, 274 augmented negative samples)
Dataset Size: 20.5 MB
Language: Turkish
Dataset License: apache-2.0
Dataset Category: Text2Text Generation
Dataset Domain: STEM and Social Sciences
WikiRAG-TR Pipeline
The creation of the dataset was accomplished in two main phases, each represented by a separate diagram.
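A minimal loading sketch follows; the owning organization in the repo id is a placeholder, so check the dataset page on the Hub for the exact id and column names.

```python
from datasets import load_dataset

# "<owner>/WikiRAG-TR" is a placeholder repo id, not a confirmed one.
ds = load_dataset("<owner>/WikiRAG-TR", split="train")
print(len(ds))  # expected: 5999 instances
print(ds[0])    # a synthetic question-answer pair with its Wikipedia context
```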
Dataset Card for MedTrinity-25M
MedTrinity-25M is a comprehensive, large-scale multimodal dataset for medicine, covering over 25 million images across 10 modalities, with multigranular annotations for more than 65 diseases. These enriched annotations encompass both global textual information, such as disease/lesion type, modality, region-specific descriptions, and inter-regional relationships, and detailed local annotations for regions of interest (ROIs), including bounding boxes and segmentation masks. Compared to existing datasets, MedTrinity-25M provides the most enriched annotations, supporting a comprehensive range of multimodal tasks such as captioning and report generation, as well as vision-centric tasks like classification and segmentation. The dataset can support large-scale pre-training of multimodal medical AI models, contributing to the development of future foundation models in the medical domain.

Homepage: https://github.com/yunfeixie233/MedTrinity-25M
Paper: https://arxiv.org/abs/2408.02900
GitHub Repo: https://github.com/UCSC-VLAA/MedTrinity-25M
This process allows the creation of two distinct datasets within Open-Critic-GPT:

Code-Preference-Pairs Dataset: (DPO) Contains pairs of otherwise-identical code examples, the only difference being that the rejected example has bugged code 'surgically transplanted' into it while the accepted example is left unchanged.
Open-Critic-GPT Dataset: (SFT) Trains the model to find bugs and produce working code from broken code.
Both datasets span a total of 127 different languages/structures. Some examples were lost in conversion due to a lack of structured output (starting with 122k examples and ending with ~55k); a fine-tuned model may produce better-structured outputs.
Each dataset contains ~55K examples, and both are derived from the same parent examples.
Dataset Card for LLaVA-OneVision
!!! We are still uploading our dataset; stay tuned for the final version, or contact [email protected] for more details.

We provide full details of the LLaVA-OneVision dataset, including the data splits used in both the final image stage and the one-vision stage. For more details, please check our paper.

Dataset Sources
Dataset Collection: We include a few subsets from the existing dataset collections Cambrian, Cauldron, and UReader. Since we used only a few subsets from these collections and applied a cleaning and re-annotation process, we uploaded our processed versions to our own repository; we thank the authors for providing the original datasets.
Other Datasets: For the remaining single-source datasets, such as AI2D and OKVQA, we cite and link the original sources in our paper.
https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-Data
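A hedged loading sketch: the dataset is organized into per-source configurations, so it is safer to list them programmatically than to guess a subset name.

```python
from datasets import get_dataset_config_names, load_dataset

repo = "lmms-lab/LLaVA-OneVision-Data"
configs = get_dataset_config_names(repo)  # one config per data subset
print(configs[:5])

# Load the first available subset; column names may vary per subset.
ds = load_dataset(repo, configs[0], split="train")
print(ds[0].keys())
```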
Dataset Card for magpie-ultra-v0.1
This dataset has been created with distilabel.
Dataset Summary
magpie-ultra is a synthetically generated dataset for supervised fine-tuning built with the new Llama 3.1 405B-Instruct model, together with other Llama models such as Llama-Guard-3-8B and Meta-Llama-3.1-8B-Instruct.

The dataset contains challenging instructions and responses for a wide variety of tasks, such as coding & debugging, math, data analysis, creative writing, advice seeking, and brainstorming.

Explore the dataset in Argilla.
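A minimal loading sketch, assuming the dataset lives under the argilla organization and exposes instruction/response columns under these names (both assumptions; check the dataset card):

```python
from datasets import load_dataset

# Repo id and column names below are assumptions based on the card above.
ds = load_dataset("argilla/magpie-ultra-v0.1", split="train")
row = ds[0]
print(row["instruction"][:200])
print(row["response"][:200])
```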
keras-io/wgan-molecular-graphs
Model description
This repo contains the model and notebook for WGAN-GP with R-GCN for the generation of small molecular graphs: a generative model for graphs, used to generate novel molecules.

Full credits go to Alexander Kensert

Reproduced by Vu Minh Chien

Motivation: The development of new drugs (molecules) can be extremely time-consuming and costly. The use of deep learning models can alleviate the search for good candidate drugs, by predicting the properties of known molecules (e.g., solubility, toxicity, affinity to the target protein, etc.). As the number of possible molecules is astronomical, the space in which we search for/explore molecules is just a fraction of the entire space. Therefore, it's arguably desirable to implement generative models that can learn to generate novel molecules (which would otherwise have never been explored).
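A hedged sketch of pulling the reproduced model from the Hub and sampling from it, assuming the pushed artifact is the generator and that it takes a 64-dimensional latent vector (both assumptions):

```python
import numpy as np
from huggingface_hub import from_pretrained_keras

# Assumption: the pushed Keras model is the WGAN-GP generator.
generator = from_pretrained_keras("keras-io/wgan-molecular-graphs")

z = np.random.normal(size=(4, 64))  # latent dimension is an assumption
# The generator emits graph tensors (adjacency + node features)
# describing candidate small molecules.
graphs = generator.predict(z)
```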
Butterfly GAN
Model description
Based on paper: Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

which states: "Notably, the model converges from scratch with just a few hours of training on a single RTX-2080 GPU, and has a consistent performance, even with less than 100 training samples"

The model is also dubbed the Light-GAN model. It was trained using the script here, which is adapted from the lucidrains repo.

Unlike the script above, I used the transforms from the official paper implementation repo, because our training images were already cropped and aligned.
https://huggingface.co/ceyda/butterfly_cropped_uniq1K_512
Generate fauvism still life image using FastGAN
Model description
FastGAN is a Generative Adversarial Network (GAN) designed to train on a small number of high-fidelity images at minimal computing cost. Using a skip-layer channel-wise excitation module and a self-supervised discriminator trained as a feature encoder, the model is able to converge after a few hours of training on datasets of either 100 high-quality images or 1,000 images.

This model was trained on a dataset of 124 high-quality Fauvism painting images.

How to use: https://huggingface.co/huggan/fastgan-few-shot-fauvism-still-life
gaIA: Italian Landscape GAN Model
gaIA is the first Italian GAN model trained on satellite images of a selection of Italy's main glaciers, forests, lakes, rivers, and coasts that are most affected by climate change. It is usable for scientific and artistic purposes.

Dataset
Images: 12k
Image Resolution: 1024×1024
Source: Copernicus Sentinel 2A
Reference Years: 2017 – June 2024
Riffusion: Optimized for Mobile Deployment
State-of-the-art generative AI model used to generate spectrogram images given any text input. These spectrograms can be converted into audio clips.
The model generates high-resolution spectrogram images from text prompts using a latent diffusion model. It uses CLIP ViT-L/14 as the text encoder, U-Net-based latent denoising, and a VAE-based decoder to generate the final image.

This model is an implementation of Riffusion found here. This repository provides scripts to run Riffusion on Qualcomm® devices. More details on model performance across various devices can be found here.
Stable-Diffusion: Optimized for Mobile Deployment
State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions.
The model generates high-resolution images from text prompts using a latent diffusion model. It uses CLIP ViT-L/14 as the text encoder, U-Net-based latent denoising, and a VAE-based decoder to generate the final image.

This model is an implementation of Stable-Diffusion found here. This repository provides scripts to run Stable-Diffusion on Qualcomm® devices. More details on model performance across various devices can be found here.
Model Details
Model Type: Image generation
Model Stats:
Input: Text prompt to generate image
QNN-SDK: 2.19
Text Encoder Number of parameters: 340M
UNet Number of parameters: 865M
VAE Decoder Number of parameters: 83M
Model size: 1GB
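For reference (not the Qualcomm deployment path), the same three-stage pipeline (CLIP text encoder, U-Net latent denoiser, VAE decoder) can be run on a GPU through diffusers; the checkpoint below is an illustrative choice, not the on-device model.

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; the on-device model shares this architecture.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

image = pipe("a lighthouse at dawn, oil painting").images[0]
image.save("out.png")
```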
Score-Based Generative Modeling through Stochastic Differential Equations (SDE)
Creating noise from data is easy; creating data from noise is generative modeling. We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. Crucially, the reverse-time SDE depends only on the time-dependent gradient field (a.k.a. score) of the perturbed data distribution.
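Written out, the framework pairs a forward noising SDE with its reverse-time counterpart, which is where the score term enters:

```latex
% Forward (noising) SDE and its reverse-time counterpart:
\begin{align}
  \mathrm{d}\mathbf{x} &= \mathbf{f}(\mathbf{x}, t)\,\mathrm{d}t
    + g(t)\,\mathrm{d}\mathbf{w}, \\
  \mathrm{d}\mathbf{x} &= \bigl[\mathbf{f}(\mathbf{x}, t)
    - g(t)^2 \nabla_{\mathbf{x}} \log p_t(\mathbf{x})\bigr]\mathrm{d}t
    + g(t)\,\mathrm{d}\bar{\mathbf{w}}
\end{align}
% Estimating the score field \nabla_x \log p_t(x) with a neural network
% is what makes the reverse-time SDE usable for sampling.
```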
The Keras code example on denoising diffusion implicit models (DDIM).
Model description
The model uses a U-Net with identical input and output dimensions. It progressively downsamples and upsamples its input image, adding skip connections between layers having the same resolution. The architecture is a simplified version of the architecture of DDPM. It consists of convolutional residual blocks and lacks attention layers. The network takes two inputs, the noisy images and the variances of their noise components, which it encodes using sinusoidal embeddings.
https://huggingface.co/keras-io/denoising-diffusion-implicit-models
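As an illustration of the sinusoidal embedding mentioned above, here is a sketch in the style of the Keras example; the embedding dimensions and frequency range are typical values, not necessarily the example's exact settings.

```python
import math
import tensorflow as tf

# Typical values; assumptions rather than the example's exact settings.
embedding_dims = 32
embedding_min_frequency = 1.0
embedding_max_frequency = 1000.0

def sinusoidal_embedding(x):
    """Encode noise variances x of shape (batch, 1, 1, 1) as sin/cos features."""
    frequencies = tf.exp(
        tf.linspace(
            tf.math.log(embedding_min_frequency),
            tf.math.log(embedding_max_frequency),
            embedding_dims // 2,
        )
    )
    angular_speeds = 2.0 * math.pi * frequencies
    # Concatenate sines and cosines -> shape (batch, 1, 1, embedding_dims)
    return tf.concat(
        [tf.sin(angular_speeds * x), tf.cos(angular_speeds * x)], axis=3
    )
```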
Perturbed-Attention Guidance for SDXL
The original Perturbed-Attention Guidance for unconditional models and SD1.5
Parameters
guidance_scale : guidance scale of CFG (ex: 7.5)

pag_scale : guidance scale of PAG (ex: 4.0)

pag_applied_layers : layers to apply perturbation (ex: ['mid'])

pag_applied_layers_index : index of the layers to apply perturbation (ex: ['m0', 'm1'])
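A hedged usage sketch of the parameters above, using the PAG support that was later integrated into diffusers; the original card ships its own custom pipeline, so treat this as an approximation rather than the card's exact usage.

```python
import torch
from diffusers import AutoPipelineForText2Image

# PAG is enabled at load time in recent diffusers releases.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    enable_pag=True,
    pag_applied_layers=["mid"],  # layers to perturb, as listed above
).to("cuda")

image = pipe(
    "an insect robot preparing a delicious meal",
    guidance_scale=7.5,  # CFG scale
    pag_scale=4.0,       # PAG scale
).images[0]
```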

Stable Diffusion XL Demo
Try it here
Simple DCGAN implementation in TensorFlow to generate CryptoPunks.
Project repository: CryptoGANs.

Usage
You can play with the Hugging Face Space demo.

Or try it yourself
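A hedged sketch of sampling the generator locally; the repo id and the 100-dimensional latent size are assumptions based on typical DCGAN setups, so check the project repository for the real entry point.

```python
import numpy as np
from huggingface_hub import from_pretrained_keras

# Placeholder repo id; see the CryptoGANs project for the actual one.
generator = from_pretrained_keras("<owner>/cryptopunks-dcgan")

noise = np.random.normal(size=(1, 100))  # 100-dim latent is an assumption
punks = generator.predict(noise)         # images in [-1, 1], shape (1, H, W, 3)
```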