HF-hub - Share and discover more about AI with social posts from the community.
Italian grandma
I am a loving Italian grandma. The most important thing for me is to ensure my grandchildren are properly fed and never lack the delicious food I cook. That's how I show love.
Created by VictorSanh
https://huggingface.co/chat/assistant/65bd0110cba41e804b5d8d2a
IA.nclán Esperpéntico
An assistant trained on our materials from the lecture for the course "IA en educación: desarrollo profesional y aplicabilidad en el aula" (AI in education: professional development and applicability in the classroom; UNED Pontevedra, July 2024).

https://huggingface.co/chat/assistant/6683fafa12674ff1bf6250c9
Lorc.AI
I am an assistant specializing in Federico García Lorca, dedicated to sharing my deep knowledge of this celebrated Spanish poet and playwright. I can offer you intimate details of his life, analyses of his most significant works, and discussion of the themes…
https://huggingface.co/chat/assistant/65f0bb99b45435fa77a39f92

HolístIcA - Aprendizaje Democrático Adaptativo
A chatbot specializing in ADA, an integrative and holistic approach to applying AI that combines personalized learning, democratic learning, and Universal Design for Learning (UDL).
Created by guillermo2323
https://huggingface.co/chat/assistant/66325ad8c918e9fdb80b6345
Broch.IA
Designed to support students' learning process in a variety of specific roles based on the proposal "EL LIENZO DE ROLES EDUCAI.TIVOS" (the canvas of educational AI roles). Through this chatbot, students can select among different methodological strategies and interact with the AI toward different educational objectives.
Created by guillermo2323
https://huggingface.co/chat/assistant/66021129049b431eef0ccd16
Arqueolog.IA
I am an educational chatbot here to assist students with learning about archaeology, anthropology, and paleoanthropology by answering questions and helping to identify historical objects and fossils.
Created by guillermo2323
https://huggingface.co/chat/assistant/660dcbceeda792dbe0f33a1f
lei-HuggingFace/MinCPM-V2_6_4bit_Level_Image_08152024
A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
GitHub | Demo

MiniCPM-V 2.6
MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-Llama3-V 2.5, and introduces new features for multi-image and video understanding. Notable features of MiniCPM-V 2.6 include:

🔥 Leading Performance. MiniCPM-V 2.6 achieves an average score of 65.2 on the latest version of OpenCompass, a comprehensive evaluation over 8 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models like GPT-4o mini, GPT-4V, Gemini 1.5 Pro, and Claude 3.5 Sonnet for single image understanding.

🖼 Multi Image Understanding and In-context Learning. MiniCPM-V 2.6 can also perform conversation and reasoning over multiple images. It achieves state-of-the-art performance on popular multi-image benchmarks such as Mantis-Eval, BLINK, Mathverse mv and Sciverse mv, and also shows promising in-context learning capability.

🎬 Video Understanding. MiniCPM-V 2.6 can also accept video inputs, performing conversation and providing dense captions for spatial-temporal information. It outperforms GPT-4V, Claude 3.5 Sonnet and LLaVA-NeXT-Video-34B on Video-MME with/without subtitles.

💪 Strong OCR Capability and Others. MiniCPM-V 2.6 can process images with any aspect ratio and up to 1.8 million pixels (e.g., 1344x1344). It achieves state-of-the-art performance on OCRBench, surpassing proprietary models such as GPT-4o, GPT-4V, and Gemini 1.5 Pro. Based on the latest RLAIF-V and VisCPM techniques, it features trustworthy behaviors, with significantly lower hallucination rates than GPT-4o and GPT-4V on Object HalBench, and supports multilingual capabilities in English, Chinese, German, French, Italian, Korean, etc.

🚀 Superior Efficiency. In addition to its friendly size, MiniCPM-V 2.6 also shows state-of-the-art token density (i.e., number of pixels encoded into each visual token). It produces only 640 tokens when processing a 1.8M pixel image, which is 75% fewer than most models. This directly improves the inference speed, first-token latency, memory usage, and power consumption. As a result, MiniCPM-V 2.6 can efficiently support real-time video understanding on end-side devices such as iPad.

💫 Easy Usage. MiniCPM-V 2.6 can be easily used in various ways: (1) llama.cpp and ollama support for efficient CPU inference on local devices, (2) int4 and GGUF format quantized models in 16 sizes, (3) vLLM support for high-throughput and memory-efficient inference, (4) fine-tuning on new domains and tasks, (5) quick local WebUI demo setup with Gradio and (6) online web demo.
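For local inference with 🤗 Transformers, the pattern documented on the model card looks roughly like the sketch below (a minimal sketch; a CUDA GPU and a local example.jpg are assumed):

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# trust_remote_code is required because the chat interface ships with the repo
model = AutoModel.from_pretrained('openbmb/MiniCPM-V-2_6', trust_remote_code=True,
                                  torch_dtype=torch.bfloat16).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-2_6', trust_remote_code=True)

image = Image.open('example.jpg').convert('RGB')  # any aspect ratio, up to ~1.8M pixels
msgs = [{'role': 'user', 'content': [image, 'Describe this image.']}]

# the card's chat() helper handles image encoding and the conversation format
print(model.chat(image=None, msgs=msgs, tokenizer=tokenizer))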
Image captioning and alt text generator
short_description: A generator for alt texts by captioning images
emoji: 🏙📝
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.41.0
app_file: app.py
tags:
- image-to-text
- image-captioning
- alt-text
pinned: true
preload_from_hub:
- Salesforce/blip-image-captioning-large
models:
- Salesforce/blip-image-captioning-large
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/66bb58f232be421cd8dbdc63/DPFFHlG3JlqoOvuW9-fWi.png
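Given that config, a hypothetical app.py for this Space could be as small as the following sketch (the Space's real code is not shown here; only the model id and sdk come from the header above):

import gradio as gr
from transformers import pipeline

# BLIP captioner declared in the Space header above
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large")

def generate_alt_text(image):
    # the pipeline returns a list of dicts with a "generated_text" field
    return captioner(image)[0]["generated_text"]

demo = gr.Interface(
    fn=generate_alt_text,
    inputs=gr.Image(type="pil"),
    outputs="text",
    title="Image captioning and alt text generator",
)

if __name__ == "__main__":
    demo.launch()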
How to Use Flux AI Image Generator For Free
In Short
Flux is a free and open-source model for generating AI images. It's even better than Midjourney in producing human images.
You can use Flux AI for free on HuggingFace, NightCafe, BasedLabs, and other services.
FLUX.1 [dev] and [schnell] models are readily available from the developer. Some custom LoRA models are also tuned for photorealistic images.
Flux is an open-source AI model that is gaining traction for generating lifelike human images, even surpassing Midjourney in producing photorealistic images. So if you want to use this image generator for free, you have come to the right place. We have added four services that allow users to generate AI images using the Flux model. On that note, let’s begin.

Method 1: Use Flux AI on HuggingFace
You can use the Flux AI model to generate images on HuggingFace for free. The best part is that you don’t have to sign up for an account.

Navigate to FLUX.1 [dev] (website) on HuggingFace and enter your prompt. You can also try the FLUX.1 [schnell] model (website).
Along with the prompt, you can use the Advanced Settings section to configure the image height and width, the seed value, and the number of inference steps you want the model to use. Click on Run once your prompt is ready.
Generation time depends on the system load; after a few seconds, the AI image will be generated.
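If you would rather script the same workflow than click through the web UI, here is a minimal sketch with huggingface_hub (the prompt and parameter values are illustrative; a free HF token is assumed):

from huggingface_hub import InferenceClient

client = InferenceClient("black-forest-labs/FLUX.1-schnell", token="hf_xxx")  # your token here
image = client.text_to_image(
    "a cozy cabin in a snowy forest at dusk, photorealistic",
    height=768,
    width=768,
    num_inference_steps=4,  # [schnell] is distilled for very few steps
)
image.save("flux_output.png")  # text_to_image returns a PIL.Image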
Hugging Face Text to Image (Prompt)

Detailed Configuration Options
model_endpoint: (Required) Specifies the endpoint for the model you want to use for image generation. The default base URL is https://api-inference.huggingface.co. Providing just the model name defaults to this URL. A full URL, such as 'https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-2-1', overrides the default. This is where API requests for image generation are sent.

asset_folder: (Required) Designates the folder where generated images will be saved, for example '/AI Generated'.

prompt_template: (Optional) A Twig template that creates the prompt sent to the image generation model. It combines the input fields into a coherent description for the model. The selected element gets passed as "subject". If empty, the user has to input the initial prompt manually.

filename_template: (Optional) A Twig template to generate the filename dynamically.

parameters: (Optional) Contains additional parameters for the image generation process:

height: (Optional) Specifies the height of the generated image in pixels.
width: (Optional) Specifies the width of the generated image in pixels.
negative_prompt: (Optional) A Twig template that specifies descriptions to avoid in the generated images.
guidance_scale: (Optional) Determines how closely the generated image should adhere to the prompt as opposed to the model's own creativity.
num_inference_steps: (Optional) Sets the number of steps the model undergoes to refine the generated image. Higher values can lead to more detailed images.
options: (Optional) Contains additional options for the image generation process:

use_cache: (Optional, default: true) Reuses previously generated images for similar requests to accelerate response times. Setting this to false ensures a new image is generated for each request, enhancing uniqueness but potentially increasing wait times. Requests are served by the Serverless Inference API.
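Taken together, the settings above translate into an HTTP request along these lines (a sketch against the default endpoint; the model, token, and parameter values are placeholders):

import requests

API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-2-1"
headers = {"Authorization": "Bearer hf_xxx"}  # your token here

payload = {
    "inputs": "A watercolor painting of a lighthouse at dawn",
    "parameters": {
        "height": 768,
        "width": 768,
        "negative_prompt": "blurry, low quality",
        "guidance_scale": 7.5,
        "num_inference_steps": 30,
    },
    "options": {"use_cache": False},  # force a fresh image per request
}

response = requests.post(API_URL, headers=headers, json=payload)
with open("output.png", "wb") as f:
    f.write(response.content)  # the endpoint returns raw image bytes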
Use HuggingFace Stable Diffusion Model to Generate Images from Text

I typed "Generate a picture illustrating AI for drawing a picture" into Bing's Copilot, and it produced the picture above. Have you ever wondered how to build a model that generates pictures from a text prompt, like Copilot or DALL-E? In this article, I will show you, step by step, how to use a Hugging Face pre-trained Stable Diffusion model to generate images from text.

Install Huggingface Transformers and Diffusers
At the beginning of a notebook (I used the Google Colab free version with a T4 GPU runtime), run the following code to install the necessary libraries:

!pip install --upgrade diffusers transformers -q
Import the Necessary Libraries
import torch

from diffusers import StableDiffusionPipeline
from transformers import pipeline, set_seed
Set up an Attribute Class TTI
The TTI class specifies the HuggingFace model id, the generative model type, and some related attributes.
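The article's exact class isn't reproduced here, so the following is a hypothetical sketch of such an attribute class and the pipeline call it drives (the field names are assumptions):

import torch
from dataclasses import dataclass
from diffusers import StableDiffusionPipeline
from transformers import set_seed

@dataclass
class TTI:
    model_id: str = "stabilityai/stable-diffusion-2-1"  # HuggingFace model id
    model_type: str = "text-to-image"                   # generative model type
    seed: int = 42                                      # for reproducible outputs
    num_inference_steps: int = 50
    guidance_scale: float = 7.5

tti = TTI()
set_seed(tti.seed)

pipe = StableDiffusionPipeline.from_pretrained(tti.model_id, torch_dtype=torch.float16).to("cuda")
image = pipe("an astronaut riding a horse on the moon",
             num_inference_steps=tti.num_inference_steps,
             guidance_scale=tti.guidance_scale).images[0]
image.save("astronaut.png")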
FLUX Tarot v1
Model description
A tarot card LoRA trained on the public domain Rider-Waite card set from 1920. Dataset

Trained with the fal-ai trainer, built on the open-source ostris AI Toolkit trainer.

Trigger words
You should use "in the style of TOK a trtcrd tarot style" to trigger the image generation.

Download model
Weights for this model are available in Safetensors format.

Download them in the Files & versions tab.

Use it with the 🧨 diffusers library
https://huggingface.co/multimodalart/flux-tarot-v1
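A minimal sketch of loading this LoRA with diffusers (assumes a CUDA GPU with enough VRAM for the FLUX.1 [dev] base model):

import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("multimodalart/flux-tarot-v1")

prompt = "a fox reading a book, in the style of TOK a trtcrd tarot style"
pipe(prompt).images[0].save("tarot.png")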
Breaking news!

You can now browse from a @huggingface model page to its:
- fine-tunes
- adapters
- merges
- quantized versions

and browse through the models' genealogy tree 🌲
uPass
uPass: an AI tool for students to humanize academic work and bypass AI detectors, with three tailored modes.

uPass Introduction
uPass is an AI tool designed specifically for students to humanize academic work and bypass AI detectors. It offers three modes: Basic, Advanced, and Aggressive, letting users tailor the rewriting process to their needs. With uPass, students can make their assignments and essays appear error-free and human-written while bypassing Turnitin's AI detector and plagiarism checker, so their work meets academic standards without being flagged by AI detection systems.

uPass Features
Below is a detailed overview of its key functions:
GigaGAN: Large-scale GAN for Text-to-Image Synthesis
Can GANs also be trained on a large dataset for a general text-to-image synthesis task? We present our 1B-parameter GigaGAN, achieving lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates 512px outputs at 0.13s, orders of magnitude faster than diffusion and autoregressive models, and inherits the disentangled, continuous, and controllable latent space of GANs. We also train a fast upsampler that can generate 4K images from the low-res outputs of text-to-image models.
Disentangled Prompt Interpolation
GigaGAN comes with a disentangled, continuous, and controllable latent space.
In particular, it can achieve layout-preserving fine style control by applying a different prompt at fine scales.

Abstract
The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture to design generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL·E 2, auto-regressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? We find that naïvely increasing the capacity of the StyleGAN architecture quickly becomes unstable. We introduce GigaGAN, a new GAN architecture that far exceeds this limit, demonstrating GANs as a viable option for text-to-image synthesis. GigaGAN offers three major advantages. First, it is orders of magnitude faster at inference time, taking only 0.13 seconds to synthesize a 512px image. Second, it can synthesize high-resolution images, for example, 16-megapixel images in 3.66 seconds. Finally, GigaGAN supports various latent space editing applications such as latent interpolation, style mixing, and vector arithmetic operations.
Improved ControlNet!
Now supports dynamic resolution for perfect landscape and portrait outputs. Generate stunning images without distortion—optimized for any aspect ratio!
Try it here: https://huggingface.co/spaces/DamarJati/FLUX.1-DEV-Canny (FLUX.1-DEV Canny, a Hugging Face Space by DamarJati)
Announcing another BIG data drop! This time it's ~275M images from Flickr
bigdata-pw/Flickr


Data acquisition for this project is still in progress; get ready for an update soon™

In case you missed them, other BIG data drops include Diffusion1B
bigdata-pw/Diffusion1B
- ~1.23B images and generation parameters from a variety of diffusion models. And if you fancy practicing diffusion model training, check out Dataception
bigdata-pw/Dataception
- a dataset of over 5000 datasets in WebDataset format!

Requests are always welcome, so reach out if there's a dataset you'd like to see!
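To poke at one of these drops without downloading everything, streaming with the datasets library works well (a sketch; the "train" split name is an assumption):

from itertools import islice
from datasets import load_dataset

# stream to avoid pulling hundreds of millions of records up front
flickr = load_dataset("bigdata-pw/Flickr", split="train", streaming=True)
for record in islice(flickr, 5):  # peek at the first few records
    print(record)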