HF-hub - Share and discover more about AI with social posts from the community.
HuggingFaceM4/Idefics3-8B-Llama3
Transformers version: until the next Transformers PyPI release, please install Transformers from source and use this PR to be able to use Idefics3. TODO: update once the new version is released.

Idefics3
Idefics3 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces text outputs. The model can answer questions about images, describe visual content, create stories grounded in multiple images, or simply behave as a pure language model without visual inputs. It improves upon Idefics1 and Idefics2, significantly enhancing capabilities around OCR, document understanding and visual reasoning.

We release the checkpoints under the Apache 2.0 license.

Model Summary
Developed by: Hugging Face
Model type: Multi-modal model (image+text)
Language(s) (NLP): en
License: Apache 2.0
Parent Models: google/siglip-so400m-patch14-384 and meta-llama/Meta-Llama-3.1-8B-Instruct
Resources for more information:
Idefics1 paper: OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Idefics2 paper: What matters when building vision-language models?
Idefics3 paper: Coming soon (TODO)
https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3
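With a Transformers build that includes Idefics3, inputs are passed as interleaved image/text chat messages. A minimal sketch of that message structure (the `build_messages` helper is hypothetical, not part of the model's API):

```python
# Sketch: interleaved image + text chat messages, in the format used by
# processor chat templates in recent Transformers releases.
def build_messages(question: str, n_images: int) -> list:
    """Place n_images image placeholders before the text question."""
    content = [{"type": "image"} for _ in range(n_images)]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]

messages = build_messages("What is different between the two images?", 2)
```

The processor then pairs each `{"type": "image"}` placeholder with an actual image when building model inputs.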
AI Tools of the week
🎥 Guidde - Magically create video documentation that explains the most complex things through simple step-by-step guides.

💡 MuckBrass - Uncover market-validated startup ideas with AI-powered analysis of search trends and competition.

🤖 Splutter AI - Launch custom AI chatbots for websites, supercharging support, marketing, and sales with 24/7 automation.

📞 LangCall - Skip long waits with AI agents that navigate phone menus, handle conversations, and connect you only when needed.

🔍 MiniPerplx - Streamline web search with advanced functions like weather updates, event tracking, and literary analysis.

Sourcer AI - Combat online misinformation with real-time AI-powered fact-checking for instant credibility assessment.

📊 PPT GPTSci - Convert images into fully editable PowerPoint slides through a user-friendly interface with high-quality output.

📖 Narrative Nooks - Unlock personalized learning with interactive stories, 24/7 tutoring support, and engaging lessons.

💬 AI Chat Bot - Skyrocket sales with multilingual AI chatbots, offering easy setup, customizable features, and seamless integration.

🎶 Songifier - Identify songs instantly using lyric matching, bridging the gap between fragmentary lyric recall and full song access.

📊 Shortimize - Analyze and optimize shorts with cross-platform tracking, AI-powered viral video search, and in-depth analytics.

📚 GetQuiz - Turn reading into lasting knowledge with an AI companion that generates quizzes directly in Telegram.

🎵 AudioStack - Revolutionize audio production with AI-powered tools that create professional-quality content 10,000x faster.

🐦 BioIt - Craft your perfect Twitter bio by answering a few simple questions, generating unique, catchy intros.

🎭 Immersim AI - Create and explore infinite interactive universes with seamless storytelling and dynamic character interactions.
ResShift 1-Click Windows, RunPod, Massed Compute, Kaggle Installers with Amazing Gradio APP and Batch Image Processing. ResShift is an efficient diffusion model for image super-resolution by residual shifting (NeurIPS 2023, Spotlight).


Official Repo : https://github.com/zsyOAOA/ResShift

I have developed a very advanced Gradio APP.

Developed APP Scripts and Installers : https://www.patreon.com/posts/110331752

Features

It supports the following tasks:

Real-world image super-resolution

Bicubic (resize by Matlab) image super-resolution

Blind Face Restoration

Automatically saves all generated images with the same name, adding numbering if necessary

Randomize seed feature for each generation

Batch image processing - provide input and output folder paths and it processes all images in batch and saves the results

1-Click to install on Windows, RunPod, Massed Compute and Kaggle (free account)
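The batch-processing and same-name-plus-numbering behavior above can be sketched roughly like this (a simplified stand-in, not the app's actual code; `process` is any image-transforming function you supply):

```python
from pathlib import Path

def unique_path(out_dir: Path, name: str) -> Path:
    """Return name as-is, or name_0001 / name_0002 / ... if it already exists."""
    candidate = out_dir / name
    stem, suffix = candidate.stem, candidate.suffix
    n = 0
    while candidate.exists():
        n += 1
        candidate = out_dir / f"{stem}_{n:04d}{suffix}"
    return candidate

def batch_process(in_dir, out_dir, process):
    """Run `process` on every image in in_dir, saving to out_dir without overwriting."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.png")):
        dst = unique_path(out, src.name)
        dst.write_bytes(process(src.read_bytes()))
```

Re-running the same batch never overwrites earlier results; each repeat gets the next free number.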

Windows Requirements

Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git

If it doesn't work, follow the tutorial below and install everything exactly as shown:

https://youtu.be/-NjNy7afOQ0

How to Install on Windows

Make sure that you have the above requirements

Extract files into a folder like c:/reshift_v1

Double click Windows_Install.bat and it will automatically install everything for you with an isolated virtual environment folder (VENV)

After that, double click Windows_Start_app.bat to start the app

The first time you use a task, it will download the necessary models (all under 500 MB) into the correct folders

If a download fails, the file may be corrupted; sadly the app doesn't verify this, so delete the files inside the weights folder and restart
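Since the app doesn't verify downloads itself, one way to check the weights manually is to compare SHA-256 checksums against the digests published on the model host (the expected digests are something you would copy from there; this helper is not part of the installer):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model weights don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_mismatched(weights_dir, expected: dict) -> list:
    """Return file names under weights_dir that are missing or fail their checksum."""
    bad = []
    for name, digest in expected.items():
        p = Path(weights_dir) / name
        if not p.exists() or sha256_of(p) != digest:
            bad.append(name)
    return bad
```

Anything this reports as mismatched is what you delete and re-download.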

How to Install on RunPod, Massed Compute, Kaggle

Follow the Massed_Compute_Instructions_READ.txt and Runpod_Instructions_READ.txt

For Kaggle, follow the steps written in the notebook

An example video of how to use my RunPod and Massed Compute scripts and Kaggle notebook can be seen here:

https://youtu.be/wG7oPp01COg
Introducing Hugging Face Similar: a Chrome extension to find relevant datasets!

Adds a "Similar Datasets" section to Hugging Face dataset pages
🔍 Recommendations based on dataset READMEs
🏗 Powered by https://huggingface.co/chromadb and https://huggingface.co/Snowflake embeddings.

You can try it here: https://chromewebstore.google.com/detail/hugging-face-similar/aijelnjllajooinkcpkpbhckbghghpnl?authuser=0&hl=en.

I am very happy to get feedback on whether this could be useful or not 🤗
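The recommendation idea itself is simple: embed each dataset README, then rank datasets by cosine similarity to the current page's embedding. A toy sketch with made-up vectors (the real extension uses Snowflake embeddings stored in ChromaDB; vectors are assumed nonzero):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(query_vec, catalog: dict, k: int = 3):
    """Rank dataset ids in `catalog` (id -> README embedding) by similarity."""
    ranked = sorted(catalog, key=lambda d: cosine(query_vec, catalog[d]), reverse=True)
    return ranked[:k]

# Toy 3-d "embeddings" standing in for real README vectors.
catalog = {"squad": [1.0, 0.0, 0.0], "imagenet": [0.0, 1.0, 0.0], "triviaqa": [0.9, 0.1, 0.0]}
```

A vector database like ChromaDB does the same ranking at scale, without comparing against every entry.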
Serving Meta Llama 3.1 405B on Google Cloud is now possible via the Hugging Face Deep Learning Containers (DLCs) for Text Generation Inference (TGI)

In this post, we showcase how to deploy meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 on an A3 instance with 8 x H100 GPUs on Vertex AI.

Thanks to the Hugging Face DLCs for TGI and Google Cloud Vertex AI, deploying a high-performance text generation container for serving Large Language Models (LLMs) has never been easier. And we’re not going to stop here – stay tuned as we enable more experiences to build AI with open models on Google Cloud!

Read the full post at https://huggingface.co/blog/llama31-on-vertex-ai
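Once the TGI container is up, requests follow TGI's standard /generate schema. A minimal sketch of the request body (endpoint URL and parameter values here are placeholders, not from the post):

```python
def build_generate_request(prompt: str, max_new_tokens: int = 128,
                           temperature: float = 0.7) -> dict:
    """Body for TGI's /generate endpoint: a prompt plus generation parameters."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }

payload = build_generate_request("What is Vertex AI?")
# Then, e.g.: requests.post(f"{endpoint_url}/generate", json=payload)
# where endpoint_url comes from the Vertex AI deployment.
```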
How The Washington Post Uses AI to Empower Journalists 🔍📰

An exciting new example in the world of AI-assisted journalism! The Post has developed an internal tool called "Hayatacker" that's enhancing in-depth reporting. Here's why it matters:

🎥 What it does:
• Extracts stills from video files
• Processes on-screen text
• Labels objects in images

🗳 First big project:
Analyzed 745 Republican campaign ads on immigration (Jan-Jun 2024)

🤝 Human-AI collaboration:
• AI extracts and organizes data
• Reporters verify and analyze findings

🔎 Thorough approach:
• Manual review of all 745 ads
• Reverse image searches when context is lacking
• Cross-referencing with AdImpact transcripts

💡 Key insight from WaPo's Senior Editor for AI strategy Phoebe Connelly:
"The more exciting choice is putting AI in the hands of reporters early on in the process."

This tool showcases how AI can augment journalistic capabilities without replacing human insight and verification. It's a powerful example of technology enhancing, not replacing, traditional reporting skills.

👉 Read the full article and the methodology: https://www.washingtonpost.com/elections/interactive/2024/republican-campaign-ads-immigration-border-security/
How to Use SwarmUI & Stable Diffusion 3 on Cloud Services - Massed Compute, RunPod and Kaggle (free)
In this video, I demonstrate how to install and use #SwarmUI on cloud services. If you lack a powerful GPU or wish to harness more GPU power, this video is essential. You’ll learn how to install and utilize SwarmUI, one of the most powerful Generative AI interfaces, on Massed Compute, RunPod, and Kaggle (which offers free dual T4 GPU access for 30 hours weekly). This tutorial will enable you to use SwarmUI on cloud GPU providers as easily and efficiently as on your local PC. Moreover, I will show how to use Stable Diffusion 3 (#SD3) on cloud. SwarmUI uses #ComfyUI backend.
🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985
🔗 Windows Tutorial to Learn How to Use SwarmUI ➡️ https://youtu.be/HKX8_F1Er_w
🔗 How to download models very fast to Massed Compute, RunPod and Kaggle and how to upload models or files to Hugging Face very fast tutorial ➡️ https://youtu.be/X5WVZ0NMaTg
🔗 SECourses Discord ➡️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
🔗 Stable Diffusion GitHub Repo (Please Star, Fork and Watch) ➡️ https://github.com/FurkanGozukara/Stable-Diffusion
Basic Usage of StableSwarmUI
So you want to know how to get started with StableSwarmUI, huh? It's easy!

For the most part, just download the installer and follow the instructions on screen. Everything explains itself, even the settings and parameters all have ? clickables that explain what they do!

Nonetheless, here's a step-by-step you can follow:

Installing
Step one: Install StableSwarmUI.

Once you've run the basic program installation, if all went well, it will open a web interface to select basic install settings.

Agree to the SD license
Pick a theme (I think default is best, but you got options)
Pick who the UI is for (usually just Yourself or Yourself on LAN)
Pick what backend(s) to install. If you already have ComfyUI or another backend you can skip this - if not, pick one. I recommend ComfyUI for local usage.
Pick any model(s) you want to download. If you already have some you can skip this, if not, I recommend SDXL 1.0.
Confirm you want the settings you selected, and install.
Once this is done, it should automatically redirect you to the main interface.

(You can close the server at any time by just closing that console window it pulls up, and you can start it again via the desktop icon, or the launch script in the folder).
https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/Basic%20Usage.md
More on NVIDIA NIM at SIGGRAPH

At SIGGRAPH, NVIDIA also introduced generative AI models and NIM microservices for the OpenUSD framework to accelerate developers’ abilities to build highly accurate virtual worlds for the next evolution of AI.

To experience more than 100 NVIDIA NIM microservices with applications across industries, visit ai.nvidia.com.

Categories: Cloud | Deep Learning | Generative AI
Tags: #Artificial Intelligence | #DGX Cloud | #NVIDIA NIM | #SIGGRAPH 2024
Near-Instant Access to DGX Cloud Provides Accessible AI Acceleration

The NVIDIA DGX Cloud platform is purpose-built for generative AI, offering developers easy access to reliable accelerated computing infrastructure that can help them bring production-ready applications to market faster.

The platform provides scalable GPU resources that support every step of AI development, from prototype to production, without requiring developers to make long-term AI infrastructure commitments.

Hugging Face inference-as-a-service on NVIDIA DGX Cloud powered by NIM microservices offers easy access to compute resources that are optimized for AI deployment, enabling users to experiment with the latest AI models in an enterprise-grade environment.
What’s New in July 2024
We added 20+ models to the Hugging Face collection in the Azure AI model catalog in July. These included multilingual models (focus on Chinese, Dutch, Arabic, South-East Asian), embedding models, text generation (SLM and LLM) and models with a domain-specific focus (e.g., biomedical). The table below summarizes additions by task and notable features. Click model name to view related model cards on Azure AI for more details. In the next section, we’ll put the spotlight on a couple of models or model families that may be of particular interest to developers exploring SLMs or multilingual applications.
https://techcommunity.microsoft.com/t5/ai-ai-platform-blog/new-hugging-face-models-on-azure-ai-multilingual-slm-and-biomed/ba-p/4211881?wt.mc_id=twitter_4211881_organicsocial_reactor
AI unicorn Hugging Face acquires XetHub to manage huge AI models, aiming to host hundreds of millions. Meta's Llama 3.1 has 405B parameters, driving the need for more scalable solutions. XetHub's tools for efficient data management will integrate into Hugging Face's platform. #AI
Brief backstory: Before diving into AI, I spent over a decade working in ecological fields such as the conservation corps, biodynamic farming, and natural habitat restoration. This background instilled in me a deep concern about the environmental impact of scaling AI without sustainable practices.

Driven by this concern, I've spent months planning and experimenting to make my AI work more eco-friendly. I'm thrilled to announce that I've successfully transitioned my entire operation to run on 100% sustainable solar power!
How good are you at spotting AI-generated images?

Find out by playing Fake Insects 🐞 a Game where you need to identify which insects are fake (AI generated). Good luck & share your best score in the comments!
🚀 We’re excited to launch Ghost 8B Beta (1608), a top-performing language model with unmatched multilingual support and cost efficiency.

Key Highlights:
- Superior Performance: Outperforms Llama 3.1 8B Instruct, GPT-3.5 Turbo, Claude 3 Opus, GPT-4, and more in winrate scores.
- Expanded Language Support: Now supports 16 languages, including English, Vietnamese, Spanish, Chinese, and more.
- Enhanced Capabilities: Improved math, reasoning, and instruction-following for better task handling.

With two context options (8k and 128k), Ghost 8B Beta is perfect for complex, multilingual applications, balancing power and cost-effectiveness.

🔗 Learn More: https://ghost-x.org/docs/models/ghost-8b-beta
ghost-x/ghost-8b-beta-668ead6179f93be717db4542