safetensors
v0.4.4
Provides functions to read and write safetensors, which aim to be safer than their PyTorch counterparts. The file begins with 8 bytes holding an unsigned 64-bit little-endian integer, which is the size of a JSON header. The JSON header records each tensor's dtype, shape, and data_offsets, which are the offsets of that tensor's values in the rest of the file.
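The layout described above can be inspected by hand with only the standard library. The following is a minimal sketch, assuming the 8-byte size field is an unsigned 64-bit little-endian integer; the helper name read_safetensors_header and the hand-built demo file are illustrative and not part of the safetensors API.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors-style file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: header size (assumed unsigned 64-bit little-endian).
        (header_size,) = struct.unpack("<Q", f.read(8))
        # Next header_size bytes: UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_size).decode("utf-8"))


# Build a tiny file by hand so the sketch runs without safetensors installed.
# The tensor name and values are made up for illustration.
header = {"weight1": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
payload = struct.pack("<4f", 0.0, 0.0, 0.0, 0.0)  # 4 float32 values = 16 bytes

demo_path = Path(tempfile.mkdtemp()) / "demo.safetensors"
demo_path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)

parsed = read_safetensors_header(demo_path)
print(parsed["weight1"]["shape"], parsed["weight1"]["data_offsets"])
```

Because the header can be read without touching the tensor bytes, a reader can list names, dtypes, and shapes without loading any weights.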
Installation
Pip
You can install safetensors via the pip manager:

pip install safetensors
From source
To build from source, you need Rust:

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Make sure it's up to date and using stable channel
rustup update
git clone https://github.com/huggingface/safetensors
cd safetensors/bindings/python
pip install setuptools_rust
pip install -e .
Getting started
import torch
from safetensors import safe_open
from safetensors.torch import save_file

tensors = {
    "weight1": torch.zeros((1024, 1024)),
    "weight2": torch.zeros((1024, 1024))
}
save_file(tensors, "model.safetensors")

tensors = {}
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        tensors[key] = f.get_tensor(key)
Python documentation
#tensorflow #pytorch #huggingface #tensors #safetensors
GitHub - huggingface/safetensors: Simple, safe way to store and distribute tensors
InternLM team for shipping such brilliant model checkpoints!

Let's gooo! InternLM 2.5 20B with Apache 2.0 license, up to 1M context window & trained on copious amounts of synthetic data! ⚡️

> Beats Gemma 27B IT; MMLU: 73.5, MATH: 64.7
> Up to 20% increase on reasoning tasks over the last iteration
> Supports function calling and tool use
> Base & Instruct models released
> Along with the 20B they release 1.8B and 7B (both looking incredibly strong)
> Uses the same architecture as InternLM2
> Integrated with Transformers (remote code) 🤗

> Interesting bit: they use some form of iterative process to generate synthetic data, train, and improve (would love to know more about this)

https://huggingface.co/collections/internlm/internlm25-66853f32717072d17581bc13 InternLM2.5 - a internlm Collection
Just released: Shining Valiant 2 for Llama 3.1 8b! 2024

- the first SV at 8b size, using the best 8b model
- newest version of the SV dataset improves specialist knowledge and response consistency

3.1 70b will be coming but our next releases will focus on expanding the Build Tools lineup. Get ready for some open-source synthetic datasets made with 3.1 405, coming VERY soon :)
Prompting Guide
Shining Valiant 2 uses the Llama 3.1 Instruct prompt format. The example script below can be used as a starting point for general chat:

import torch
import transformers

model_id = "ValiantLabs/Llama3.1-8B-ShiningValiant2"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Shining Valiant, a highly capable chat AI."},
    {"role": "user", "content": "Describe the role of transformation matrices in 3D graphics."},
]

outputs = pipeline(messages, max_new_tokens=1024)

# The last entry of generated_text is the assistant's reply message.
print(outputs[0]["generated_text"][-1])

https://huggingface.co/ValiantLabs/Llama3.1-8B-ShiningValiant2 ValiantLabs/Llama3.1-8B-ShiningValiant2 · Hugging Face
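For readers curious what the Llama 3.1 Instruct prompt format mentioned above looks like once a message list is flattened to text, here is a rough, hand-rolled sketch. The function name render_llama3_prompt is illustrative, and this is a simplification: in practice the tokenizer's own chat template does this work and may add extra system lines, so treat the output as an approximation rather than the exact string the pipeline builds.

```python
def render_llama3_prompt(messages):
    """Approximate the Llama 3 Instruct chat layout for a list of messages.

    Simplified sketch: the tokenizer's chat template is the authoritative
    source and may insert additional system text.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn: a role header, a blank line, the content, an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    # Open an assistant turn so the model continues with its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are Shining Valiant, a highly capable chat AI."},
    {"role": "user", "content": "Describe the role of transformation matrices in 3D graphics."},
]
prompt = render_llama3_prompt(example_messages)
print(prompt)
```

Passing the message list straight to the pipeline, as the script does, avoids building this string by hand.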
Scalable Nested Optimization for Deep Learning
⚡️ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡️

tl;dr: We develop various optimization tools with highlights, including:
· Making the momentum coefficient complex for adversarial games like GANs.
· Optimizing millions of hyperparameters using implicit differentiation.
· Tuning hyperparameters using hypernetworks.
· Differentiably finding bifurcations in optimization for diverse solutions.

https://arxiv.org/abs/2407.01526
Segment Anything 2 Demo-meta

SAM 2 from Meta FAIR is the first unified model for real-time, promptable object segmentation in images & videos. Using the model in our web-based demo you can segment, track and apply effects to objects in video in just a few clicks.
https://sam2.metademolab.com/ SAM 2 Demo | By Meta FAIR
Really cool to see that SF3D is trending on Huggingface. They created an amazing system for setting up the demos super easily and even extending Gradio was fairly straightforward - I’ve done a relightable viewer for it.

https://huggingface.co/spaces/stabilityai/stable-fast-3d and viewer https://pypi.org/project/gradio-litmodel3d/ Stable Fast 3D - a Hugging Face Space by stabilityai
Introducing Idefics 3 8B Llama 3, Apache 2.0 licensed VLM with enhanced Document QA capabilities! ⚡️

> Vision backbone: SigLip, Text backbone: Llama 3.1 8B
> Text + Image input w/ text output
> 8.5B parameter model
> Supports up to 10K context
> Apache 2.0 licensed
> DocVQA
Link: https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3
New multimodal release: Idefics3!

Adding vision to Llama 3.1 8b 👀
Strong improvement over April's Idefics2: +14 points on DocVQA, +6 points on MathVista 🧠
Interleave up to 60 images with text! 🤯
Comparable performance to the unreleased Llama 3.1 8B multimodal 🦾
8B-parameters: runs natively in one A100 🤏
Open license: Apache 2.0 🤗
Transparent training data: Ethically sourced datasets, built for the community 🥳
Use it today with our branch of transformers: https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3
and our open weights: HuggingFaceM4/Idefics3-8B-Llama3 · Hugging Face
Haiyan Zhang: Fortifying Teams with AI and Optimized Workflows
Last week, I had an opportunity to speak at SIGGRAPH, one of the computer graphics industry’s premier events that focuses on research, education, and skill development. I spoke with Munkhtsetseg Nandigjav, Associate Dean School of Animation & Motion at Savannah College of Art and Design, about my role as the General Manager for Gaming AI at Microsoft Gaming, our Responsible AI framework, and the ways that leaders can support their teams with AI to adapt to an ever-changing industry landscape.

Before I dive into the specifics of AI for Gaming and how I believe it can help change the industry we love for the better, I want to share a bit about my own background and why this matters so much to me.
https://developer.microsoft.com/en-us/games/articles/2024/08/fortifying-teams-with-ai-and-optimized-workflows/
New strategies in fight against AI deepfakes from Google
Google (NASDAQ: GOOGL) has introduced new policy updates to intensify its fight against artificial intelligence (AI)-generated content portraying individuals in explicit contexts without their permission.

In a statement, the tech giant disclosed that it will demote results of explicit deepfakes in Google Search to protect victims from bad actors amid a spike in offensive incidents. Google says the latest tools against deepfakes are an improvement on its existing policies with the most drastic change being the ease of filing complaints.

While victims have always enjoyed the right to request takedowns of non-consensual fake content from Google Search, the latest improvements allow for easy reporting of offensive websites. Google’s statement disclosed that the company will remove duplicates of the derogatory content on the web, building on its experiments with other illicit content.

“These efforts are designed to give people added peace of mind, especially if they’re concerned about similar content about them popping up in the future,” read the statement.

The second weapon in Google’s arsenal against deepfakes is an improvement in the Search ranking system. Google believes its decision to build systems to rank quality information at the top of Search may be “the best protection against harmful content.”

Going forward, the search giant unveiled plans to push AI-generated NSFW (not safe for work) content lower on its rankings to stifle its distribution. For searches involving specific names, Google says it will promote high-quality, non-explicit content to drown out the exposure to AI-generated deepfakes.

There are also plans to demote outright any website that has a slew of reports against it for AI deepfakes, smothering circulation and distribution at the source.

The combination of the features is poised to reduce incidents by up to 70%, but the company notes that the fight is far from finished. For now, Google continues to grapple with telling consensual deepfakes apart from those made without an individual's approval, as search engines are unable to make the distinction.

“These changes are major updates to our protections on Search, but there’s more work to do to address this issue, and we’ll keep developing new solutions to help people affected by this content,” said Google.
https://coingeek.com/google-unveils-new-strategies-in-fight-against-ai-deepfakes/
UCSC-VLAA/MedTrinity-25M A Large-scale Multimodal Dataset
https://huggingface.co/papers/2408.02900
MedTrinity-25M is a large-scale multimodal dataset in the field of medicine.

Key Highlights
Dataset size and coverage: Covers more than 25 million images from 10 modalities with multi-granular annotations for more than 65 diseases.
Richness of annotations: Contains global textual information such as disease/lesion type, modality, region-specific descriptions, and inter-regional relations, as well as detailed local annotations of regions of interest (ROIs) such as bounding boxes and segmentation masks.

Innovative data generation: Developed the first automated pipeline to extend multimodal data by generating multi-granular visual and textual annotations (in the form of image-ROI-description triplets) without image-text pairs.
Data collection and processing: Collected and preprocessed data from more than 90 different sources, and identified ROIs associated with abnormal regions using domain-specific expert models.
SwarmUI startup and creation speed

Perhaps because it is built on ComfyUI, SwarmUI starts up and generates images quite quickly.
It can't be compared directly with the A1111 version of SD-webui, but for now, if you want to run SDXL, this is much more comfortable.
FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer that generates images from text descriptions.

It has many features and uses, but also some limitations and prohibited uses.

Highlights
Powerful generation capabilities: can generate images from text descriptions, output quality is cutting-edge, prompt following is competitive, and generates high-quality images in 1 to 4 steps.

Training method: trained by latent adversarial diffusion distillation.
License and use: licensed under apache-2.0 for personal, scientific and commercial purposes, with reference implementation, sample code and API endpoints.
Usage: can be used through specific github repositories, ComfyUI, Diffusers, etc., and corresponding code examples are given.

Limitations: cannot provide factual information, may amplify social biases, may generate output that does not match the prompt, and prompt following is affected by prompt style.
Prohibited use: It cannot be used to violate laws and regulations, harm minors, generate false and harmful information, disseminate personally identifiable information, harass or bully others, create illegal content, make fully automated decisions that affect personal legal rights, or create false information on a large scale.
https://huggingface.co/black-forest-labs/FLUX.1-schnell black-forest-labs/FLUX.1-schnell · Hugging Face
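As a rough illustration of the Diffusers route mentioned above, the sketch below wraps a FLUX.1 [schnell] generation in a small function. It assumes a recent diffusers release that ships FluxPipeline, a working torch install, and enough memory for a 12 billion parameter model. The function name generate_schnell_image, the prompt, and the output path are made up for illustration, and the sampling settings (no guidance, 4 steps) mirror the model card's example as best recalled, so verify them against the model page before relying on them.

```python
def generate_schnell_image(
    prompt,
    output_path="flux-schnell.png",  # hypothetical output file name
    num_inference_steps=4,           # schnell targets 1 to 4 steps
    seed=0,
):
    """Generate one image with FLUX.1 [schnell] via Diffusers and save it."""
    # Imported lazily so defining this function does not require the
    # heavy dependencies to be installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell",
        torch_dtype=torch.bfloat16,
    )
    # Offload idle submodules to CPU to reduce GPU memory pressure.
    pipe.enable_model_cpu_offload()

    image = pipe(
        prompt,
        guidance_scale=0.0,  # assumption: the distilled model runs without guidance
        num_inference_steps=num_inference_steps,
        max_sequence_length=256,
        generator=torch.Generator("cpu").manual_seed(seed),
    ).images[0]
    image.save(output_path)
    return output_path
```

Calling generate_schnell_image("A cat holding a sign that says hello world") downloads the weights on first use, which takes considerable time and disk space.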
FLUX.1 [dev] is a 12 billion parameter rectified flow transformer that can generate images from text descriptions.

Highlights
Powerful generation capabilities: It can generate images from text descriptions, and the output quality is cutting-edge, second only to the FLUX.1 [pro] model.

Excellent performance: It has competitive prompt following capabilities, matching the performance of closed-source alternatives.

Efficiency: Trained using guidance distillation, which makes the model more efficient.

Open weight usage: Open weights to promote new scientific research and empower artists to develop innovative workflows.

Multiple usage paths: It provides a reference implementation and sample code, is available through API endpoints from multiple sources, can be used for local inference in ComfyUI, and can be used with the diffusers library.

Restrictions on use: It cannot provide factual information, may amplify social bias, may not match prompts, and is greatly affected by prompt style.

Prohibited uses: It clarifies a series of prohibited uses that violate laws and regulations, harm others, etc.

License: Released under the FLUX.1 [dev] non-commercial license.
https://huggingface.co/black-forest-labs/FLUX.1-dev black-forest-labs/FLUX.1-dev · Hugging Face
this is amazing because Flux Schnell is, well, super-schnell (aka fast) and amazing at prompt following, while the IPA SDXL step gives it all the texture and style you would ever need

you can use this in the Glif Chrome Extension (pick the Flux Guided Style Transfer preset) or on glif:

https://chromewebstore.google.com/detail/glif-remix-the-web-with-a/abfbooehhdjcgmbmcpkcebcmpfnlingo

or here directly: https://glif.app/@fab1an/glifs/clzjsqemt00006mh13re8uu3b