Ghost 8B Beta 1608: Empowering Your AI Assistant
Unlock the Power of Ghost 8B Beta 1608: Build Your Personal AI Companion
Ghost 8B Beta 1608 empowers you to create a safe and multilingual AI assistant tailored to your needs, directly on your personal computer. Leverage AI's capabilities within your own space! Ghost 8B Beta 1608 is ready to become your AI companion.
lamhieu/ghost-8b-beta-8k
ghost-x/ghost-8b-beta-668ead6179f93be717db4542
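If you want to try it locally, here is a minimal sketch using the standard transformers causal-LM API; it assumes the checkpoint ships a tokenizer with a chat template, so check the model card for the recommended settings:

```python
# Minimal local-inference sketch (not the official quickstart): assumes the
# standard transformers causal-LM interface and a bundled chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lamhieu/ghost-8b-beta-8k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What can you help me with?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```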
Looking for a Generative AI trainer/speaker for an AI accelerator program (virtual/online sessions).
To get more context about the program, please visit the program landing page: https://llamadesigndrive.com
A new shape-optimized SigLIP just dropped:
google/siglip-so400m-patch14-224
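For a quick spin, a sketch using the transformers zero-shot image classification pipeline (the image URL and labels are arbitrary examples):

```python
# Sketch: zero-shot image classification with the new SigLIP checkpoint via
# the transformers pipeline API.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="google/siglip-so400m-patch14-224",
)
result = classifier(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    candidate_labels=["two cats", "a dog", "an airplane"],
)
print(result)
```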
Liger Kernel: Efficient Triton Kernels for LLM Training
LIGER "is a [Hugging Face-compatible] collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduces memory usage by 60%."
GitHub: https://github.com/linkedin/Liger-Kernel
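A rough idea of how it is applied in practice, based on the project's monkey-patching entry points (verify flag names against the current README):

```python
# Sketch based on the project's monkey-patching API: swap a Llama-style model's
# modules for Liger's Triton kernels, then train as usual.
from liger_kernel.transformers import apply_liger_kernel_to_llama
from transformers import AutoModelForCausalLM

# Patch before instantiating the model so the patched modules are picked up.
apply_liger_kernel_to_llama(
    rope=True,                         # fused rotary position embeddings
    rms_norm=True,                     # fused RMSNorm
    swiglu=True,                       # fused SwiGLU MLP
    fused_linear_cross_entropy=True,   # big memory saver on the LM head
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
# ...training loop unchanged; forward/backward now run through Liger kernels.
```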
LIGER "is a [Hugging Face-compatible] collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduces memory usage by 60%."
GitHub: https://github.com/linkedin/Liger-Kernel
Neural Network (1-byte explainer for everybody)
Just like our brain, a Neural Network is made up of interconnected "neurons". These neurons work together by learning from (input) data and getting better at tasks (in the hidden layer) to give (output) predictions or decisions.
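To make the input -> hidden -> output picture concrete, a toy forward pass in plain NumPy (sizes and weights are arbitrary):

```python
# Toy forward pass: input -> hidden -> output. Weights are random here;
# training would adjust them so the output gets closer to the right answer.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)                          # input: 3 features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer: 1 value

hidden = np.maximum(0, W1 @ x + b1)  # each neuron: weighted sum, then ReLU
output = W2 @ hidden + b2            # output combines the hidden neurons
print(output)
```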
AI21 iterates with new Jamba 1.5 release: New standard for long-context use-cases!
@ai21labs used a different architecture to beat the status-quo Transformer models: the Jamba architecture combines classic Transformer layers with new Mamba layers, whose complexity is a linear (instead of quadratic) function of the context length.
What does this imply?
Jamba models are much more efficient for long contexts: up to 2.5x faster on long contexts, with lower memory use, and better recall of everything in the prompt.
That means it's a new go-to model for RAG or agentic applications!
And the performance is not too shabby: Jamba 1.5 models are comparable in perf to similar-sized Llama-3.1 models! The largest model even outperforms Llama-3.1 405B on Arena-Hard.
- Comes in 2 sizes: Mini (12B active/52B total) and Large (94B active/399B total)
- Both deliver a 256k context window with low memory use: Jamba 1.5 Mini fits a 140k-token context on a single A100.
- New quantization method: ExpertsInt8 quantizes only the weights of the MoE layers, which account for 85% of the model's weights.
- Natively supports JSON format generation & function calling.
- Permissive license *if your org makes <$50M revenue*
Available on the Hub:
ai21labs/jamba-15-66c44befa474a917fcf55251
Read their release blog post: https://www.ai21.com/blog/announcing-jamba-model-family
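A minimal loading sketch with transformers; the repo id is an assumption based on the release naming, so check the collection above for the exact name:

```python
# Loading sketch with transformers. Requires a recent transformers version
# with Jamba support; the hybrid Transformer/Mamba stack is used like any
# causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Mini"  # assumption: verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Summarize the following report:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```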
The latest timm validation & test set results are now viewable in a leaderboard space:
timm/leaderboard
As of yesterday, I updated all of the results for the ImageNet, ImageNet-ReaL, ImageNet-V2, ImageNet-R, ImageNet-A, and Sketch sets. The csv files can be found in the GH repo https://github.com/huggingface/pytorch-image-models/tree/main/results
Unfortunately the latest benchmark csv files are not yet up to date; there are gaps between the dataset results and the throughput/FLOP numbers that impact the plots.
h/t to @MohamedRashad for making the first timm leaderboard.
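For example, to rank models from one of those csv files (filename and column names taken from the results/ directory; adjust if the layout has changed):

```python
# Sketch: rank models by ImageNet top-1 from one of the csv files in the repo.
import pandas as pd

url = ("https://raw.githubusercontent.com/huggingface/pytorch-image-models/"
       "main/results/results-imagenet.csv")
df = pd.read_csv(url)
print(df.sort_values("top1", ascending=False)[["model", "top1", "top5"]].head(10))
```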
Huge updates and improvements for FLUX LoRA training: https://www.patreon.com/posts/kohya-flux-lora-110293257
10 GB, 16 GB, 24 GB and 48 GB GPU configs added - 10 GB config is like 3x to 5x slower sadly
Massed Compute, RunPod and Windows Kohya SS GUI LoRA installers added to the zip file
I'm also currently testing a new 16 GB FLUX LoRA training config and a new approach to regularization images, plus the Apply T5 Attention Mask option. Let's see whether the Kohya FLUX LoRA workflow gets even better.
Massive grid comparisons are also shared here: https://www.reddit.com/r/StableDiffusion/comments/1eyj4b8/kohya_ss_gui_very_easy_f
Shoutout to everyone who participated in BigScience! It doesn't get enough credit, but IMO it paved the way for open-source LLMs!
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2211.05100)
bigscience/bloom
bigscience/bloomz
https://huggingface.co/bigscience/bloom
ACL 2024: The Missing Papers
Apparently, some papers from ACL 2024 are still not listed in the ACL Anthology. While this issue will hopefully be fixed soon, we should give those papers some additional spotlight.
Some of my favorites:
1. Dolma is an English corpus that encompasses 3 trillion tokens. Additionally, it is accompanied by an exceptional software package that considerably advances the state of the art in preparing data for LLM pretraining. (Source: I am currently using Dolma.)
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research (2402.00159)
2. In the paper "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models", the authors show how extending the context length impacts an LLM's reasoning performance. I asked myself a similar question a few months ago, and therefore this paper is highly interesting to me.
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models (2402.14848)
This was brought to my attention through a LinkedIn post by @ShayeghB, who is also affected:
Ensemble-Based Unsupervised Discontinuous Constituency Parsing by Tree Averaging (2403.00143)
View all the missing papers here:
https://theshayegh.github.io/ACL2024MissingPapers/
v2ray/deepgelbooru
A Danbooru-tag image tagger that may do better than WD14 on some images.
Training code, inference code, dataset included.
:3
https://huggingface.co/v2ray/deepgelbooru
I can't believe this... Phi-3.5-mini (3.8B) running in-browser at ~90 tokens/second on WebGPU w/ Transformers.js and ONNX Runtime Web! Since everything runs 100% locally, no messages are sent to a server, a huge win for privacy!
- Demo:
webml-community/phi-3.5-webgpu
- Source code: https://github.com/huggingface/transformers.js-examples/tree/main/phi-3.5-webgpu
You can now use DoRA for your embedding layers!
PR: https://github.com/huggingface/peft/pull/2006
I have documented my journey of this specific PR in a blog post for everyone to read. The highlight of the PR was when the first author of DoRA reviewed my code.
Blog Post: https://huggingface.co/blog/ariG23498/peft-dora
Huge thanks to @BenjaminB for all the help I needed.
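A minimal sketch of what the PR enables, assuming a model whose embedding module is named embed_tokens (module names vary by architecture):

```python
# Sketch: DoRA on an embedding layer via peft (post-PR #2006). The target
# module name is model-specific ("embed_tokens" fits OPT/Llama-style models).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
config = LoraConfig(
    r=8,
    lora_alpha=16,
    use_dora=True,                    # weight-decomposed low-rank adaptation
    target_modules=["embed_tokens"],  # the embedding layer, now DoRA-capable
)
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```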
Check out the new dataset sourced from Fishki.net, one of the popular entertainment and news portals on the Russian internet, known for its diverse content including humor, interesting facts, and viral stories:
nyuuzyou/fishkinet-posts
Dataset highlights:
- 369,180 posts
- Includes original posts with titles, content, images, and metadata
- Each entry contains URL, title, author, date, tags, content, and image URLs
- Primarily in Russian
- Covers a wide range of topics in entertainment, news, and social media content
- Spans nearly two decades of posts, likely from the early 2000s to 2024
- Dedicated to public domain under Creative Commons Zero (CC0) license
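A quick way to peek at the data without downloading all 369k posts (split and field names are assumptions based on the highlights above):

```python
# Peek at a few posts via streaming, without downloading the full dataset.
from datasets import load_dataset

ds = load_dataset("nyuuzyou/fishkinet-posts", split="train", streaming=True)
for post in ds.take(3):
    print(post["title"], post["url"])
```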
Alan Turing's mind-bender: "Can machines think?" in its glorified form. This 74-year-old paper laid the foundation for how we think about AI and machine intelligence today. The level of detail, clarity and foresight is just phenomenal; he was way ahead of his time.
Original copy: https://archive.org/details/MIND--COMPUTING-MACHINERY-AND-INTELLIGENCE
/g/ - /ldg/ - Local Diffusion General - Technology
>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio
>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
Introducing HelpingAI2-9B, an emotionally intelligent LLM.
Model Link:
OEvortex/HelpingAI2-9B
Demo Link:
Abhaykoul/HelpingAI2
This model is part of the innovative HelpingAI series and it stands out for its ability to engage users with emotional understanding.
Key Features:
-----------------
* It scores 95.89 on EQ-Bench, higher than all top-notch LLMs, reflecting advanced emotional recognition.
* It responds in an empathetic and supportive manner.
Be sure to try our demo:
Abhaykoul/HelpingAI2
NEW math-instruct model + dataset!
ValiantLabs/Llama3.1-8B-Cobalt
is our new math-instruct model.
Trained using a synthetic math-instruct dataset generated with Llama 3.1 405B. Find the dataset here:
sequelbox/Polytope
More to come soon :)
Supercool Weekend Read
Nvidia researchers achieved SOTA LLM compression metrics using pruning and knowledge distillation techniques.
Details on Techniques (Simplified):
They started off with a large pre-trained language model (15B params), then:
1. Estimated the importance of different parts of the model (neurons, attention heads, layers) using activation-based metrics on a small calibration dataset.
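As a simplified illustration of step 1 (not the paper's exact metric), per-neuron importance can be estimated from activation magnitudes on a calibration set using forward hooks:

```python
# Simplified sketch: score each linear layer's output neurons by mean
# activation magnitude on a calibration set, collected with forward hooks;
# low scores mark pruning candidates.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
calibration = [torch.randn(8, 64) for _ in range(16)]  # stand-in batches

scores = {}
def make_hook(name):
    def hook(module, inputs, output):
        # accumulate mean |activation| per output neuron
        scores[name] = scores.get(name, 0) + output.abs().mean(dim=0)
    return hook

handles = [m.register_forward_hook(make_hook(n))
           for n, m in model.named_modules() if isinstance(m, nn.Linear)]
with torch.no_grad():
    for batch in calibration:
        model(batch)
for h in handles:
    h.remove()

for name, s in scores.items():  # 5 least-important neurons per layer
    print(name, torch.topk(-s, k=5).indices.tolist())
```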
What Happens When RAG Systems Become Fully Vision-Language-Model-Based?
HF Demo:
bokesyo/MiniCPMV-RAG-PDFQA
Multimodal Dense Retriever:
RhapsodyAI/minicpm-visual-embedding-v0
Generation Model:
openbmb/MiniCPM-V-2_6
GitHub: https://github.com/RhapsodyAILab/MiniCPM-V-Embedding-v0-Train
The vision-language-model dense retriever MiniCPM-Visual-Embedding-v0 reads PDFs directly, with no separate OCR step required. With strong OCR and visual understanding capabilities, it generates multimodal dense representations, allowing you to build and search through your personal library with ease.
Ask a question, it retrieves the most relevant pages. Then, MiniCPM-V-2.6 provides answers based on the retrieved pages, with strong multi-image understanding capabilities.
Whether you're working with a visually-intensive or text-oriented PDF, it helps you quickly find the information you need. You can also build a personal library with it.
It operates just like a human: reading, storing, retrieving, and answering with full visual comprehension.
Currently, the online demo supports PDFs with up to 50 pages due to GPU time limits. For longer PDFs or entire books, you can deploy it on your own machine.
https://cdn-uploads.huggingface.co/production/uploads/6415818a986557e8cac252bf/sjtQD7CFgox46h9EVHCG_.png
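Schematically, the pipeline looks like the sketch below; embed_images and embed_query are hypothetical placeholders for the retriever's actual API (see the GitHub repo), and only the similarity-based retrieval step is spelled out:

```python
# Pipeline sketch. embed_images / embed_query are hypothetical placeholders
# for the retriever's real API; retrieval here is plain cosine similarity.
import torch
import torch.nn.functional as F

def retrieve(query_emb: torch.Tensor, page_embs: torch.Tensor, k: int = 3):
    # cosine similarity between the query vector and every page embedding
    sims = F.cosine_similarity(query_emb, page_embs)  # shape: (num_pages,)
    return sims.topk(k).indices.tolist()

# 1. Render each PDF page to an image (e.g., with pdf2image).
# 2. page_embs = embed_images(pages)               # one dense vector per page
# 3. query_emb = embed_query("What does Figure 3 show?")
# 4. top_pages = retrieve(query_emb, page_embs)
# 5. Pass the retrieved page images plus the question to MiniCPM-V-2.6.
```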