Share and discover more about AI with social posts from the community.
📢 Three-hop Chain-of-Thought (💡 aspect + 🤔 opinion + 🧠 reason) combined with an LLM is a solid approach to reasoning about the emotions of participants in textual dialogues.
Delighted to share a tutorial video that covers:
✅ How to apply an LLM properly to implicit IR
✅ Ways to align different information types (causes and states) within the same LLM
✅ How to launch an LLM in Google Colab that can extract characters' emotions in dialogues 🧪

🎥: https://www.youtube.com/watch?v=vRVDQa7vfkU

Project: https://github.com/nicolay-r/THOR-ECAC
Paper: https://aclanthology.org/2024.semeval-1.4/
Model card:
nicolay-r/flan-t5-emotion-cause-thor-base
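The three hops named in the post (aspect, opinion, reason) can be chained as successive prompts, each one conditioned on the previous answers. The sketch below only illustrates that chaining. The prompt wording, the `three_hop_emotion` helper and the `ask` callable are assumptions made for illustration, not the THOR-ECAC project's actual templates or API.

```python
from typing import Callable, Dict, List


def three_hop_emotion(utterance: str, speaker: str,
                      ask: Callable[[str], str]) -> Dict[str, str]:
    """Run three chained prompts; `ask` maps a prompt string to a model reply."""
    context = f'Utterance by {speaker}: "{utterance}"'
    # Hop 1: what the utterance is about.
    aspect = ask(f"{context}\nWhich aspect of the situation is being discussed?")
    # Hop 2: the speaker's implicit opinion, conditioned on the aspect.
    opinion = ask(f"{context}\nAspect: {aspect}\n"
                  f"What is {speaker}'s underlying opinion about it?")
    # Hop 3: the reasoned emotion, conditioned on both earlier answers.
    reason = ask(f"{context}\nAspect: {aspect}\nOpinion: {opinion}\n"
                 f"Which emotion does {speaker} feel, and why?")
    return {"aspect": aspect, "opinion": opinion, "reason": reason}


# Hypothetical stand-in for a real model call, so the sketch runs offline.
prompts_seen: List[str] = []


def fake_ask(prompt: str) -> str:
    prompts_seen.append(prompt)
    return f"answer-{len(prompts_seen)}"


result = three_hop_emotion("I can't believe you forgot again.", "Anna", fake_ask)
```

In practice `ask` would wrap a call to whichever model is being used; keeping it as a plain callable makes the chaining logic testable without one.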
The Romulus model series has been released on Hugging Face. The models were continually pre-trained on 34,864,949 tokens of French laws and are intended to serve as a foundation for fine-tuning on labeled data 🤗

The training code, dataset and model weights are open and freely available on HF. Training ran on H100 GPUs provided by Microsoft for Startups, using Unsloth AI by @danielhanchen and @shimmyshimmer 🦥

Link to the base model:
louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1


Link to the instruct model:
louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1-Instruct


Link to the dataset:
louisbrulenaudet/Romulus-cpt-fr


Please note that these models have not been aligned to produce usable text as they stand, and will certainly need further fine-tuning for the desired tasks to produce satisfactory results.
> Want to know how much an API LLM call costs you?

I've just made this Space that gets you the API price for any LLM call, for nearly all inference providers out there!

This is based on a comment by @victor under my HF Post a few months back, and leverages BerriAI's data for LLM prices.

Check it out here 👉
m-ric/text_to_dollars
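The arithmetic behind such an estimate is typically token counts multiplied by per-token rates, with separate rates for input and output. A minimal sketch follows, assuming prices quoted in USD per million tokens. The function name and the example prices are hypothetical; they are not taken from the Space or from any provider's price list.

```python
def call_cost_usd(input_tokens: int, output_tokens: int,
                  usd_per_million_input: float,
                  usd_per_million_output: float) -> float:
    """Estimate the price of one LLM call from token counts and per-million rates."""
    total = (input_tokens * usd_per_million_input
             + output_tokens * usd_per_million_output)
    return total / 1_000_000


# Hypothetical prices: 2.50 USD per 1M input tokens, 10.00 USD per 1M output tokens.
cost = call_cost_usd(input_tokens=1200, output_tokens=300,
                     usd_per_million_input=2.50, usd_per_million_output=10.00)
print(f"Estimated cost: ${cost:.4f}")
```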
Auto-regressive LMs have ruled, but encoder-based architectures like GLiNER are proving to be just as powerful for information extraction while offering better efficiency and interpretability. 🔍✨

Past encoder backbones were limited by small pre-training datasets and old techniques, but with innovations like LLM2Vec, we've transformed decoders into high-performing encoders! 🔄💡

What's New?
🔹 Converted Llama & Qwen decoders to advanced encoders
🔹 Improved the GLiNER architecture to work with rotary positional encoding
🔹 New GLiNER (zero-shot NER) & GLiClass (zero-shot classification) models

🔥 Check it out:

New models:
knowledgator/llm2encoder-66d1c76e3c8270397efc5b5e


GLiNER package: https://github.com/urchade/GLiNER

GLiClass package: https://github.com/Knowledgator/GLiClass

💻 Read our blog for more insights, and stay tuned for what's next!
https://medium.com/@knowledgrator/llm2encoders-e7d90b9f5966
Free research tip:
Get used to writing the first draft of your paper in Markdown using VS Code's Jupyter notebook extension. It lets you do quick sanity checks with code and maths, an absolute AAA experience :)
made an image similarity demo to test out the
mistral-community/pixtral-12b-240910
model.

If anyone knows how to generate captions with it, please do let me know x 🚀

here's the demo :
Tonic/Pixtral


hope you like it 🤗
What if we asked the AI what it thought of our Hugging Face profile? 👹
I've released a new Space capable of doing it... watch out, it hits hard! 🥊

Try it now ➡️
enzostvs/hugger-roaster


Share your roast below 👇
If you are interested in deep reinforcement learning, find below my ICML paper on how we can detect adversaries in deep reinforcement learning:

Paper: Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions
Link: https://proceedings.mlr.press/v202/korkmaz23a.html
Arcee releases SuperNova, a better fine-tune of Llama-3.1-70B!

2️⃣ versions: 70B and 8B
🧠 Trained by distilling logits from Llama-3.1-405B
Used a clever compression method to shrink the dataset from 2.9 petabytes to 50 GB (they may share it in a paper)
⚙️ Not all benchmarks improve: GPQA and MUSR drop slightly
🤗 8B weights are available on HF (the 70B weights are not)
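
The post does not say which loss Arcee used for distilling logits. As a rough illustration only, the sketch below shows the commonly used formulation: a KL divergence between temperature-softened teacher and student distributions for a single token position. The function names, the temperature and the toy logits are all assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float], student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) over softened distributions for one token position."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 1.0, 0.0]        # toy teacher logits over a 3-token vocabulary
close_student = [1.8, 1.1, 0.1]  # nearly matches the teacher
far_student = [0.0, 1.0, 2.0]    # ranks the tokens in the opposite order

loss_same = distillation_loss(teacher, teacher)
loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what pushes the student towards the teacher during training.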

Read their blog post 👉 https://blog.arcee.ai/arcee-supernova-training-pipeline-and-model-composition/
Model weights (8B) 👉
arcee-ai/Llama-3.1-SuperNova-Lite
🚀 Sentence Transformers v3.1 is out! It features a hard negatives mining utility to get better models out of your data, a strong new loss function, training with streaming datasets, custom modules, bug fixes, small additions and docs changes. Here are the details:

⛏ Hard Negatives Mining Utility: Hard negatives are texts that are rather similar to some anchor text (e.g. a question), but are not the correct match. They're difficult for a model to distinguish from the correct answer, often resulting in a stronger model after training.
📉 New loss function: This loss function works very well for symmetric tasks (e.g. clustering, classification, finding similar texts/paraphrases) and a bit less so for asymmetric tasks (e.g. question-answer retrieval).
💾 Streaming datasets: You can now train with datasets.IterableDataset, which doesn't require downloading the full dataset to disk before training. It's as simple as passing "streaming=True" to "datasets.load_dataset".
🧩 Custom Modules: Model authors can now customize a lot more of the components that make up Sentence Transformer models, allowing for a lot more flexibility (e.g. multi-modal, model-specific quirks, etc.)
✨ New arguments to several methods: encode_multi_process gets a progress bar, push_to_hub can now be done to different branches, and CrossEncoders can be downloaded to specific cache directories.
🐛 Bug fixes: Too many to name here, check out the release notes!
📝 Documentation: A particular focus on clarifying the batch samplers in the Package Reference this release.

Check out the full release notes here ⏭️: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.1.0

I'm very excited to hear your feedback, and I'm looking forward to the future changes that I have planned, such as ONNX inference! I'm also open to suggestions for new features: feel free to send me your ideas.
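
To make the idea of hard negatives from the release summary concrete, here is a toy sketch: rank every candidate text by similarity to the anchor, drop the known positive, and keep the top few. It does not use the library's own mining utility. The `toy_hard_negatives` helper and the hand-written three-dimensional "embeddings" are assumptions for illustration; real embeddings would come from a model.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_hard_negatives(anchor: Sequence[float], positive_id: str,
                       corpus: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the k corpus ids most similar to the anchor, excluding the positive."""
    scored = [(cosine(anchor, vector), doc_id)
              for doc_id, vector in corpus.items() if doc_id != positive_id]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hand-written toy embeddings standing in for real model outputs.
corpus = {
    "answer": [0.9, 0.1, 0.0],     # the correct match for the anchor
    "near_miss": [0.8, 0.3, 0.0],  # similar, but not the correct match
    "related": [0.5, 0.5, 0.1],
    "unrelated": [0.0, 0.1, 0.9],
}
anchor = [1.0, 0.2, 0.0]

hard_negatives = toy_hard_negatives(anchor, positive_id="answer", corpus=corpus, k=2)
```

The texts that come out on top are exactly the ones a model is most likely to confuse with the correct answer, which is why training on them tends to help.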
Please check the Open Source AI Network: we mapped the top 500 HF users based on their followers' profiles.

The map can be found here:
bunkalab/mapping_the_OS_community
Finally tried Kotaemon, an open-source RAG tool for document chat!

With local models, it's free and private. Perfect for journalists and researchers.

I put Kotaemon to the test with the EPA's Greenhouse Gas Inventory. It accurately answered questions on the CO2 percentage in 2022 emissions and compared 2022 vs 2021 data.

🛠 Kotaemon's no-code interface makes it user-friendly.
- Use your own models or APIs from OpenAI or Cohere
- Great documentation & easy installation
- Multimodal capabilities + reranking
- View sources, navigate docs & create graphRAG

🌟 Kotaemon is gaining traction with 11.3k GitHub stars

Try the online demo:
cin-model/kotaemon-demo

GitHub: https://github.com/Cinnamon/kotaemon
Docs: https://cinnamon.github.io/kotaemon/usage/
Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford et al. from OpenAI. Trained on >5M hours of labeled data, Whisper demonstrates a strong ability to generalise to many datasets and domains in a zero-shot setting.

Whisper large-v3 has the same architecture as the previous large and large-v2 models, except for the following minor differences:

- The spectrogram input uses 128 Mel frequency bins instead of 80
- A new language token for Cantonese

The Whisper large-v3 model was trained on 1 million hours of weakly labeled audio and 4 million hours of pseudo-labeled audio collected using Whisper large-v2. The model was trained for 2.0 epochs over this mixture dataset.

The large-v3 model shows improved performance over a wide variety of languages, showing a 10% to 20% reduction in errors compared to Whisper large-v2. For more details on the different checkpoints available, refer to the section Model details.

Disclaimer: Content for this model card has partly been written by the 🤗 Hugging Face team, and partly copied and pasted from the original model card.
MiniCPM3-4B is the 3rd generation of the MiniCPM series. The overall performance of MiniCPM3-4B surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and is comparable with many recent 7B~9B models.

Compared to MiniCPM1.0/MiniCPM2.0, MiniCPM3-4B has a more powerful and versatile skill set to enable more general usage. MiniCPM3-4B supports function calling, along with a code interpreter. Please refer to Advanced Features for usage guidelines.

MiniCPM3-4B has a 32k context window. Equipped with LLMxMapReduce, MiniCPM3-4B can theoretically handle unlimited context without requiring a huge amount of memory.
FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post.

Key Features
- Cutting-edge output quality, second only to our state-of-the-art model FLUX.1 [pro].
- Competitive prompt following, matching the performance of closed source alternatives.
- Trained using guidance distillation, making FLUX.1 [dev] more efficient.
- Open weights to drive new scientific research, and empower artists to develop innovative workflows.
- Generated outputs can be used for personal, scientific, and commercial purposes as described in the FLUX.1 [dev] Non-Commercial License.
When the three AI Godfathers join hands to write a paper, you know it's nothing short of classic genius! This was an excellent read, I hope they write one on Generative AI.

Read: https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf
🎓 Introducing the конспекты-уроков.рф Lesson Plans Dataset -
nyuuzyou/classnotes


Dataset highlights:
- Metadata for 65,068 lesson plans from конспекты-уроков.рф
- 58,433 lesson plans available in original format
- Multilingual content: Primarily Russian, with some Kazakh, Ukrainian, Belarusian, and English
- Each entry includes: URL, title, description, author, publication date, file size, and download link
- Data reflects educational materials accessible through the конспекты-уроков.рф platform
- Licensed under Creative Commons (https://creativecommons.org/licenses/by-nc/3.0/deed.en)

This dataset offers a unique window into online educational resources, particularly in Russian-language contexts. It provides opportunities for analyzing lesson plan trends, topic distributions, and language patterns in educational materials. The dataset is particularly well-suited for tasks such as text classification and text retrieval in multilingual educational settings.
> Article read: Simple guide to LLM inference and to TGI

I've just read the article "LLM inference at scale with TGI" by @martinigoyanes. It's really good content, a must-read if you want a good low-level intro to LLM inference with TGI!

My takeaways:

How does inference work?
🧠 Prefill: the input prompt is tokenized on CPU, then transferred to GPU. Then a single forward pass generates the first token.
🔄 Decode: the model generates ("decodes") tokens one by one, each time appending the new token to the current input of length N, then generating the next token from this augmented input of length N+1. This loop ends either when a special "End-of-sequence" token is generated or when the completion reaches a pre-specified maximum length. Then the sequence is de-tokenized on CPU to yield text again.
⏱ This step's speed determines the Time Per Output Token, which directly translates to the key metric: throughput.

🤔 How was the separation between the two steps decided? Why does prefill include this strange generation of only one token at the end?
➡️ The cost of attention scales quadratically with the number of tokens, so it can explode quickly.
To compensate for that, an important technique called KV caching was devised. When generating token N+1, the Key and Value (K and V) matrices computed inside the Transformer are a simple extension of the K and V from the previous step, so the model caches them between steps. Hence the separation: prefill is the part that prepares this KV cache, while decoding is the part that leverages it and extends it by one entry at each step.
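
The prefill/decode split and the role of the KV cache can be seen in a toy single-head attention loop. This is only a sketch under simplifying assumptions: the K and V "projections" are fixed scalings rather than learned matrices, the query is the raw embedding, and the new tokens are given instead of being sampled by a model. All names here are hypothetical and unrelated to TGI's code.

```python
import math
from typing import List, Tuple

Vector = List[float]
stats = {"projections": 0}  # counts how many K/V projections are computed


def project_kv(embedding: Vector) -> Tuple[Vector, Vector]:
    """Toy K and V projections (assumed: fixed scalings, not learned matrices)."""
    stats["projections"] += 1
    return [2.0 * x for x in embedding], [0.5 * x for x in embedding]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Single-head softmax attention of one query over all keys and values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]


def prefill(prompt: List[Vector]) -> Tuple[List[Vector], List[Vector], Vector]:
    """Build the KV cache for the whole prompt and attend from its last token."""
    keys, values = [], []
    for embedding in prompt:
        k, v = project_kv(embedding)
        keys.append(k)
        values.append(v)
    return keys, values, attend(prompt[-1], keys, values)


def decode_step(new_token: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Extend the cache by one entry and attend from the new token only."""
    k, v = project_kv(new_token)
    keys.append(k)
    values.append(v)
    return attend(new_token, keys, values)


prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_tokens = [[0.5, -0.5], [-1.0, 0.25]]  # assumed given; a real model samples them

# With the cache: every token is projected exactly once.
keys, values, _ = prefill(prompt)
for token in new_tokens:
    cached_output = decode_step(token, keys, values)
calls_with_cache = stats["projections"]

# Without the cache: the whole sequence is re-projected at every step.
stats["projections"] = 0
sequence = list(prompt)
prefill(sequence)
for token in new_tokens:
    sequence.append(token)
    _, _, recomputed_output = prefill(sequence)
calls_without_cache = stats["projections"]
```

Both paths produce the same attention output for the last token, but the cached path computes 5 projections where the recomputing path needs 12, and the gap widens with every additional generated token.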

TGI-specific takeaways:
⚙️ TGI has many SOTA techniques for decoding: Paged Attention, KV Caching and Flash Attention…
🔀 TGI's router handles generations that finish early because of an EOS token: instead of static batching, it continuously batches requests to the inference engine and filters out finished requests.
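
The benefit of continuous over static batching can be illustrated with a small step-counting simulation. This is a toy model, not TGI's router: each request is reduced to the number of decode steps it needs before emitting EOS, and the function names and example workload are assumptions.

```python
from collections import deque
from typing import List


def static_batching_steps(lengths: List[int], batch_size: int) -> int:
    """Each fixed batch runs until its longest request finishes."""
    return sum(max(lengths[i:i + batch_size])
               for i in range(0, len(lengths), batch_size))


def continuous_batching_steps(lengths: List[int], batch_size: int) -> int:
    """Finished requests leave after every step and queued ones take their slot."""
    waiting = deque(lengths)
    running: List[int] = []  # remaining decode steps per active request
    steps = 0
    while waiting or running:
        # Fill any free slots from the queue before the next decode step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1
        # One decode step for every active request; drop those that hit EOS.
        running = [remaining - 1 for remaining in running if remaining - 1 > 0]
    return steps


# Hypothetical workload: decode steps needed per request, two slots available.
request_lengths = [2, 8, 3, 3]
static_steps = static_batching_steps(request_lengths, batch_size=2)
continuous_steps = continuous_batching_steps(request_lengths, batch_size=2)
print(static_steps, continuous_steps)
```

With this workload the static scheduler needs 11 decode steps, because the short request in the first batch idles while the long one finishes. The continuous scheduler needs 8, since the freed slot is reused immediately.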
Help me to upgrade my model.

Hi all, I am a complete beginner in coding. However, with the help of Claude (similar to Matt :P) and GPT-4o, I have been able to develop this RAG PDF summarizer/Q&A tool plus a web search tool.

The application is built specifically for summarization tasks, including summarizing financial documents, news articles, resumes, research documents, call transcripts, etc.

The Space can be found here:
Shreyas094/SearchGPT


The news tool simply uses DuckDuckGo chat to generate the search results with the Llama 3.1 70B model.

I would like your support in fine-tuning the retrieval task to handle more unstructured documents.