# Excited to Share: New LLM Tokenization - Convert Text to tokens and vice versa! 🚀

I've just developed a handy tool for anyone working with Large Language Models (LLMs) or diving into Natural Language Processing (NLP).

🔍 Introducing the LLM Tokenization tool - convert text to tokens and vice versa!

Key Features:
- Convert text to tokens and token IDs
- Reverse engineer: convert token IDs back to text
- Support for popular models: Llama 3 (more models will be added iteratively)
- User-friendly Gradio interface for easy interaction

Whether you're debugging your NLP pipeline, exploring how different models tokenize text, or just curious about the inner workings of LLMs, this tool is for you!

👩‍💻 Tech Stack:
- Python
- Gradio for the web interface
- Hugging Face Transformers for tokenization
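
Under the hood, the conversions come down to a few Transformers calls. Here is a minimal sketch; the checkpoint below is just a stand-in (Llama 3 access is gated), and any Hugging Face tokenizer exposes the same methods:

```python
from transformers import AutoTokenizer

# "gpt2" is only a stand-in checkpoint; swap in the tokenizer you actually use.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenization converts text to tokens and back."
tokens = tokenizer.tokenize(text)              # text -> tokens
ids = tokenizer.convert_tokens_to_ids(tokens)  # tokens -> token IDs
decoded = tokenizer.decode(ids)                # token IDs -> text

print(tokens)
print(ids)
print(decoded)
```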

The application is deployed on Hugging Face Spaces as a Gradio app.

🔗 Try it out: https://lnkd.in/g6R5z9k2

#NLP #MachineLearning #AI #PythonDevelopment #OpenSourceAI
Excited to share my new Gradio app featuring the impressive Llama-3.1-Storm-8B model!
This app demonstrates the capabilities of Llama-3.1-Storm-8B, an 8B-parameter language model created by Ashvini Kumar Jindal, Pawan Kumar Rajpoot, and Ankur Parikh (@akjindal53244).
Key highlights of Llama-3.1-Storm-8B:

- Outperforms Llama-3.1-8B-Instruct on multiple benchmarks:
  - Instruction Following (IFEval): +3.93%
  - Knowledge-driven QA (GPQA): +7.21%
  - Reduced Hallucinations (TruthfulQA): +9%
  - Function Calling (BFCL): +7.92%
- Achieves impressive results with only 8B parameters
- Uses innovative techniques like self-curation and model merging

Try out the model yourself:
sagar007/lama_storm_8b
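
If you would rather run it locally, here is a minimal sketch using the Transformers pipeline; the checkpoint name, prompt, and generation settings are assumptions, so check the Space and model card for the exact setup:

```python
import torch
from transformers import pipeline

# Checkpoint name is an assumption; see the Space / model card for the exact one.
pipe = pipeline(
    "text-generation",
    model="akjindal53244/Llama-3.1-Storm-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input requires a recent transformers version.
messages = [{"role": "user", "content": "Explain model merging in one paragraph."}]
out = pipe(messages, max_new_tokens=200)
print(out[0]["generated_text"][-1]["content"])  # assistant reply
```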


Kudos to the creators for pushing the boundaries of smaller language models! This work makes advanced AI more accessible and efficient.
#AI #NLP #MachineLearning #GradioApp #Llama3
📐 AI Math Equation Solver: Your Step-by-Step Solution Companion

Hello Hugging Face community! 👋 I'm excited to share my latest Space: the AI Math Equation Solver!

🔍 What does it do?

This Space uses the power of AI to solve math problems from images. Simply upload a picture of a math equation or problem, and the AI will provide a detailed, step-by-step solution. It's perfect for students, teachers, or anyone looking to understand complex mathematical concepts better.

🧠 How does it work?

- Backend: Utilizes the microsoft/Phi-3.5-vision-instruct model for image understanding and mathematical reasoning.
- Frontend: Built with Gradio for a clean, user-friendly interface.
- Features:
  - Image upload for math problems
  - Detailed step-by-step solutions
  - Example problems to try instantly

🚀 Try it out!
sagar007/phi-vision-math-assistant



💡 Use cases:

- Students: Check your work or get help with homework
- Teachers: Create detailed solution guides quickly
- Tutors: Explain complex problems more effectively
- Self-learners: Understand new mathematical concepts

🛠 Technical Details:

- Model: microsoft/Phi-3.5-vision-instruct
- Libraries: transformers, Gradio, PyTorch
- Optimizations: Uses Flash Attention for improved performance
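
For reference, here is a minimal sketch of what the backend call can look like with Transformers; the file name, prompt, and generation settings are illustrative rather than the Space's exact code:

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"

# trust_remote_code is required because the model ships custom modeling code.
# Flash Attention can optionally be enabled via _attn_implementation="flash_attention_2".
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("equation.png")  # a photo of the math problem (illustrative name)
messages = [{"role": "user",
             "content": "<|image_1|>\nSolve this problem step by step."}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs, max_new_tokens=512, eos_token_id=processor.tokenizer.eos_token_id
)
# Strip the prompt tokens and decode only the generated solution.
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```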

🤝 Contribute:

This is an open project, and I welcome contributions! Whether it's improving the model, enhancing the UI, or adding new features, feel free to fork the project and submit your pull requests.

📣 Feedback:

I'd love to hear your thoughts! How are you using this Space? Any suggestions for improvements? Let me know in the comments below.

Happy problem-solving! 🎉

#MachineLearning #AI #Mathematics #Education #HuggingFace
AI unicorn Hugging Face acquires XetHub to manage huge AI models, aiming to host hundreds of millions of models. Meta's Llama 3.1 has 405B parameters, driving the need for more scalable solutions. XetHub's tools for efficient data management will be integrated into Hugging Face's platform. #AI
alvdansen/flux-koda
#StableDiffusion #Flux #AI #ComfyUI
Model description
Koda captures the nostalgic essence of early 1990s photography, evoking memories of disposable cameras and carefree travels. It specializes in creating images with a distinct vintage quality, characterized by slightly washed-out colors, soft focus, and the occasional light leak or film grain. The model excels at producing slice-of-life scenes that feel spontaneous and candid, as if plucked from a family photo album or a backpacker's travel diary.

Words that can highlight interesting nuances within the model:

kodachrome, blurry, realistic, still life, depth of field, scenery, no humans, monochrome, greyscale, traditional media, horizon, looking at viewer, light particles, shadow
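
A minimal sketch of trying the LoRA with Diffusers; the base model, weight location, prompt, and settings are assumptions, so check the model card for recommended usage:

```python
import torch
from diffusers import FluxPipeline

# Base model is an assumption; FLUX.1-dev requires accepting its license on the Hub.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("alvdansen/flux-koda")  # LoRA repo from this post

image = pipe(
    "kodachrome photo of a quiet seaside town at dusk, film grain, light leak",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("koda.png")
```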

https://cdn-uploads.huggingface.co/production/uploads/635dd6cd4fabde0df74aeae6/7CqMzFOlH6yoM-NpQdYDs.png
Getty Images Partners with NVIDIA to Upgrade AI Image Generation Tool: Generate 4 Images in 6 Seconds

Global image repository giant Getty Images has partnered with tech titan NVIDIA to introduce a cutting-edge AI image generation tool. This is no ordinary upgrade; it represents a significant leap in speed, quality, and accuracy!

The new AI model can generate four images in approximately 6 seconds, doubling the speed of its predecessor! Imagine pressing the shutter and, in the blink of an eye, four beautiful high-definition images appear before you; the speed is almost unbelievable.

Key Points:

🚀 Ultra-fast Experience: New AI model generates 4 images in 6 seconds, doubling the speed!

🎨 Quality Leap: Adopts NVIDIA Edify architecture, significantly improving image quality and output speed.

🛠 Unlimited Creativity: Introduces AI image modification features, allowing for one-click element changes, canvas expansion, and more creative freedom.

These are the exciting upgrades brought by Getty Images and NVIDIA's collaboration on AI image generation tools. Let's look forward to how it will transform our creative world!

#GettyImages #NVIDIA #AIImageGeneration #Midjourney
Context Caching with Gemini 1.5 Flash
Google recently released a new feature called context caching, which is available via the Gemini API for the Gemini 1.5 Pro and Gemini 1.5 Flash models. This guide provides a basic example of how to use context caching with Gemini 1.5 Flash.


https://youtu.be/987Pd89EDPs?si=j43isgNb0uwH5AeI

The Use Case: Analyzing a Year's Worth of ML Papers
The guide demonstrates how you can use context caching to analyze the summaries of all the ML papers we've documented over the past year. We store these summaries in a text file, which can now be fed to the Gemini 1.5 Flash model and queried efficiently.
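
Here is a minimal sketch of that workflow with the google-generativeai Python SDK; the file name, TTL, and query are illustrative:

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

# Upload the year's worth of paper summaries once (file name is illustrative).
document = genai.upload_file(path="ml_paper_summaries.txt")

# Cache the document server-side so it is not re-sent with every request.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",  # caching requires a pinned model version
    system_instruction="Answer questions about the provided ML paper summaries.",
    contents=[document],
    ttl=datetime.timedelta(minutes=60),
)

# Every query now reuses the cached context.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Which papers discuss long-context evaluation?")
print(response.text)
```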
https://www.promptingguide.ai/applications/context-caching
Zero-Shot Prompting

Large language models (LLMs) today, such as GPT-3.5 Turbo, GPT-4, and Claude 3, are tuned to follow instructions and are trained on large amounts of data. Large-scale training makes these models capable of performing some tasks in a "zero-shot" manner. Zero-shot prompting means that the prompt used to interact with the model won't contain examples or demonstrations. The zero-shot prompt directly instructs the model to perform a task without any additional examples to steer it.
https://youtu.be/ZTaHqdkxUMs

We tried a few zero-shot examples in the previous section. Here is one of the examples (i.e., text classification) we used:

Prompt:

Classify the text into neutral, negative or positive.
Text: I think the vacation is okay.
Sentiment:

Output:

Neutral

Note that in the prompt above we didn't provide the model with any examples of text alongside their classifications; the LLM already understands "sentiment" -- that's the zero-shot capability at work.
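
The same zero-shot prompt can be issued programmatically. A minimal sketch with the OpenAI Python client (the model name is illustrative; any instruction-tuned chat model works):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; substitute your preferred chat model
    messages=[{
        "role": "user",
        "content": (
            "Classify the text into neutral, negative or positive.\n"
            "Text: I think the vacation is okay.\n"
            "Sentiment:"
        ),
    }],
)
print(response.choices[0].message.content)  # expected: Neutral
```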

Instruction tuning has been shown to improve zero-shot learning (Wei et al., 2022). Instruction tuning is essentially the concept of finetuning models on datasets described via instructions. Furthermore, RLHF (reinforcement learning from human feedback) has been adopted to scale instruction tuning wherein the model is aligned to better fit human preferences. This recent development powers models like ChatGPT. We will discuss all these approaches and methods in upcoming sections.

When zero-shot doesn't work, it's recommended to provide demonstrations or examples in the prompt which leads to few-shot prompting. In the next section, we demonstrate few-shot prompting.
Figure 02 is the second-generation humanoid robot released by the AI robotics startup Figure. The following is an overview of the robot and its possible impact on areas such as the job market:
Characteristics and capabilities of Figure 02:
Hardware aspect:
Its exterior uses an exoskeleton structure, with the power supply and compute wiring routed inside the body, improving reliability and packaging compactness.
It is equipped with fourth-generation hands that have 16 degrees of freedom and human-comparable strength, can carry up to 25 kilograms, and can flexibly perform a variety of human-like tasks.
It has six RGB cameras (on the head, chest, and back), giving it "superhuman" vision.
The internal battery pack capacity has increased to 2.25 kWh. The company's founder hopes it can eventually achieve more than 20 hours of effective work per day (the official website currently lists a runtime of only 5 hours, so the 20-hour figure is presumably an inferred limit under a charge-and-work cycle).
It is driven by electric motors, stands 5 feet 6 inches tall, and weighs 70 kilograms.
Software and intelligence aspect:
It is equipped with an on-board visual language model (VLM), enabling it to perform rapid common-sense visual reasoning.
Compared to the previous generation product, the on-board computing and AI reasoning capabilities have tripled, allowing many real-world AI tasks to be executed completely independently.
It is equipped with a speech-to-speech reasoning model custom-built by the company's investor OpenAI. The default interface is speech, and it communicates with humans through its on-board microphone and speaker. #AI
Kling AI Video is FINALLY Public (All Countries), Free to Use and MIND BLOWING - Full Tutorial > https://youtu.be/zcpqAxYV1_w

You've probably seen those mind-blowing AI-made videos, and the day has arrived: the famous Kling AI is now available worldwide for free. In this tutorial video I show you how to register for Kling AI for free with just an email and use its impressive text-to-video, image-to-video, text-to-image, and image-to-image capabilities. The video shows non-cherry-picked results so you can see the model's actual quality and capability, unlike the heavily cherry-picked example demos. Still, #KlingAI is the only #AI model that competes with OpenAI's #SORA, and it is available to use right now.

🔗 Kling AI Official Website ⤵️
▶️ https://www.klingai.com/



🔗 Our GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion
Porting Vision-Language Models to Apple Silicon with MLX: A Tutorial Series

Are you interested in running cutting-edge AI models efficiently on your Mac? We're excited to share a detailed tutorial series on porting Phi-3-Vision to Apple's MLX framework!

This 8-part series covers:

1. Basic Implementation: Translating core components from PyTorch to MLX
2. Su-scaled Rotary Position Embeddings (SuRoPE): Enabling 128K token contexts
3. Batching: Processing multiple inputs simultaneously for improved efficiency
4. Caching: Optimizing inference speed for autoregressive generation
5. Choice Selection: Implementing constrained outputs for multiple-choice scenarios
6. Constrained Decoding: Guiding model outputs with flexible constraints
7. LoRA Training: Fine-tuning models efficiently with Low-Rank Adaptation
8. Agent & Toolchain System: Building flexible AI workflows

Whether you're an AI enthusiast, researcher, or developer looking to leverage Apple Silicon, this series provides a deep dive into optimizing advanced vision-language models. You'll learn hands-on techniques for model porting, performance optimization, and extending model capabilities.
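
To give a flavor of the PyTorch-to-MLX translation covered in part 1, here is a tiny, hypothetical module sketch (not code from the repository): the main differences are `mx.array` in place of `torch.Tensor`, `__call__` in place of `forward`, and MLX's lazy evaluation.

```python
import mlx.core as mx
import mlx.nn as nn

# A minimal MLX module mirroring a typical PyTorch nn.Module (illustrative only).
class MLP(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.up = nn.Linear(dim, hidden_dim)
        self.down = nn.Linear(hidden_dim, dim)

    def __call__(self, x: mx.array) -> mx.array:  # MLX uses __call__, not forward
        return self.down(nn.gelu(self.up(x)))

x = mx.random.normal((1, 8))
y = MLP(8, 32)(x)
mx.eval(y)        # force the lazy computation to run
print(y.shape)
```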

Check out the full series for a comprehensive guide to running state-of-the-art AI on your Mac!

Link to the tutorial series:

https://medium.com/@albersj66

All the code examples and implementations discussed in this tutorial series are available in our GitHub repository:

https://github.com/JosefAlbers/Phi-3-Vision-MLX

This repository contains:
- Full implementation of Phi-3-Vision in MLX
- Step-by-step code for each tutorial part
- Additional utilities and helper functions

We encourage you to explore the code, experiment with it, and contribute to the project. Your feedback and contributions are welcome!

#MachineLearning #AppleSilicon #MLX #VisionLanguageModels #AI #OpenSource #GitHub #AITutorial