HF-hub - Share and discover more about AI with social posts from the community.
Google paper: scaling up inference compute beats 14x larger models 🚀

Remember scaling laws? These are empirical laws that say "the bigger your model, the better it gets." More precisely, "as your compute increases exponentially, loss decreases linearly." They have wild implications, suggesting that spending 100x more training compute would get you super-LLMs. That's why companies are racing to build the biggest AI superclusters ever, and Meta bought 350k H100 GPUs, which probably cost on the order of $1B.

But think of this: we're building huge reasoning machines, but only ask them to do one pass through the model.
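The power-law shape behind these claims can be sketched numerically. The snippet below is a toy illustration only: the constant `a` and exponent `alpha` are made-up assumptions, not values fitted from any paper. It shows why "compute up exponentially" means "loss down linearly."

```python
import math

# Toy scaling law: loss(C) = a * C^(-alpha).
# a and alpha are illustrative assumptions, NOT fitted values.
a, alpha = 10.0, 0.05

def loss(compute: float) -> float:
    """Loss as a power law in training compute."""
    return a * compute ** (-alpha)

# Exponential increases in compute give equal-sized drops in log-loss:
for c in [1e20, 1e22, 1e24]:
    print(f"compute={c:.0e}  loss={loss(c):.3f}")

# 100x more compute multiplies loss by 100^(-alpha):
ratio = 100 ** (-alpha)
print(f"100x compute -> loss shrinks to {ratio:.2%} of its old value")
```

On a log-compute axis this is a straight line, which is exactly the "exponential compute, linear loss" phrasing above.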
🚀 Meet the new GLiNER architecture 🚀
GLiNER revolutionized zero-shot NER by demonstrating that lightweight encoders can achieve excellent results. We're excited to continue R&D with this spirit 🔥. Our new bi-encoder and poly-encoder architectures were developed to address the main limitations of the original GLiNER architecture and bring the following new possibilities:

🔹 An unlimited number of entities can be recognized at once.
🔹 Faster inference when entity embeddings are preprocessed.
🔹Better generalization to unseen entities.
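The bi-encoder idea behind these three properties can be sketched schematically: encode the entity labels once, cache the embeddings, then score every candidate span against them with a single matrix multiply. Everything below (the tiny 8-dim vectors, the `encode_*` stubs) is a made-up illustration of the mechanism, not the actual GLiNER code.

```python
import numpy as np

# Stub encoders standing in for the real label/span encoders (illustrative only).
rng = np.random.default_rng(0)

def encode_labels(labels):
    """One pass over the label set; the result can be cached and reused."""
    return rng.normal(size=(len(labels), 8))

def encode_spans(spans):
    """Runs per input text."""
    return rng.normal(size=(len(spans), 8))

labels = ["person", "organization", "location"]   # can be arbitrarily many
label_emb = encode_labels(labels)                 # precomputed once

spans = ["Marie Curie", "Paris"]
span_emb = encode_spans(spans)

# Score every span against every label in one matrix multiply.
scores = span_emb @ label_emb.T                   # shape: (n_spans, n_labels)
best = scores.argmax(axis=1)
for span, idx in zip(spans, best):
    print(span, "->", labels[idx])
```

Because the label embeddings don't depend on the input text, adding more entity types only grows the cached matrix: that is the intuition behind "unlimited entities" and "faster inference when entity embeddings are preprocessed."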
'Legal Dictionary GPT' is now fully trained and ready for open-source release to the world! Trained on 10,000 rows of legal definitions, Legal Dictionary GPT is your go-to resource for the first step in understanding the law: defining it. The model is free and publicly available for anyone to use.

Model Link: https://platform.openai.com/playground/chat?preset=eCrKdaPe9cnMnyTETqWDCQAU

Knowledge Base Bots are internal-facing (as opposed to external-facing) LLMs, either fine-tuned or RAG-augmented, generally on data about internal systems and processes.
BIG update dropped for
bigdata-pw/Flickr
- now ~515M images! Target for the next update: 1B

In case you missed them; other recent drops include
bigdata-pw/Dinosaurs
- a small set of BIG creatures 🦕🦖 and the first in a series of articles about the art of web scraping! https://huggingface.co/blog/hlky/web-scraping-101 https://huggingface.co/blog/hlky/web-scraping-102

Stay tuned for exciting datasets and models coming soon:
- PC and Console game screenshots
- TV/Film actors' biographies and photos
We are proud to release our latest suite of three image(s)-to-3D Gradio demos and two new papers.

SpaRP (Unposed sparse views to 3D):
sudo-ai/SpaRP

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views (2408.10195)

MeshFormer (@minghua @NCJ):
sudo-ai/MeshFormer

MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model (2408.10198)

MeshLRM-reproduced (@sarahwei0210):
sudo-ai/MeshLRM
https://huggingface.co/spaces/sudo-ai/MeshLRM
Cooked up a cool & much faster AI voice assistant space that also supports speech translation (with seamless-expressive). Start with the phrase "Please translate," followed by the speech you'd like to translate, to activate speech-translation mode. It uses open-source LLMs (Llama 3, Mistral, etc.) with Edge TTS for the voice assistant and seamless-expressive for speech translation.

Give it a try:
Jaward/optimus
https://huggingface.co/spaces/Jaward/optimus
Woman.ru Forum Posts Dataset -
nyuuzyou/womanru-posts


📊 Dataset highlights:

- 1,308,238 forum posts extracted from Woman.ru
- Includes original posts and replies from various threads
- Each entry contains URL, title, original post, date, and replies
- Primarily in Russian
The minimalist Spaces that may be helpful!
Grab Doc | Type Byte | SD3 CLI

- Grab Doc:
prithivMLmods/GRAB-DOC

- Type Byte:
prithivMLmods/Type-Byte

- SD3 CLI:
prithivMLmods/SD3-CLI
Falcon Mamba is now available in llama.cpp!
Check out GGUF files uploaded here:
tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a
This isn’t a goal of ours because we have plenty of money in the bank, but we're quite excited to see that @huggingface is profitable these days, with 220 team members and most of our platform being free (like model hosting) and open-source for the community!

Especially noteworthy at a time when most AI startups wouldn’t survive a year or two without VC money. Yay!
Calling all Hugging Face users! We want to hear from YOU!

What feature or improvement would make the biggest impact on Hugging Face?

Whether it's the Hub, better documentation, new integrations, or something completely different – we're all ears!

Your feedback shapes the future of Hugging Face. Drop your ideas in the comments below! 👇
NEW TASK ALERT 🚨
Extractive Question Answering: because sometimes generative is not all you need 😉
AutoTrain is the only open-source, no-code solution to offer so many tasks across different modalities. Current task count: 23 🚀
Check out the blog post on getting started with this task: https://huggingface.co/blog/abhishek/extractive-qa-autotrain
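To make the task concrete: extractive QA returns a span copied verbatim from the context, not generated text. The SQuAD-style record below illustrates how the answer is stored as literal text plus a character offset; the field names follow the common SQuAD convention and are an assumption here, so check the AutoTrain docs for the exact column names it expects.

```python
# A SQuAD-style extractive QA record: the answer is a literal span of the
# context, located by its character offset. Field names follow the common
# SQuAD convention (verify against the AutoTrain docs before uploading).
record = {
    "context": "AutoTrain is a no-code solution from Hugging Face.",
    "question": "What is AutoTrain?",
    "answers": {"text": ["a no-code solution"], "answer_start": [13]},
}

# Sanity check: the stored offset must point at the stored answer text.
start = record["answers"]["answer_start"][0]
text = record["answers"]["text"][0]
assert record["context"][start:start + len(text)] == text
print("offset check passed:", repr(text))
```

Running this kind of offset check over your whole dataset before training catches the most common data bug in extractive QA: answers whose `answer_start` no longer matches the context.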
Just dropped a fresh version of dataset-viber along with some cool, Gradio-based annotators! These tools aren't about formalities—they're here to help you quickly collect feedback and get your projects moving along to a more serious stage, ahumm @argilla.

Some new features!
- manual import from a CSV or the Hugging Face Hub
- manual export to CSV or the Hub
- improved automated export to the Hub and CSV
When On-Premise is Better than the Cloud

During my time at Palantir, I have spent significant time deploying our software in cloud environments and also a good chunk of time deploying our software in on-premise (on-prem) environments (including starting a team doing just that). I have noticed that despite the common preference for cloud deployment, there are still merits to deploying on-prem.

The Shift from On-Prem to Cloud Computing

Over recent years, the IT landscape has increasingly favored cloud computing, driven by the flexibility of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) offerings. The global cloud computing market grew from $24.63 billion in 2010 to $156.4 billion in 2020, and the trend continues, with the market predicted to surpass $1 trillion by 2028. This meteoric rise is powered both by new demand for compute and by the migration of on-prem workflows to the cloud.


There are good reasons for this shift: the cloud enables rapid provisioning of resources, geographic redundancy, and a move from capital expenditures (CapEx) to operational expenditures (OpEx). However, I believe there are certain scenarios where on-prem infrastructure wins out, particularly where specific technical requirements, such as deterministic latency, hardware-level control, and stringent security measures, are paramount.
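The CapEx-vs-OpEx tradeoff above can be made concrete with a back-of-the-envelope break-even calculation: on-prem pays a large upfront cost plus small monthly running costs, while the cloud pays a larger flat monthly fee. All figures below are invented assumptions for illustration, not real pricing.

```python
# Back-of-the-envelope: first month where cumulative on-prem cost
# (CapEx + running costs) drops below cumulative cloud OpEx.
# All figures are illustrative assumptions, not real pricing.
ONPREM_CAPEX = 500_000        # upfront hardware purchase ($)
ONPREM_MONTHLY = 10_000       # power, cooling, staff ($/month)
CLOUD_MONTHLY = 40_000        # equivalent cloud capacity ($/month)

def break_even_month(capex: int, onprem_monthly: int, cloud_monthly: int) -> int:
    """First month where cumulative on-prem cost <= cumulative cloud cost."""
    month = 0
    while True:
        month += 1
        onprem = capex + onprem_monthly * month
        cloud = cloud_monthly * month
        if onprem <= cloud:
            return month

print("break-even month:", break_even_month(ONPREM_CAPEX, ONPREM_MONTHLY, CLOUD_MONTHLY))
```

With these made-up numbers the hardware pays for itself in well under two years; the real decision, of course, also weighs the latency, control, and security factors discussed above.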
Access and Distribution:
Think of vaccine distribution like a pizza delivery service—except the stakes are much higher, and there are way more “no-delivery zones.” Low-income and remote areas often struggle to get vaccines where they’re needed most. It’s like ordering a pizza to the middle of the Sahara—tricky and sounds impossible, right?

Vaccine Hesitancy:
Ah, the kryptonite of public health. Despite overwhelming evidence, some folks are still hesitant to roll up their sleeves. Whether it’s misinformation on social media or just plain fear of needles, vaccine hesitancy is a real buzzkill. I hope you remember the times when UNICEF used every trick up its sleeve to build confidence in COVID-19 vaccines in the Indian subcontinent.
A Quick Flashback into Traditional Immunization
Before AI swooped in to save the day, immunization was a bit like the first car our dads got us —effective but sometimes slow and prone to breakdowns. Traditional immunization practices have done wonders, but they’ve also hit a few speed bumps along the way. Remember when getting vaccines to remote areas was about as tricky as finding a parking spot at Coachella? Yeah, that was one of the major challenges. And despite the lifesaving potential, vaccines haven’t always had it easy in the PR department—thanks to the ever-persistent issue of vaccine hesitancy.

Numbers Don’t Lie - Know What’s at Stake
Vaccination rates have improved over the years, but the global picture is still a mixed bag. According to a report by UNICEF, around 23 million children missed out on basic vaccines in 2020—a sharp rise from previous years. While organizations like UNICEF and the CDC are working tirelessly to improve these stats, challenges like distribution, hesitancy, and data management keep throwing wrenches into the works.


But do these challenges continue to be real dealbreakers globally?
Role of AI in Immunization: The Shots that Your Body Needs
Alright, folks, let’s talk about something that’s saved more lives than the Avengers combined: immunization. Vaccines have been the unsung heroes in the battle against some of the world’s nastiest villains (Mpox, please don’t bring the 2020 phase back again). But in this world of snaps and reels, even our trusty vaccines are getting a digital makeover with AI. It’s more than just modern treatment methods; what if I told you an AI-driven platform predicted the spread of COVID-19 before anyone else?


Let me explain it, along with the major role AI plays in the current scenario, so sit tight.






This National Immunization Awareness Month, we will look at how this brainy sidekick is stepping up to the plate in the world of immunization, becoming the Tony Stark of healthcare: brilliant, innovative, and just a little bit cooler than everyone else.
Earning potential
According to a recent analysis of 342 salaries, a Rust developer in the U.S. makes, on average, $156,000 a year. While the majority of experienced Rustaceans can earn close to $200,000 annually, entry-level positions begin at $121,875 per year – not too shabby.


These Rust-specific figures compare well with more general software developer roles. For example, software engineers command $123,594, systems engineers $115,184, and developers $112,502.


Regionally speaking, Texas and New York both offer the highest salaries to Rust developers at $187,500, followed by Georgia ($175,000) and California ($150,000).


Ready to find your next role in tech? Whether you’re a Rust expert or novice, or simply want to put your coding expertise to good use, visit the Hackernoon Job Board today.
Companies using Rust
Rust is becoming more and more popular among businesses of all sizes, due to its distinctive qualities, but this is especially true for safety-critical projects. Its wide range of applications includes network programming, web development, and system programming.


In addition, there is a growing need for the system language in the fields of app development, blockchain, Internet of Things, and smart contract programming.


Discord, for instance, accelerates its system by utilizing the low-level language. The chat platform's speed increased tenfold after converting to Rust.


The programming language was used by Meta to make changes to the internal source code management software that its engineers utilize. Dropbox synchronizes files between user devices and its cloud storage via the system language.


Rust is a key part of Microsoft’s and Amazon’s futures, and the U.S. government has even advised that, to lessen "vulnerabilities at scale," programmers should move to memory-safe languages like Rust.