Share and discover more about AI with social posts from the community.
huggingface/OpenAi
We’re taking OpenAI DevDay on the road! Join us this fall in San Francisco, London, or Singapore for hands-on sessions, demos, and best practices. Meet our engineers and see how developers around the world are building with OpenAI.

openai.com/devday/
We’re starting to roll out advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.

https://x.com/i/status/1818353580279316863
The ChatGPT desktop app for macOS is now available for all users.

Get faster access to ChatGPT to chat about email, screenshots, and anything on your screen with the Option + Space shortcut: https://openai.com/chatgpt/mac/

The desktop app for macOS now gives you side-by-side access to ChatGPT. Use Option + Space to open a companion window, which stays in front so you can use it more easily when working with other apps.
CodiumAI PR-Agent is an open-source tool that assists developers in streamlining pull-request creation and review. It automatically analyzes the PR and can provide several types of feedback, including Auto-Description, PR Review, Q&A, Code Suggestion, and more.
some cool things you can build with supabase realtime

http://supabase.com/realtime
Meta releases SAM2 segmentation model

Last week, Meta continued its push into computer vision and released the Segment Anything Model 2 (SAM2) segmentation model.
It performs real-time, promptable object segmentation in both images and videos, a leap in the video segmentation experience that allows seamless use across image and video applications.
SAM2 surpasses previous capabilities in image segmentation accuracy and achieves better video segmentation performance than existing work, while requiring one-third of the interaction time.
SAM2 can also segment any object in any video or image (often described as zero-shot generalization), which means it can be applied to previously unseen visual content without custom adaptation.
Also released is SA-V, the largest video segmentation dataset. It contains an order of magnitude more annotations than existing video object segmentation datasets, and about 4.5 times as many videos.
The main features of SA-V: more than 600,000 mask annotations on approximately 51,000 videos; videos showing geographic diversity and real-world scenes, collected from 47 countries; and annotations covering whole objects, object parts, and challenging situations such as objects being occluded, disappearing, and reappearing.
This demo is outrageous: SAM2 can stably track and segment a person in a very blurry yet very detailed aerial video.
Download the model here: https://github.com/facebookresearch/segment-anything-2
Experience SAM2 here: https://sam2.metademolab.com/
Google releases Gemma 2 2B and Gemini 1.5 Pro
Google also flexed its muscles last week, releasing the Gemini 1.5 Pro and Gemma 2 2B models in succession.
Among them, Gemini 1.5 Pro 0801 surpassed GPT-4o mini in the overall LLM Arena ranking and took first place. Google said that this is an experimental version, not yet an official release, so it is only available in AI Studio.
However, in testing, Gemini 1.5 Pro 0801's multimodal capabilities are very strong, basically surpassing GPT-4o and Claude 3.5, and it supports audio and video. I tried it with a podcast file more than an hour long, and it produced a summary in a little over ten seconds.
In addition, Google also released Gemma 2 2B, a model that can run on-device. This model also scored higher in the LLM Arena than a number of much larger LLMs.
This is a quantized Gemma 2 2B running with MLX on an iPhone 15 Pro.
The model also comes with ShieldGemma, Google's newly released safety classifier, which can effectively detect hate speech, harassment, sexually suggestive content, and dangerous content.
First on the web: Zhipu's "Sora" is now open source
Zhipu's CogVideoX, which shares its origins with "Qingying", was open-sourced as this article was published.
The model has been uploaded to GitHub and Hugging Face.
The model can be run and fine-tuned on a single A6000 card.
Output resolution is 720 × 480, with 6-second clips of 48 frames.
The training data comes from the Internet, and Bilibili (B Station) also provides technical support.
*Note 1: Inference uses 21.6 GB of GPU memory with a peak of 36 GB, and fine-tuning is stable at 46.2 GB, which is within the capacity of an A6000.

*Note 2: Inference optimization has just been updated; the peak is now 18 GB, so the model can run on a single 4090 card.
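The memory figures in the two notes can be checked against card capacity with a few lines of arithmetic. This is only a rough sketch: the 48 GB and 24 GB capacities are the published VRAM sizes of the RTX A6000 and RTX 4090, the helper name `fits` is made up for illustration, and the check ignores any memory used by other processes on the card.

```python
# Memory figures (GB) quoted in the notes above.
INFERENCE_PEAK_GB = 36.0
INFERENCE_PEAK_OPTIMIZED_GB = 18.0
FINE_TUNING_GB = 46.2

# Published VRAM capacity of the two cards mentioned (an assumption added here,
# not a figure from the post).
GPU_VRAM_GB = {"RTX A6000": 48.0, "RTX 4090": 24.0}


def fits(required_gb: float, gpu: str) -> bool:
    """Hypothetical helper: does the workload fit in the card's total VRAM?"""
    return required_gb <= GPU_VRAM_GB[gpu]


# Frame rate implied by 48 frames over 6 seconds.
FRAMES = 48
SECONDS = 6
frames_per_second = FRAMES / SECONDS

if __name__ == "__main__":
    for gpu in GPU_VRAM_GB:
        print(gpu, "original peak fits:", fits(INFERENCE_PEAK_GB, gpu))
        print(gpu, "optimized peak fits:", fits(INFERENCE_PEAK_OPTIMIZED_GB, gpu))
        print(gpu, "fine-tuning fits:", fits(FINE_TUNING_GB, gpu))
    print("frames per second:", frames_per_second)
```

The original 36 GB peak only fits the A6000, which is why the 18 GB optimization is what brings a single 4090 into range.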
📫 AI in the news today: Struggling AI startups, Figure 02 robot, chipmakers

- OpenAI Co-Founders Schulman and Brockman Step Back
https://finance.yahoo.com/news/openai-co-founders-schulman-brockman-010542796.html

- Struggling AI Startups Look for a Bailout from Big Tech
"More exits—either pseudo-acquisitions or real ones—are coming, investors say, as a bubble built by the excitement around generative AI is showing signs of peaking."
https://www.wsj.com/tech/ai/struggling-ai-startups-look-for-a-bailout-from-big-tech-3e635927?mod=rss_Technology

- Did Google Just Pay $2.5 Billion to Hire Character's CEO?
https://www.theinformation.com/articles/did-google-just-pay-2-5-billion-to-hire-characters-ceo

- Figure’s new humanoid robot leverages OpenAI for natural speech conversations
Figure has unveiled its latest humanoid robot, the Figure 02.
The most notable addition this time out arrives by way of a longstanding partnership with OpenAI, which helped Figure raise a $675 million Series B back in February, valuing the South Bay firm at $2.6 billion.
https://techcrunch.com/2024/08/06/figures-new-humanoid-robot-leverages-openai-for-natural-speech-conversations/

- World’s Five Leading Chipmakers Have Now Promised U.S. Investment
The Biden administration will award up to $450 million in grants to a South Korean chipmaker, SK Hynix, to help build its new chip facility in Indiana.
The US now has commitments from all five of the world’s leading-edge semiconductor manufacturers to construct chip plants in the US with financial assistance from the administration.
https://www.nytimes.com/2024/08/06/business/economy/chipmakers-promise-investment.html
Idefics3-Llama is out! 💥💥
Model:
HuggingFaceM4/Idefics3-8B-Llama3

Demo:
HuggingFaceM4/idefics3


It's a multimodal model based on Llama 3.1 that accepts an arbitrary number of interleaved images with text with a huge context window (10k tokens!)

Supported by Hugging Face transformers 🤗
Zapier - Automate Like a Pro (Today’s Sponsor)
Struggling to keep up with the rapid pace of technology?

ZapConnect 2024 is your gateway to the future of productivity, offering a free half-day virtual event packed with automation insights and tech-savvy strategies.

Learn from 42 automation experts, including Zapier's CEO

Choose from 30 innovative sessions on AI, marketing, and more

Network with 8,000+ fellow productivity enthusiasts in 60+ virtual lounges

Upgrade your tech skills and turn intimidating innovations into powerful allies.

Claim your spot at ZapConnect 2024 and future-proof your productivity!

Check out ZapConnect 2024
🤖 How DatoCMS solved tag invalidation with Turso (https://turso.tech/blog/how-datocms-solved-tag-invalidation-with-turso)

🪐 Turso is SOC2 Type II compliant with zero issues (https://turso.tech/blog/turso-achieves-soc2-compliance)

📗 Generate/store OpenAI Vector Embeddings in Turso tur.so/QfcmnJq

🔥 Save Resend email events to your Turso database tur.so/eVLsNdi

➳ ANN search with DiskANN in libSQL tur.so/2TBNQFv

Set your teams free with Turso Embedded Replicas tur.so/4GsGAPJ

🏅 An environmentally friendly GitHub Actions trick tur.so/QKQQ0H8

📜 Here's why you should try .NET + libSQL tur.so/3OQueK0
AI Tools of the week
🔌 Thunderbit - Deploy AI apps and automations with this no-code solution for web assistance, summarization, and task automation.

🎓 CourseGenie - Accelerate course creation with auto-generated descriptions, outlines, activities, and assessments.

💡 TextMine - Analyze, manage, and smart-search thousands of documents so that you can make better decisions quickly.

🚀 AI SEO by Leap - Supercharge your website with intelligent SEO tools for content creation, keyword research, and optimization.

🎓 inncivio - Change the way you and your team learn with AI-powered and personalized education.

🖋 Humanizar Texto - Convert AI-generated text into natural, engaging writing that evades detection and plagiarism checks.

📄 MyReport - Automate report and essay creation with AI-powered data collection, writing, and source citation.

🧑‍💻 Modelfuse - Build no-code AI workflows combining your own data with text, image, video, and audio LLMs from top providers.

💬 PSY - AI Therapists - Access confidential mental health support 24/7 through multilingual AI therapist chats.

AI PDF Summarizer (https://theresanaiforthat.com/ai/ai-pdf-summarizer) - Convert lengthy PDFs into digestible insights, with multilingual summaries and document chat features.

📸 Executive Headshots (https://theresanaiforthat.com/ai/executive-headshots) - Transform everyday photos into polished, professional headshots for your digital presence.

💼 Aion (https://theresanaiforthat.com/ai/aion/) - Your AI co-pilot for navigating CEO-level company management, strategy, goal-setting, and market analysis.

📷 AI Cam Lens (https://theresanaiforthat.com/ai/ai-cam-lens) - Instantly analyze your surroundings and get answers to visual questions using your phone's camera.

🎥 AI Image To Video Generator (https://theresanaiforthat.com/ai/ai-image-to-video-generator) - Breathe life into static images by transforming them into captivating videos for social media.
OpenAI's new models gpt-4o-2024-08-06 and gpt-4o-mini now support Structured Outputs. Unlike the earlier JSON mode, Structured Outputs lets you supply a JSON Schema and ensures that the output JSON always conforms to that schema.

Some limitations:

1. Only a subset of JSON Schema is allowed: String, Number, Boolean, Object, Array, Enum, and anyOf are supported; oneOf and allOf are not. This is enough for normal use

2. All fields are required and cannot be optional

3. A schema can have at most 5 levels of nesting and at most 100 properties in total

4. Some type-specific keywords are not supported; for example, string types cannot use minLength, maxLength, etc.

5. The first API response with a new Schema incurs additional latency while the Schema is processed; it is cached afterwards. The delay typically does not exceed 10 seconds, but complex Schemas may require up to a minute of preprocessing time

6. Structured output does not prevent all types of model errors. For example, the model may still make mistakes in the values of the JSON object (e.g., a wrong step in a math equation). If errors occur, it is recommended to provide examples in the prompt or split the task into simpler subtasks.

Original article: https://openai.com/index/introducing-structured-outputs-in-the-api/
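To make the limitations concrete, here is a rough Python sketch that builds a request fragment in the `response_format` shape described in OpenAI's announcement and runs a small local check against the limits listed above. The checker `check_schema`, the schema name `article_summary`, and both example schemas are hypothetical and written for illustration; how nesting depth is counted (root object = level 1) is an assumption, and the keyword list is not exhaustive. Nothing here calls the API.

```python
MAX_DEPTH = 5          # limit 3: nesting levels (counting rule assumed)
MAX_PROPERTIES = 100   # limit 3: total object properties
# Limits 1 and 4: a few unsupported keywords named in the list above.
UNSUPPORTED_KEYWORDS = {"oneOf", "allOf", "minLength", "maxLength"}


def check_schema(schema):
    """Hypothetical local check; returns a list of human-readable problems."""
    problems = []
    property_count = 0

    def walk(node, depth, path):
        nonlocal property_count
        if not isinstance(node, dict):
            return
        for keyword in sorted(UNSUPPORTED_KEYWORDS.intersection(node)):
            problems.append(f"{path}: unsupported keyword {keyword!r}")
        if node.get("type") == "object":
            if depth > MAX_DEPTH:
                problems.append(f"{path}: nesting deeper than {MAX_DEPTH} levels")
            properties = node.get("properties", {})
            property_count += len(properties)
            # Limit 2: every declared field must appear in "required".
            optional = sorted(set(properties) - set(node.get("required", [])))
            if optional:
                problems.append(f"{path}: not marked required: {optional}")
            for name, child in properties.items():
                walk(child, depth + 1, f"{path}.{name}")
        walk(node.get("items"), depth, f"{path}[]")
        for branch in node.get("anyOf", []):
            walk(branch, depth, path)

    walk(schema, 1, "$")
    if property_count > MAX_PROPERTIES:
        problems.append(f"$: more than {MAX_PROPERTIES} properties in total")
    return problems


GOOD_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "tags"],
    "additionalProperties": False,
}

# Breaks limit 2 ("tags" is optional) and limit 4 (minLength on a string).
BAD_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title"],
    "additionalProperties": False,
}

# Request fragment in the shape described in the announcement.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "article_summary", "strict": True, "schema": GOOD_SCHEMA},
}

if __name__ == "__main__":
    print(check_schema(GOOD_SCHEMA))
    for problem in check_schema(BAD_SCHEMA):
        print(problem)
```

Running a check like this before the first request is optional, since the API validates the schema itself, but it can catch an unsupported keyword before you pay the first-request preprocessing delay.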
HubSpot - Turn AI Into Your Personal Assistant
Ready to master the art of AI delegation?

Learn how to turn AI into your personal productivity powerhouse with HubSpot’s highly anticipated AI Task Delegation Playbook. It’s time to optimize your workflow like never before!

Unlock time-saving efficiencies.

Elevate your overall productivity.

Seamlessly streamline your workflow.

Get ready to save time and boost efficiency with easy-to-use templates and calculators.

Don’t miss out—grab your copy of HubSpot’s Playbook and start transforming your workday!

Check out HubSpot’s Playbook: https://offers.hubspot.com/ai-delegation
Breaking News
The latest developments in AI

🚀 Google - A new tiny AI model, Gemma 2 2B, has been released challenging tech giants and even outperforming many of them. Alongside Gemma 2 2B, Google released ShieldGemma, a suite of safety content classifiers, and Gemma Scope, a model interpretability tool.

🎙 OpenAI - ChatGPT's new advanced voice mode impresses early users with diverse capabilities. Demos show it telling stories as an airline pilot with appropriate audio effects. While some accents aren't perfectly native, it handles interruptions well and can laugh or cry during conversations.

🎥 Runway - Gen-3 Alpha Turbo, a faster version of Runway's AI video model, has been unveiled. It's claimed to be 7x faster than the original while maintaining quality. The move is aimed at reducing costs, encouraging increased usage, and staying competitive in the AI video generation market.
Access expert-level advice about any topic

Step 1:

First, head over to Claude AI.

Note: You can use ChatGPT if you prefer.

Once you’re there, keep reading…

Step 2:

Next, you’ll need to create an account and/or log in.

Start a new chat and you’re ready to go!

One of the most common techniques for getting expert-like advice from AI is by asking it to “Act like [insert expert]”.

But how do you know who the right expert is?

The first thing to do is to give the chatbot a little background on what it’ll be helping you with, then add the final sentence as seen in this prompt:

I'm a German citizen living in the UK and working for a US company. Since I'm not an employee of the US company I work for and only freelance for them, I'm unsure how I should file my taxes and also how I would handle VAT. List the expert professionals best suited to deal with this issue.
You should get a list of 5-10 experts who would best be able to provide advice for your question.
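If you reuse this technique often, the prompt can be assembled from a small template. The sketch below is only an illustration: the function names are made up, the closing sentence is the one from the example prompt above, and the follow-up uses the "Act like [insert expert]" pattern with whichever expert you pick from the list the chatbot returns.

```python
# Closing sentence taken from the example prompt above.
EXPERT_FINDER_SUFFIX = "List the expert professionals best suited to deal with this issue."


def build_expert_finder_prompt(background: str) -> str:
    """Hypothetical helper: background first, then the expert-finding sentence."""
    return f"{background.strip()} {EXPERT_FINDER_SUFFIX}"


def build_act_like_prompt(expert: str, question: str) -> str:
    """Hypothetical helper: follow-up prompt once you have chosen an expert."""
    return f"Act like {expert.strip()}. {question.strip()}"


if __name__ == "__main__":
    background = (
        "I'm a German citizen living in the UK and working for a US company. "
        "Since I'm not an employee of the US company I work for and only freelance "
        "for them, I'm unsure how I should file my taxes and also how I would handle VAT."
    )
    print(build_expert_finder_prompt(background))
    # The expert below is a placeholder; use one from the chatbot's list.
    print(build_act_like_prompt("an international tax advisor", "How should I file my taxes?"))
```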