Qwen2 Audio - 8.5B, Apache 2.0 licensed Audio Language Models 🔥
> SoTA on ASR, S2TT & AIR-Bench ⚡️
> Used 370K hours of Speech, 140K hours of Music and 10K hours of sound for pre-training
> Model excels at voice chat and audio analysis
> Base + Instruct model checkpoints released
> Uses Whisper Encoder paired with Qwen 2 7B LLM backbone
> Trained with Multi-task fine-tuning
> Followed by SFT & DPO
> Model weights on the Hub
> Integrated with Transformers 🤗
Audio LMs is the direction that I'm quite bullish on, it just makes sense to have something that works right out of the box! Kudos Qwen team! ❤️
> SoTA on ASR, S2TT & AIR-Bench ⚡️
> Used 370K hours of Speech, 140K hours of Music and 10K hours of sound for pre-training
> Model excels at voice chat and audio analysis
> Base + Instruct model checkpoints released
> Uses Whisper Encoder paired with Qwen 2 7B LLM backbone
> Trained with Multi-task fine-tuning
> Followed by SFT & DPO
> Model weights on the Hub
> Integrated with Transformers 🤗
Audio LMs is the direction that I'm quite bullish on, it just makes sense to have something that works right out of the box! Kudos Qwen team! ❤️