Introducing the Hugging Face Embedding Container for Amazon SageMaker
We are excited to announce that the new Hugging Face Embedding Container for Amazon SageMaker is now generally available (GA). AWS customers can now efficiently deploy embedding models on SageMaker to build Generative AI applications, including Retrieval-Augmented Generation (RAG) applications.

In this blog post, we will show you how to deploy open embedding models, such as Snowflake/snowflake-arctic-embed-l, BAAI/bge-large-en-v1.5, or sentence-transformers/all-MiniLM-L6-v2, to Amazon SageMaker for inference using the new Hugging Face Embedding Container. We will deploy Snowflake/snowflake-arctic-embed-m-v1.5, one of the best open embedding models for retrieval; you can check its ranking on the MTEB Leaderboard.

The example covers the following steps (a condensed sketch of steps 2 to 5 follows the list):

1. Set up the development environment
2. Retrieve the new Hugging Face Embedding Container
3. Deploy Snowflake Arctic to Amazon SageMaker
4. Run and evaluate inference performance
5. Delete the model and endpoint
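
To give a sense of how little code is involved, here is a minimal sketch of the retrieval and deployment steps (2 and 3) using the SageMaker Python SDK. It assumes you run it with an IAM role that has SageMaker permissions; the backend name (`huggingface-tei`), the container version, and the instance type shown below are assumptions you should verify against the current SDK and the full example before using them.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# SageMaker session and IAM execution role
# (assumes an environment with SageMaker permissions, e.g. a SageMaker notebook)
sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Retrieve the Hugging Face Embedding Container image URI.
# Backend name and version are assumptions; check the SDK for supported values.
image_uri = get_huggingface_llm_image_uri("huggingface-tei", version="1.2.3")

# The container pulls the model from the Hugging Face Hub at startup
config = {
    "HF_MODEL_ID": "Snowflake/snowflake-arctic-embed-m-v1.5",
}

emb_model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env=config,
    sagemaker_session=sess,
)

# Deploy a real-time endpoint; the instance type is an assumption,
# choose one that fits your latency and cost requirements.
emb = emb_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
)
```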
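
Once the endpoint is up, invoking it and cleaning up afterwards (steps 4 and 5) look roughly like this. The request format follows the Text Embeddings Inference API, a JSON object with an "inputs" field, and the printed dimensionality is only indicative.

```python
# Request embeddings for a single input; "inputs" may also be a list of strings
data = {
    "inputs": "The Hugging Face Embedding Container serves open embedding models on Amazon SageMaker.",
}
embeddings = emb.predict(data=data)

# The response is a list of embedding vectors (one per input string)
print(len(embeddings[0]))

# Delete the model and endpoint to stop incurring costs
emb.delete_model()
emb.delete_endpoint()
```

For a proper performance evaluation you would send batched, concurrent requests and measure latency and throughput; the full walkthrough in the repository linked below covers this in detail.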
The full example is available in the Hugging Face blog repository: https://github.com/huggingface/blog/blob/main/sagemaker-huggingface-embedding.md