Introducing the Hugging Face LLM Inference Container for Amazon SageMaker
This example shows how to deploy open-source LLMs, such as BLOOM, to Amazon SageMaker for inference using the new Hugging Face LLM Inference Container. We will deploy the Open Assistant Pythia 12B model, an open-source chat LLM trained on the Open Assistant dataset.

The example covers:

1. Setup development environment
2. Retrieve the new Hugging Face LLM DLC
3. Deploy Open Assistant 12B to Amazon SageMaker
4. Run inference and chat with our model
5. Create a Gradio chatbot backed by Amazon SageMaker
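The steps above can be sketched with the SageMaker Python SDK. This is a minimal sketch, not the post's exact code: the model id, instance type, and environment values below are assumptions, and the prompt helper assumes the Open Assistant `<|prompter|>`/`<|assistant|>` chat token format.

```python
def build_prompt(message, history=None):
    """Build an Open Assistant style prompt from a chat history.

    The <|prompter|>/<|assistant|> token format is an assumption based on
    the Open Assistant chat template, not taken from this post.
    """
    history = history or []
    prompt = ""
    for user, assistant in history:
        prompt += f"<|prompter|>{user}<|endoftext|><|assistant|>{assistant}<|endoftext|>"
    prompt += f"<|prompter|>{message}<|endoftext|><|assistant|>"
    return prompt


def deploy_open_assistant():
    """Sketch of the deploy flow; run inside an AWS environment with the
    SageMaker SDK installed and an execution role configured."""
    # imported lazily so the prompt helper above works without AWS dependencies
    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    role = sagemaker.get_execution_role()

    # retrieve the Hugging Face LLM DLC image URI for the current region
    llm_image = get_huggingface_llm_image_uri("huggingface")

    llm_model = HuggingFaceModel(
        role=role,
        image_uri=llm_image,
        env={
            "HF_MODEL_ID": "OpenAssistant/pythia-12b-sft-v8-7k-steps",  # assumed checkpoint
            "SM_NUM_GPUS": "4",          # GPUs available on the instance
            "MAX_INPUT_LENGTH": "1024",  # max prompt tokens
            "MAX_TOTAL_TOKENS": "2048",  # prompt + generated tokens
        },
    )

    # deploy to a GPU instance (instance type is an assumption)
    llm = llm_model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.12xlarge",
        container_startup_health_check_timeout=600,
    )

    # chat with the deployed model
    response = llm.predict({"inputs": build_prompt("What is Amazon SageMaker?")})
    return response[0]["generated_text"]
```

A Gradio chatbot can then reuse the same pieces: its callback formats the running conversation with `build_prompt(message, history)` and sends it to the endpoint via `llm.predict`.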
You can also find the code for this example in the notebooks repository: https://github.com/huggingface/blog/blob/main/sagemaker-huggingface-llm.md