Pre-Training BERT with Hugging Face Transformers and Habana Gaudi
In this tutorial, you will learn how to pre-train BERT-base from scratch on a Habana Gaudi-based DL1 instance on AWS to take advantage of Gaudi's cost-performance benefits. We will use the Hugging Face Transformers, Optimum Habana, and Datasets libraries to pre-train a BERT-base model with masked language modeling, one of the two original BERT pre-training tasks. Before we get started, we need to set up the deep learning environment.
View Code
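As a quick sanity check before diving in, the minimal sketch below verifies that the core libraries are importable. This is not part of the original tutorial setup; it assumes the packages were already installed, for example with something like pip install transformers datasets tokenizers optimum-habana (exact versions depend on your SynapseAI software stack).

```python
# Minimal environment sanity check (a sketch, not the tutorial's official setup).
# Assumes the libraries were installed beforehand, for example with:
#   pip install transformers datasets tokenizers optimum-habana
import datasets
import tokenizers
import transformers

print(f"transformers: {transformers.__version__}")
print(f"datasets:     {datasets.__version__}")
print(f"tokenizers:   {tokenizers.__version__}")
```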
You will learn how to:
Prepare the dataset
Train a Tokenizer
Preprocess the dataset
Pre-train BERT on Habana Gaudi
Note: Steps 1 to 3 can and should be run on a different instance type, since they are CPU-intensive tasks that do not need the Gaudi accelerators.
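For orientation, here is a heavily simplified sketch of what steps 1 to 3 roughly look like in code. It is an illustration only, not the exact code used later in the tutorial: the dataset name, the WordPiece tokenizer class, and the vocabulary size are assumptions made for this example.

```python
# Simplified illustration of steps 1-3 (prepare dataset, train a tokenizer,
# preprocess). Dataset, tokenizer class, and vocab size are example choices.
from datasets import load_dataset
from tokenizers import BertWordPieceTokenizer

# Step 1: load a raw text corpus (small example dataset for illustration)
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

# Step 2: train a WordPiece tokenizer (the tokenizer type BERT uses) on the raw text
tokenizer = BertWordPieceTokenizer()
tokenizer.train_from_iterator(
    (text for text in raw["text"] if text.strip()), vocab_size=30522
)

# Step 3: tokenize the corpus so it can be used for masked-language-modeling training
def tokenize(batch):
    return {"input_ids": [tokenizer.encode(text).ids for text in batch["text"]]}

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized)
```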