🚀 Installation and set-up
We need the following 🤗 Hugging Face libraries:
transformers contains an API for training models as well as many pre-trained models
tokenizers is installed automatically by transformers and tokenizes our data (i.e. it converts text into sequences of numbers)
datasets contains a rich collection of datasets and common metrics, perfect for prototyping
We also install wandb to automatically instrument our training.
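Before diving in, here is a minimal, framework-free sketch of what "tokenizing" means. Real 🤗 tokenizers use learned subword vocabularies rather than this toy word-level lookup, but the core idea, mapping text to a sequence of integer ids, is the same. The vocabulary below is hypothetical.

```python
# Toy vocabulary -- hypothetical; real tokenizers learn subword vocabularies.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}

def tokenize(text: str) -> list[int]:
    """Map each whitespace-separated word to an id, falling back to [UNK]."""
    return [vocab.get(word, vocab["[UNK]"]) for word in text.lower().split()]

print(tokenize("Hello world"))     # [1, 2]
print(tokenize("hello stranger"))  # [1, 0] -- unknown word maps to [UNK]
```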
[ ]
!pip install datasets wandb evaluate accelerate -qU
!wget https://raw.githubusercontent.com/huggingface/transformers/master/examples/pytorch/text-classification/run_glue.py
[ ]
# the run_glue.py script requires transformers dev
!pip install -q git+https://github.com/huggingface/transformers
We finally make sure we're logged into W&B so that our experiments can be associated to our account.
[ ]
import wandb
[ ]
wandb.login()
💡 Configuration tips
The W&B integration with Hugging Face can be configured to add extra functionality:
auto-logging of models as artifacts: just set the environment variable WANDB_LOG_MODEL to true
log histograms of gradients and parameters: gradients are logged by default; you can also log parameters by setting the environment variable WANDB_WATCH to all
set custom run names with the run_name argument, available in the example scripts and as part of TrainingArguments
organize runs by project with the WANDB_PROJECT environment variable
For more details, refer to the W&B + Hugging Face integration documentation.
Let's log every trained model.
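Putting the tips above together, one way to apply these settings before training is via environment variables in Python (the project name below is a hypothetical placeholder; pick your own):

```python
import os

# Save every trained model as a W&B artifact.
os.environ["WANDB_LOG_MODEL"] = "true"

# Log histograms of gradients AND parameters (gradients alone is the default).
os.environ["WANDB_WATCH"] = "all"

# Group runs under one project (hypothetical name -- replace with yours).
os.environ["WANDB_PROJECT"] = "hf-glue-demo"
```

Set these before the Trainer is created so the integration picks them up.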