🛠 Installation and set-up
We need the following 🤗 Hugging Face libraries:

transformers provides the training API and many pre-trained models
tokenizers is installed automatically with transformers and tokenizes our data, i.e. converts text to sequences of numbers (see the short sketch after this list)
datasets gives easy access to a rich collection of datasets and common metrics, perfect for prototyping
We also install wandb to automatically instrument our training.
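
To make "tokenizing" concrete, here is a minimal illustrative sketch (it assumes transformers is already installed, as in the cells below, and uses bert-base-uncased purely as an example checkpoint):


[ ]
from transformers import AutoTokenizer

# any pre-trained checkpoint works the same way; bert-base-uncased is just an example
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Weights & Biases makes experiment tracking easy.")
print(encoded["input_ids"])  # the sentence as a sequence of token ids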


[ ]
!pip install datasets wandb evaluate accelerate -qU
!wget https://raw.githubusercontent.com/huggingface/transformers/master/examples/pytorch/text-classification/run_glue.py

[ ]
# the run_glue.py script requires transformers dev
!pip install -q git+https://github.com/huggingface/transformers
Finally, we make sure we're logged in to W&B so that our experiments can be associated with our account.


[ ]
import wandb


[ ]
wandb.login()
💡 Configuration tips
The W&B integration with Hugging Face can be configured to add extra functionality:

auto-logging of models as artifacts: just set the environment variable WANDB_LOG_MODEL to true
log histograms of gradients and parameters: gradients are logged by default; you can also log parameters by setting the environment variable WANDB_WATCH to all
set custom run names with the run_name argument in the scripts or as part of TrainingArguments
organize runs by project with the WANDB_PROJECT environment variable
For more details, refer to the W&B + Hugging Face integration documentation.
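
For example, organizing runs under a dedicated project and watching parameters as well as gradients could look like the following minimal sketch (the project name is just a placeholder):


[ ]
import os

# group all runs from this notebook under one W&B project (placeholder name)
os.environ["WANDB_PROJECT"] = "my-awesome-project"

# log parameter histograms in addition to the default gradient histograms
os.environ["WANDB_WATCH"] = "all"

# a custom run name could also be passed as TrainingArguments(..., run_name="my-run")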

Let's log every trained model.
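
One way to do that, following the tip above, is to set the corresponding environment variable before training starts (a minimal sketch):


[ ]
import os

# upload each trained model to W&B as an artifact at the end of training
os.environ["WANDB_LOG_MODEL"] = "true"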