Difficulties in dealing with HuggingFace transformers
Hi,

I am currently working with HuggingFace's transformers library. Loading models with it is fairly convenient. I am not a troll. But the deeper I go, the more difficulties arise, and I have gotten the impression that the API is not well designed.

It allows setting the same option in several places, and it is not documented how they interact. For instance, there seems to be no uniform way to handle special tokens such as EOS. One can set these tokens (1) in the model, (2) in the tokenizer, and (3) in the pipeline. It is unclear to me how exactly these settings interact, and the documentation says nothing about it. Sometimes parameters are simply ignored, and the library does not warn you about it. For instance, the tokenizer parameter "add_eos_token" seems to have no effect in some cases, and I am not the only one with this issue (https://github.com/huggingface/transformers/issues/30947).
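To make this concrete, here is a minimal sketch of the three places where the EOS token can be configured. I use "gpt2" only as a small stand-in model id, and the exact defaults and warnings may differ across transformers versions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# 1. In the model: both the model config and the generation config carry an EOS id.
print(model.config.eos_token_id)             # 50256 for gpt2
print(model.generation_config.eos_token_id)  # may or may not match the above

# 2. In the tokenizer: its own, independent notion of the EOS token.
print(tokenizer.eos_token, tokenizer.eos_token_id)

# 3. In the pipeline: generation kwargs such as eos_token_id can be passed per call.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Hello", max_new_tokens=5, eos_token_id=tokenizer.eos_token_id))

# The kwarg from the linked issue: on the versions I have seen, it is accepted
# without complaint here, yet has no effect for this tokenizer class.
tok = AutoTokenizer.from_pretrained("gpt2", add_eos_token=True)
print(tok("Hello")["input_ids"])  # no EOS id appended
```

Which of the three settings wins during generation is exactly the part I cannot find documented.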

Where and how you actually configure options, what effects they have, or which settings work at all seems to depend strongly on the model. This rather contradicts the stated purpose of the API: it is meant to make switching from one model to another easy, giving the impression that everything is controlled by just the model id. But when you dig deeper, it turns out that many small things have to be tailored to the model (even within a single class of models, such as generative text LLMs). A look into the source code of the transformers library confirms that it makes distinctions depending on the model id; internally, the library exploits knowledge about the individual models. That is not what one would expect from a platform that claims to work with arbitrary models.
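You can see this dispatching from outside the library. A minimal sketch (again with "gpt2" as an illustrative id; the resolved class names depend on the checkpoint and library version):

```python
from transformers import AutoTokenizer

# The uniform-looking Auto* factory resolves to a model-specific class.
tok = AutoTokenizer.from_pretrained("gpt2")
print(type(tok).__name__)  # GPT2TokenizerFast, a model-specific class

# A Llama-family checkpoint would resolve to a different tokenizer class, one
# that does implement add_eos_token -- which is one reason the same kwarg
# works for some model ids and is silently ignored for others.
```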

Does anyone else have similar thoughts?