transformers
Auto-generated from huggingface/transformers by Mutable.ai Auto Wiki
transformers | |
---|---|
GitHub Repository | |
Developer | huggingface |
Written in | Python |
Stars | 124k |
Watchers | 1.1k |
Created | 10/29/2018 |
Last updated | 04/03/2024 |
License | Apache License 2.0 |
Homepage | huggingface.co/transformers |
Repository | huggingface/transformers |
Auto Wiki | |
Revision | |
Software Version | 0.0.8Basic |
Generated from | Commit 863e25 |
Generated at | 04/04/2024 |
The Transformers repository is a state-of-the-art machine learning library that provides a comprehensive set of tools and utilities for working with natural language processing (NLP), computer vision, audio, and multimodal tasks. The library is designed to simplify the process of leveraging pre-trained models and fine-tuning them for a wide range of applications.
At the core of the Transformers library is the ability to automatically select and instantiate the appropriate model, configuration, tokenizer, and other components from a checkpoint name or configuration. The …/auto directory contains the implementation of the "Auto" classes, such as AutoModel, AutoTokenizer, and AutoConfig, which handle this automatic selection and instantiation process. This abstraction allows users to work with a variety of pre-trained models without needing to know the specific implementation details of each component.
The Transformers library also provides a flexible and extensible framework for text generation, as demonstrated by the …/generation directory. This directory contains the core functionality for applying various techniques and constraints during the generation process, including beam search, logits processing, stopping criteria, and assisted generation. The GenerationConfig class is the central component for configuring the text generation process, offering a wide range of parameters to control the output.
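To make the roles of logits processing and stopping criteria concrete, here is a minimal sketch of a greedy decoding loop. It is not the library's implementation: toy_model, ban_token_processor, max_length_criterion, and greedy_generate are hypothetical names invented for this illustration, and the "model" is a fixed rule rather than a neural network.

```python
import math

def toy_model(tokens, vocab_size=5):
    """Hypothetical stand-in for a language model: favors (last token + 1) mod vocab_size."""
    nxt = (tokens[-1] + 1) % vocab_size
    return [2.0 if i == nxt else 0.0 for i in range(vocab_size)]

def ban_token_processor(banned):
    """Logits processor: make one token id impossible to select."""
    def process(tokens, logits):
        return [-math.inf if i == banned else x for i, x in enumerate(logits)]
    return process

def max_length_criterion(max_length):
    """Stopping criterion: stop once the sequence holds max_length tokens."""
    return lambda tokens: len(tokens) >= max_length

def greedy_generate(prompt, processors, stopping_criteria):
    tokens = list(prompt)
    while not any(stop(tokens) for stop in stopping_criteria):
        logits = toy_model(tokens)
        for process in processors:          # each processor rewrites the scores in turn
            logits = process(tokens, logits)
        # Greedy step: pick the highest-scoring token (first one on ties).
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

out = greedy_generate([0], [ban_token_processor(3)], [max_length_criterion(6)])
print(out)  # [0, 1, 2, 0, 1, 2]
```

Token 3 never appears because the processor masks it before the selection step, and the loop ends as soon as the length criterion fires. Beam search follows the same pattern but keeps several candidate sequences per step instead of one.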
Another key aspect of the Transformers library is its support for integrating with various third-party libraries and tools. The …/integrations directory contains functionality for enabling the use of different quantization techniques, hardware acceleration, and reporting/monitoring capabilities within the Transformers ecosystem. For example, the aqlm.py module provides a function to replace the Linear layers in a PyTorch model with AQLM (Additive Quantization of Language Models) quantized layers, while the deepspeed.py module integrates the Transformers library with the DeepSpeed deep learning optimization library.
The Transformers library also includes a comprehensive set of example scripts and utilities that showcase its capabilities across a wide range of natural language processing and machine learning tasks. These examples, located in the examples directory, cover a variety of topics, such as Flax examples, PyTorch examples, and TensorFlow examples, as well as research project examples.
Overall, the Transformers repository provides a powerful and flexible framework for working with state-of-the-art machine learning models, making it easier for developers to leverage the latest advancements in natural language processing, computer vision, and other domains.
Preprocessing and Tokenization
References: transformers
The notebooks directory contains a collection of Jupyter notebooks that showcase the functionality and usage of the Transformers library. These notebooks cover a wide range of applications, including natural language processing, computer vision, audio processing, and biological sequence analysis.
Fine-Tuning Models
References: transformers
The notebooks directory contains a collection of Jupyter notebooks that showcase the functionality and usage of the Transformers library. These notebooks cover a wide range of applications, including natural language processing, computer vision, audio processing, and biological sequence analysis.
Transformers Agents
References: transformers
The Transformers Agents API is a key component of the Transformers library, allowing users to create and share custom tools for agents to use. The core functionality is provided by the Agent and Tool classes.
Model Implementations
References: src/transformers/models
The …/models directory contains the implementations of the supported pre-trained models and their associated components, such as configurations, tokenizers, and modeling classes. Each model lives in its own subdirectory, which keeps the complexity of supporting a wide range of architectures manageable.
Auto Factory
References: src/transformers/models/auto/auto_factory.py
The auto_factory.py file in the Transformers library provides the core functionality for the auto-model factory, which is responsible for determining the appropriate model class to instantiate based on the provided configuration.
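The dispatch idea can be sketched with a plain registry that maps configuration classes to model classes. This is only an illustration of the pattern under simplifying assumptions: every class name below (ToyBertConfig, ToyAutoModel, and so on) is hypothetical and none of this is the library's actual code.

```python
# Toy illustration of configuration-driven dispatch; all names are hypothetical.
class ToyBertConfig:
    model_type = "bert"

class ToyGPT2Config:
    model_type = "gpt2"

class ToyBertModel:
    def __init__(self, config):
        self.config = config

class ToyGPT2Model:
    def __init__(self, config):
        self.config = config

# Registry from configuration class to the model class that consumes it.
MODEL_MAPPING = {ToyBertConfig: ToyBertModel, ToyGPT2Config: ToyGPT2Model}

class ToyAutoModel:
    @classmethod
    def from_config(cls, config):
        model_cls = MODEL_MAPPING.get(type(config))
        if model_cls is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        return model_cls(config)

model = ToyAutoModel.from_config(ToyGPT2Config())
print(type(model).__name__)  # ToyGPT2Model
```

The caller never names the concrete model class; adding support for a new architecture only requires a new registry entry.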
Configuration Auto
The configuration_auto.py file in the Transformers library provides the AutoConfig class, which is responsible for automatically loading and instantiating the appropriate configuration class for a given pre-trained model.
BART
References: src/transformers/models/bart/configuration_bart.py, src/transformers/models/bart/tokenization_bart.py, src/transformers/models/bart/tokenization_bart_fast.py, src/transformers/models/bart/modeling_tf_bart.py, src/transformers/models/bart/modeling_flax_bart.py
The transformers/src/transformers/models/bart/configuration_bart.py file contains the BartConfig class, which is used to store and manage the configuration parameters of the BART (Bidirectional and Auto-Regressive Transformers) model. This class inherits from PretrainedConfig and defines various configuration parameters, such as the vocabulary size, model dimensions, number of layers, attention heads, and dropout rates. The BartOnnxConfig class is also defined in this file, which is used to configure the BART model for ONNX (Open Neural Network Exchange) export and inference.
BERT
References: src/transformers/models/bert/configuration_bert.py, src/transformers/models/bert/tokenization_bert.py, src/transformers/models/bert/tokenization_bert_tf.py, src/transformers/models/bert/modeling_bert.py, src/transformers/models/bert/modeling_tf_bert.py, src/transformers/models/bert/modeling_flax_bert.py
The transformers/src/transformers/models/bert/configuration_bert.py file defines the BertConfig class, which is used to store the configuration of a BERT model and instantiate a BERT model with the specified arguments. The BertConfig class inherits from the PretrainedConfig class and takes several arguments that define the architecture and hyperparameters of the BERT model, such as the size of the vocabulary, the dimensionality of the hidden layers, the number of attention heads, the activation function, the dropout rates, and the type of position embeddings.
BARThez
References: src/transformers/models/barthez/tokenization_barthez.py, src/transformers/models/barthez/tokenization_barthez_fast.py
The BarthezTokenizer and BarthezTokenizerFast classes in …/tokenization_barthez.py and …/tokenization_barthez_fast.py respectively, provide the tokenization functionality for BARThez, a BART-style model for French.
BEiT
References: src/transformers/models/beit/configuration_beit.py, src/transformers/models/beit/feature_extraction_beit.py, src/transformers/models/beit/image_processing_beit.py, src/transformers/models/beit/modeling_beit.py, src/transformers/models/beit/modeling_flax_beit.py
The transformers/src/transformers/models/beit/configuration_beit.py file contains the configuration class for the BEiT (Bidirectional Encoder representation from Image Transformers) model. The BeitConfig class is used to store the configuration of a BeitModel and to instantiate the model with the specified arguments, defining the model architecture. The file also includes the BeitOnnxConfig class, which is used for ONNX (Open Neural Network Exchange) configuration of the BEiT model.
BigBird
References: src/transformers/models/big_bird/configuration_big_bird.py, src/transformers/models/big_bird/tokenization_big_bird.py, src/transformers/models/big_bird/tokenization_big_bird_fast.py
The BigBirdConfig class in …/configuration_big_bird.py is the main configuration class for the BigBird model. It inherits from the PretrainedConfig class and allows users to customize various aspects of the BigBird model, such as the vocabulary size, hidden size, number of attention heads, and attention type.
Specialized Kernels and Operations
References: src/transformers/kernels
The Transformers library provides a set of highly optimized and specialized kernels and operations that are critical for the efficient execution of Transformer-based models on GPU hardware. These kernels and operations leverage the parallel processing capabilities of GPUs to accelerate various computations used in Transformer-based models.
Multi-Scale Deformable Attention
The multi-scale deformable attention mechanism is a key component of the Deformable DETR object detection model. This mechanism allows the model to attend to relevant features at different scales, which is important for detecting objects of varying sizes in an image.
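The core computation can be sketched in a few lines: for one query, a small set of points is sampled from feature maps at several scales by bilinear interpolation, and the samples are combined with attention weights. This is a simplified illustration, assuming a single head, scalar features, and sampling points that are already given; the function names are hypothetical, and the coordinate convention here is a simplification that may differ from the CUDA kernel.

```python
def bilinear_sample(fmap, x, y):
    """Sample a 2D list-of-lists feature map at normalized (x, y), both in [0, 1]."""
    h, w = len(fmap), len(fmap[0])
    px, py = x * (w - 1), y * (h - 1)          # continuous pixel coordinates
    x0, y0 = int(px), int(py)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = px - x0, py - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def deformable_attention(levels, points, weights):
    """levels[l]: feature map at scale l; points[l]: (x, y) samples; weights[l]: their weights."""
    out = 0.0
    for fmap, pts, wts in zip(levels, points, weights):
        for (x, y), a in zip(pts, wts):
            out += a * bilinear_sample(fmap, x, y)
    return out

levels = [[[0.0, 1.0], [2.0, 3.0]], [[10.0]]]   # a 2x2 map and a 1x1 map
points = [[(0.5, 0.5)], [(0.0, 0.0)]]           # one sampling point per level
weights = [[0.5], [0.5]]                        # attention weights sum to 1
result = deformable_attention(levels, points, weights)
print(result)  # 5.75
```

Because each query touches only a handful of sampled locations per scale rather than every pixel, the cost does not grow with the full feature-map size.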
YOSO (You Only Sample (Almost) Once)
References: src/transformers/kernels/yoso
The YOSO (You Only Sample (Almost) Once) kernels in the Transformers library implement efficient Locality Sensitive Hashing (LSH) based computations, which are crucial for the performance and scalability of the YOSO transformer model.
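As a rough illustration of the hashing primitive, the sketch below computes random-hyperplane LSH codes in plain Python. It only shows why vectors pointing in similar directions tend to share a hash code; it is not the YOSO attention estimator or the CUDA kernel, and make_hyperplanes and lsh_code are hypothetical helper names.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane normal per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def lsh_code(vec, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its positive side."""
    return tuple(
        int(sum(h_i * v_i for h_i, v_i in zip(h, vec)) > 0) for h in hyperplanes
    )

planes = make_hyperplanes(num_bits=8, dim=4)
q = [1.0, 0.5, -0.2, 0.3]
k_near = [2.0, 1.0, -0.4, 0.6]     # same direction as q
k_far = [-1.0, -0.5, 0.2, -0.3]    # opposite direction

code_q = lsh_code(q, planes)
code_near = lsh_code(k_near, planes)
code_far = lsh_code(k_far, planes)
print(code_q == code_near)  # True: same direction, same bucket
```

The probability that two vectors agree on a bit shrinks as the angle between them grows, which is what lets hash collisions stand in for similarity scores.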
Miscellaneous Kernels and Operations
References: src/transformers/kernels/mra
The transformers/src/transformers/kernels/mra directory contains several CUDA kernel functions used by the MRA (Multi Resolution Analysis) attention model.
Weighted Key-Value (WKV) Operation
References: src/transformers/kernels/rwkv
The …/wkv_op.cpp file contains the C++ bindings that expose the CUDA implementation of the Weighted Key-Value (WKV) operation, which is a key component of the RWKV (Receptance Weighted Key Value) model.
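For intuition, here is a naive single-channel version of the WKV computation, following the RWKV-4 formulation as it is commonly stated: each output is a weighted average of past values, with weights that decay over time, plus a bonus for the current token. This is a sketch under that assumption and not the kernel itself; the real implementation uses a numerically stable recurrence instead of the quadratic loop below, and wkv_naive is a hypothetical name.

```python
import math

def wkv_naive(w, u, k, v):
    """Naive single-channel WKV.

    w: per-step decay rate (w >= 0), u: bonus applied to the current token,
    k, v: lists of keys and values over time.
    """
    out = []
    for t in range(len(k)):
        # Contribution of the current token, boosted by u.
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        # Contributions of earlier tokens, decayed by their distance from t.
        for i in range(t):
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out

print(wkv_naive(1.0, 0.0, [0.0, 0.0], [2.0, 4.0]))  # [2.0, 3.0]
```

At the first step only the current token contributes, so the output equals its value; at the second step the previous and current tokens carry equal weight in this example, giving their mean.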
Integration with Third-Party Libraries
References: src/transformers/integrations
The …/integrations directory provides functionality for integrating the Transformers library with various third-party libraries and tools, enabling the use of quantization, hardware acceleration, and reporting/monitoring capabilities within the Transformers ecosystem.
AQLM Integration
References: src/transformers/integrations/aqlm.py
The AQLM (Additive Quantization of Language Models) integration provides functionality to replace the nn.Linear layers in a PyTorch model with AQLM-quantized layers. This can be used to reduce the model size and improve inference performance.
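The general layer-replacement pattern can be sketched without PyTorch: walk the module tree, swap each linear layer for a quantized counterpart of the same shape, and leave explicitly skipped modules alone. Everything below is a hypothetical stand-in (Module, Linear, QuantizedLinear, replace_linear) meant to show the traversal, not the integration's actual code.

```python
class Module:
    """Hypothetical minimal module with named children (stand-in for torch.nn.Module)."""
    def __init__(self, **children):
        self.children = dict(children)

class Linear(Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features

class QuantizedLinear(Linear):
    """Placeholder for a quantized layer with the same input and output sizes."""

def replace_linear(module, skip=(), prefix=""):
    """Recursively swap Linear children for QuantizedLinear; return the number replaced."""
    replaced = 0
    for name, child in list(module.children.items()):
        full_name = f"{prefix}{name}"
        if type(child) is Linear and full_name not in skip:
            module.children[name] = QuantizedLinear(child.in_features, child.out_features)
            replaced += 1
        else:
            replaced += replace_linear(child, skip, prefix=f"{full_name}.")
    return replaced

model = Module(
    encoder=Module(proj=Linear(16, 16)),
    lm_head=Linear(16, 100),
)
count = replace_linear(model, skip={"lm_head"})
print(count)  # 1
```

Keeping a skip list is a common design choice because some layers, such as an output head, are often left in full precision.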
AWQ Integration
References: src/transformers/integrations/awq.py
The AWQ (Activation-aware Weight Quantization) integration offers functions to replace Linear layers with AWQ-quantized layers, fuse certain modules to improve inference performance, and handle post-initialization steps for the ExLlama kernel backend.
Bitsandbytes Integration
References: src/transformers/integrations/bitsandbytes.py
The Bitsandbytes integration includes functions to replace Linear and Conv1D layers with 8-bit or 4-bit quantized layers from the Bitsandbytes library, and utilities to manage quantized tensors.
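To show what 8-bit quantization of a weight row means at its simplest, the sketch below applies generic absmax quantization: values are scaled by the row's largest magnitude and rounded to integers in [-127, 127]. This is a textbook illustration only; the Bitsandbytes kernels are more elaborate, and quantize_absmax and dequantize are hypothetical helper names.

```python
def quantize_absmax(row):
    """Map floats to integers in [-127, 127] using the row's absolute maximum as scale."""
    scale = max(abs(x) for x in row) / 127 or 1.0   # fall back to 1.0 for an all-zero row
    return [round(x / scale) for x in row], scale

def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]

row = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_absmax(row)
restored = dequantize(q, scale)
print(q)  # [50, -127, 0, 100]
```

Each restored value differs from the original by at most half a quantization step, which is the precision traded for storing one byte per weight.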
DeepSpeed Integration
References: src/transformers/integrations/deepspeed.py
The DeepSpeed integration provides classes and functions to integrate the Transformers library with the DeepSpeed deep learning optimization library, enabling the use of DeepSpeed's features such as ZeRO-3 and mixed-precision training.
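A minimal configuration sketch, assuming training is driven through the Trainer with a DeepSpeed config: the dictionary below enables ZeRO stage 3 and leaves several values as "auto" so they can be filled in from the training arguments. The selection of keys is illustrative and far from a complete or tuned configuration.

```python
import json

# Illustrative minimal ZeRO-3 configuration; "auto" values are meant to be
# resolved from the training arguments rather than hard-coded here.
ds_config = {
    "zero_optimization": {"stage": 3},
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

# Serialize to the JSON form that would be saved to a config file.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```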
Integration Utilities
The Integration Utilities module defines a set of utility functions and callback classes to enable integration with various machine learning reporting and hyperparameter optimization tools, such as TensorBoard, Weights & Biases, Optuna, and more.
PEFT Integration
References: src/transformers/integrations/peft.py
The PEFT (Parameter-Efficient Fine-Tuning) integration includes a mixin class PeftAdapterMixin that allows loading, training, and using PEFT adapters in Transformer-based models. The mixin supports various PEFT methods, such as Low Rank Adapters (LoRA), IA3, and AdaLoRA.
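The idea behind a low-rank adapter can be sketched with plain lists: the frozen weight W is left untouched, and a trainable low-rank product B·A, scaled by alpha / r, is added to its output. This is a sketch of the LoRA arithmetic only, with hypothetical helper names (matvec, lora_forward), and says nothing about how PeftAdapterMixin wires adapters into a model.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def lora_forward(W, A, B, x, alpha):
    """y = W x + (alpha / r) * B (A x), with A of shape r x d_in and B of shape d_out x r."""
    r = len(A)
    base = matvec(W, x)                 # frozen pre-trained path
    update = matvec(B, matvec(A, x))    # trainable low-rank path
    return [b + (alpha / r) * u for b, u in zip(base, update)]

W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]                        # rank r = 1
x = [2.0, 4.0]

print(lora_forward(W, A, [[0.0], [0.0]], x, alpha=2.0))  # [2.0, 4.0]
print(lora_forward(W, A, [[1.0], [0.0]], x, alpha=2.0))  # [8.0, 4.0]
```

With B initialized to zero the adapter leaves the base model's output unchanged, so training starts from the pre-trained behavior and only the small A and B matrices need to be learned and stored.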
Quanto Integration
References: src/transformers/integrations/quanto.py
The Quanto integration provides a function to replace Linear and LayerNorm layers in a PyTorch model with Quanto-quantized layers, enabling efficient quantization of the model.
TPU Integration
References: src/transformers/integrations/tpu.py
The TPU integration contains a function to handle the integration of Transformer models with Tensor Processing Units (TPUs) using the PyTorch XLA library, enabling efficient data loading and processing on TPUs.
Example Scripts and Utilities
References: examples
The examples directory contains a comprehensive set of example scripts and utilities that showcase the capabilities of the Transformers library across a wide range of natural language processing and machine learning tasks.
Flax Examples
References: examples/flax
The …/flax directory contains a comprehensive set of example scripts and utilities that showcase the capabilities of the Transformers library using the JAX/Flax backend. The examples cover natural language processing, speech recognition, and vision tasks, described in the subsections that follow.
Language Modeling
References: examples/flax/language-modeling
The language modeling examples cover the pretraining and fine-tuning of various Transformer-based language models, including Masked Language Modeling (MLM), Causal Language Modeling (CLM), Span-Masked Language Modeling (T5-like), and Denoising Language Modeling (BART).
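The difference between the masked and causal objectives comes down to how inputs and labels are built. The sketch below assumes the common BERT-style recipe (select about 15% of positions; of those, 80% become the mask token, 10% a random token, 10% stay unchanged) and uses -100 to mark positions excluded from the loss. MASK_ID, mlm_mask, and clm_pairs are hypothetical names, and the example scripts delegate this work to data collators rather than code like this.

```python
import random

MASK_ID, IGNORE = 103, -100   # hypothetical mask token id; -100 marks ignored positions

def mlm_mask(input_ids, vocab_size, prob=0.15, seed=0):
    """Masked LM: corrupt a random subset of tokens and ask the model to restore them."""
    rng = random.Random(seed)
    inputs, labels = list(input_ids), [IGNORE] * len(input_ids)
    for i, tok in enumerate(input_ids):
        if rng.random() < prob:
            labels[i] = tok                              # predict the original token here
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = MASK_ID                      # 80%: replace with the mask token
            elif roll < 0.9:
                inputs[i] = rng.randrange(vocab_size)    # 10%: replace with a random token
            # remaining 10%: keep the token unchanged
    return inputs, labels

def clm_pairs(input_ids):
    """Causal LM: every position predicts the token that follows it."""
    return input_ids[:-1], input_ids[1:]

ids = [11, 12, 13, 14, 15, 16]
masked_inputs, masked_labels = mlm_mask(ids, vocab_size=1000)
print(clm_pairs(ids))  # ([11, 12, 13, 14, 15], [12, 13, 14, 15, 16])
```

Span-masked (T5-like) and denoising (BART) objectives follow the same principle but corrupt contiguous spans and train the model to reconstruct them.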
Question Answering
References: examples/flax/question-answering
The question answering examples demonstrate how to fine-tune a Transformer-based model for question-answering tasks using the Flax library.
Speech Recognition
References: examples/flax/speech-recognition
The speech recognition examples show how to fine-tune Flax-based speech recognition models, including the Whisper model from OpenAI.
Summarization
References: examples/flax/summarization
The summarization examples showcase how to fine-tune Transformer-based models for text summarization tasks using the Flax library.
Text Classification
References: examples/flax/text-classification
The text classification examples in the …/text-classification directory demonstrate how to fine-tune Transformer models on text classification tasks from the GLUE benchmark using the Flax library.
Token Classification
References: examples/flax/token-classification
The token classification examples in the Transformers library demonstrate how to fine-tune Transformer-based models on token classification tasks, such as Named Entity Recognition (NER), using the Flax library.
Vision
References: examples/flax/vision
The vision examples demonstrate how to fine-tune a Vision Transformer (ViT) model for image classification using the Flax library.
PyTorch Examples
References: examples/pytorch
The …/pytorch directory contains a comprehensive set of example scripts and utilities that showcase the capabilities of the Transformers library using the PyTorch backend, covering a wide range of natural language processing, computer vision, and speech recognition tasks.
Audio Classification
References: examples/pytorch/audio-classification
The audio classification examples demonstrate how to fine-tune the Wav2Vec2 model for audio classification tasks using PyTorch. The key functionality is provided in the run_audio_classification.py script.
Contrastive Image-Text
References: examples/pytorch/contrastive-image-text
The Contrastive Image-Text examples in the Transformers repository demonstrate how to train a CLIP-like vision-text dual encoder model using pre-trained vision and text encoders. This model can be used for natural language image search and potentially zero-shot image classification.
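The training signal for a CLIP-like dual encoder can be sketched as a symmetric cross-entropy over an image-text similarity matrix, where matching pairs sit on the diagonal. The code is a plain-Python illustration with hypothetical function names; in practice the similarities come from the two encoders and are scaled by a learned temperature, which is omitted here.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

def clip_loss(sim):
    """sim[i][j]: similarity of image i and text j; pair i matches text i."""
    n = len(sim)
    image_to_text = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    text_to_image = sum(
        cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

aligned = [[10.0, 0.0], [0.0, 10.0]]      # matching pairs score highest
misaligned = [[0.0, 10.0], [10.0, 0.0]]   # mismatched pairs score highest
print(clip_loss(aligned) < clip_loss(misaligned))  # True
```

Minimizing this loss pulls each image toward its own caption and away from the other captions in the batch, which is what makes the learned embeddings usable for image search.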
Image Classification
References: examples/pytorch/image-classification
The image classification examples in the Transformers library demonstrate how to fine-tune various image classification models using PyTorch. The two main scripts in this directory are run_image_classification.py and run_image_classification_no_trainer.py.
Image Pretraining
References: examples/pytorch/image-pretraining
The image pretraining examples contain scripts and examples for pre-training Transformer-based vision models, such as Vision Transformer (ViT) and Swin Transformer, on custom image data using self-supervised learning techniques.
Language Modeling
References: examples/pytorch/language-modeling
The language modeling examples cover fine-tuning and training various Transformer-based language models, such as GPT, GPT-2, ALBERT, BERT, DistilBERT, RoBERTa, and XLNet, on text datasets.
Multiple Choice
References: examples/pytorch/multiple-choice
The multiple choice examples demonstrate how to fine-tune pre-trained Transformer-based models on multiple-choice tasks, specifically using the SWAG (Situations With Adversarial Generations) dataset.
Question Answering
References: examples/pytorch/question-answering
The question answering examples show how to fine-tune Transformer models on question-answering (QA) datasets, such as SQuAD.
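For extractive datasets such as SQuAD, the model emits a start score and an end score for every token, and the answer is the span with the best combined score. The sketch below shows that selection step in its simplest form; best_span is a hypothetical helper, and real post-processing also maps token indices back to character offsets and handles unanswerable questions.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return the (start, end) token indices maximizing start score + end score."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_logits):
        # The end must not precede the start, and the span length is capped.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

start = [0.1, 5.0, 0.2, 0.0]
end = [0.0, 0.1, 4.0, 6.0]
print(best_span(start, end))                     # (1, 3)
print(best_span(start, end, max_answer_len=2))   # (1, 2)
```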
Semantic Segmentation
References: examples/pytorch/semantic-segmentation
The semantic segmentation examples contain scripts for fine-tuning Transformer-based models for semantic segmentation tasks. The two main scripts are run_semantic_segmentation.py and run_semantic_segmentation_no_trainer.py, which demonstrate different approaches to fine-tuning the models.
Speech Pretraining
References: examples/pytorch/speech-pretraining
The speech pretraining example includes a script for pretraining a Wav2Vec2 model on unlabeled audio data using the Wav2Vec2 contrastive loss objective.
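The contrastive objective can be sketched as follows: for a masked time step, the context vector should be more similar to the true target vector than to a set of distractors, measured by cosine similarity and a temperature. This is a simplified illustration with hypothetical function names and made-up vectors; it leaves out other parts of the pretraining setup, such as how targets and distractors are produced.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def contrastive_loss(context, positive, distractors, temperature=0.1):
    """Cross-entropy of picking the positive among the positive plus distractors."""
    logits = [cosine(context, positive) / temperature]
    logits += [cosine(context, d) / temperature for d in distractors]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

context = [1.0, 0.0]
good = contrastive_loss(context, positive=[0.9, 0.1], distractors=[[0.0, 1.0], [-1.0, 0.0]])
bad = contrastive_loss(context, positive=[0.0, 1.0], distractors=[[0.9, 0.1], [-1.0, 0.0]])
print(good < bad)  # True
```

The loss is small when the context vector points toward the true target and large when a distractor is the closer match.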
Speech Recognition
References: examples/pytorch/speech-recognition
The speech recognition examples demonstrate how to fine-tune various Transformer-based models for automatic speech recognition (ASR) tasks. The examples cover three main approaches: Connectionist Temporal Classification (CTC), CTC with Adapter Layers, and Sequence-to-Sequence (Seq2Seq) models.
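For the CTC approach, the model predicts one label per audio frame, including a special blank label, and the transcript is recovered by collapsing repeated labels and dropping blanks. A minimal sketch of that greedy decoding rule, assuming the blank has id 0 (ctc_greedy_decode is a hypothetical name):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse consecutive repeats, then remove blank labels."""
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

# Frames: 1 1 _ 1 2 2 _ _ 3  (with _ as the blank)
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 0, 3]))  # [1, 1, 2, 3]
```

The blank between the two runs of label 1 is what allows the same label to appear twice in a row in the output.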
Summarization
References: examples/pytorch/summarization
The summarization examples contain scripts for fine-tuning and evaluating Transformer-based models on text summarization tasks. The main functionality is provided by two scripts: run_summarization.py and run_summarization_no_trainer.py.
Text Classification
References: examples/pytorch/text-classification
The text classification examples include scripts for fine-tuning pre-trained Transformer models on various text classification tasks, including the GLUE benchmark, single/multi-label classification, and the XNLI task.
Text Generation
References: examples/pytorch/text-generation
The text generation examples demonstrate conditional text generation with various auto-regressive models, such as GPT, GPT-2, GPT-J, Transformer-XL, XLNet, CTRL, BLOOM, LLaMA, and OPT.
Token Classification
References: examples/pytorch/token-classification
The token classification examples contain scripts and utilities for fine-tuning pre-trained Transformer models on token classification tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging, or chunking (phrase extraction).
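A recurring preprocessing step in these tasks is aligning word-level labels with subword tokens, since one word may be split into several tokens. The sketch below assumes the tokenizer reports, for each token, the index of the word it came from (None for special tokens) and uses -100 for positions the loss should ignore; align_labels and its arguments are hypothetical names for illustration.

```python
def align_labels(word_ids, word_labels, label_all_tokens=False):
    """Expand word-level labels to token level, ignoring special tokens and word continuations."""
    aligned, prev = [], None
    for wid in word_ids:
        if wid is None:
            aligned.append(-100)                 # special tokens are ignored
        elif wid != prev:
            aligned.append(word_labels[wid])     # first token of a word carries its label
        else:
            # Later pieces of the same word are either labeled too or ignored.
            aligned.append(word_labels[wid] if label_all_tokens else -100)
        prev = wid
    return aligned

word_ids = [None, 0, 1, 1, 2, None]   # the second word was split into two tokens
word_labels = [3, 0, 7]
print(align_labels(word_ids, word_labels))  # [-100, 3, 0, -100, 7, -100]
```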
Translation
References: examples/pytorch/translation
The translation examples include scripts for fine-tuning and evaluating various Transformer-based models on translation tasks. The main functionality is provided by the run_translation.py script, which uses the Seq2SeqTrainer from the Transformers library to handle the training and evaluation process.
TensorFlow Examples
References: examples/tensorflow
The …/ directory contains a collection of example scripts and utilities that demonstrate how to use the Transformers library for various natural language processing tasks in a TensorFlow environment.
Benchmarking
References: examples/tensorflow/benchmarking
The benchmarking examples provide functionality for running performance benchmarks of the Transformers library in a TensorFlow environment, including plotting the results.
Contrastive Image-Text
References: examples/tensorflow/contrastive-image-text
The Contrastive Image-Text examples in the Transformers library showcase how to fine-tune a CLIP-like vision-text dual encoder model on the COCO dataset using TensorFlow.
Image Classification
References: examples/tensorflow/image-classification
The …/image-classification directory contains a script that demonstrates how to fine-tune a Vision Transformer (ViT) model on the "beans" dataset from the Hugging Face Hub using TensorFlow.
Language Modeling
References: examples/tensorflow/language-modeling
The language modeling examples in the …/language-modeling directory demonstrate how to pre-train and fine-tune Transformer-based language models using the Hugging Face Transformers library in TensorFlow. The examples cover two main types of language models: masked language models (MLMs) and causal language models (CLMs).
Language Modeling (TPU)
References: examples/tensorflow/language-modeling-tpu
The language modeling (TPU) examples showcase how to train Transformer-based language models on TPUs using TensorFlow, including dataset preprocessing and tokenizer training.
Multiple Choice
References: examples/tensorflow/multiple-choice
The multiple choice examples fine-tune pre-trained models on the SWAG (Situations With Adversarial Generations) multiple-choice dataset using TensorFlow.
Question Answering
References: examples/tensorflow/question-answering
The question answering examples in the Transformers library demonstrate how to fine-tune pre-trained Transformer models on question-answering datasets like SQuAD using TensorFlow.
Summarization
References: examples/tensorflow/summarization
The summarization examples in the …/summarization directory demonstrate how to fine-tune Transformer-based models for text summarization tasks using the Hugging Face Transformers library and the TensorFlow framework.
Text Classification
References: examples/tensorflow/text-classification
The text classification examples in the Transformers library demonstrate how to fine-tune pre-trained Transformer-based models for text classification tasks using TensorFlow. The two main scripts in this directory are run_text_classification.py and run_glue.py.
Token Classification
References: examples/tensorflow/token-classification
The token classification examples in the …/token-classification directory demonstrate how to fine-tune Transformer-based models for token classification tasks, such as Named Entity Recognition (NER), using TensorFlow.
Translation
References: examples/tensorflow/translation
The translation examples in the …/translation directory demonstrate how to fine-tune Transformer-based models for translation tasks using the Hugging Face Transformers library in TensorFlow.
Research Project Examples
References: transformers
The …/research_projects directory contains a collection of research projects and experiments that showcase the extensibility and versatility of the Transformers library. These projects cover a wide range of advanced natural language processing and machine learning tasks, demonstrating the library's ability to be adapted and extended for cutting-edge research.