composer

Language: Python
Created: 10/12/2021
Last updated: 04/27/2024
License: Apache License 2.0

autowiki
Software version: 0.0.8 Basic
Generated from commit: ba84d8
Generated on: 04/27/2024

Composer is a machine learning training framework built around a flexible, extensible training pipeline. It provides algorithms, callbacks, and utilities that plug into the training process to improve efficiency, robustness, and model performance.

The most important components of the Composer library are located in the composer directory. This directory contains the central Trainer class, which is responsible for coordinating the execution of the training process, as well as the Algorithm and Callback base classes that allow users to define custom training algorithms and monitoring components.

The …/algorithms directory houses a collection of efficiency-oriented algorithms. They cover data augmentation, layer transformations, gradient clipping and normalization, and position encoding and sequence length handling. Applied during training, these algorithms are intended to improve the performance and robustness of models.

Another crucial aspect of the Composer library is its callback system, which is implemented in the …/callbacks directory. This system allows users to customize and monitor the training process by defining their own callback classes that hook into various events during the training loop. The callbacks can be used for tasks such as monitoring activation values, logging evaluation outputs, and managing checkpoints and early stopping.

The Composer library also provides a comprehensive logging infrastructure in the …/loggers directory, which enables users to log various types of data to different platforms and services during the training process. This includes support for local file logging, in-memory logging, and integration with popular platforms like Weights and Biases and Neptune.ai.

To support model training, the library includes the …/models directory, which contains the ComposerModel base class and the HuggingFaceModel class for integrating Hugging Face Transformer models. The …/datasets directory provides the core functionality for working with in-context learning datasets, including question-answering, language modeling, and code evaluation tasks.

Finally, the …/trainer directory houses the core Trainer class, which is responsible for managing the overall training process, including tasks such as distributed training, checkpoint management, and integration with external libraries like DeepSpeed.

In summary, Composer combines a central Trainer with pluggable algorithms, callbacks, and loggers, so users can customize, monitor, and speed up model training without modifying the core training logic.

Core Functionality

References: composer/core

The central components and utilities that enable the training pipeline in the Composer library include the core Engine, base classes for Algorithms and Callbacks, and various data, model, and precision management utilities.

Engine and Training Coordination

References: composer/core/engine.py, composer/core/event.py, composer/core/passes.py

The Engine class is the core component that drives the training process in the Composer library. It is responsible for coordinating the execution of algorithms and callbacks during the training loop.
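
As a rough illustration of this coordination, the sketch below dispatches one event at a time to stand-in algorithms and callbacks. Every class name here (`MiniEngine`, `HalveLROnBatchEnd`, `EventRecorder`) is hypothetical, and the ordering (matching algorithms first, then callbacks) is an assumption of the sketch rather than a description of Composer's Engine.

```python
from enum import Enum


class Event(Enum):
    # A tiny subset of training-loop events, for illustration only.
    BATCH_START = "batch_start"
    BATCH_END = "batch_end"


class HalveLROnBatchEnd:
    """Stand-in algorithm: says when it runs (match) and what it does (apply)."""

    def match(self, event, state):
        return event == Event.BATCH_END

    def apply(self, event, state):
        state["lr"] *= 0.5


class EventRecorder:
    """Stand-in callback: observes every event without changing the state."""

    def __init__(self):
        self.seen = []

    def run_event(self, event, state):
        self.seen.append((event.value, state["lr"]))


class MiniEngine:
    def __init__(self, state, algorithms, callbacks):
        self.state = state
        self.algorithms = algorithms
        self.callbacks = callbacks

    def run_event(self, event):
        # Assumed ordering: matching algorithms first, then all callbacks.
        for algorithm in self.algorithms:
            if algorithm.match(event, self.state):
                algorithm.apply(event, self.state)
        for callback in self.callbacks:
            callback.run_event(event, self.state)


recorder = EventRecorder()
engine = MiniEngine({"lr": 0.1}, [HalveLROnBatchEnd()], [recorder])
engine.run_event(Event.BATCH_START)
engine.run_event(Event.BATCH_END)
print(recorder.seen)
```

The point of the design is that the training loop only fires events; what happens at each event is decided by the registered algorithms and callbacks.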

Algorithms and Callbacks

References: composer/core/algorithm.py, composer/core/callback.py

The base classes for defining custom Algorithms and Callbacks, which can be used to extend the training pipeline, are defined in the …/algorithm.py and …/callback.py files, respectively.

Data and Evaluation Management

References: composer/core/data_spec.py, composer/core/evaluator.py, composer/core/precision.py

The DataSpec class, defined in …/data_spec.py, serves as a container for information about a dataset used in training or evaluation. It provides a standardized way to work with different data loaders, handling details like batch size, number of samples, and device-specific transformations.

Serialization and State Management

References: composer/core/serializable.py, composer/core/state.py

The Serializable class in …/serializable.py is the primary component for serializing and deserializing objects in the Composer library. It defines two key methods, state_dict() and load_state_dict(), for saving and restoring an object's state.
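
A minimal sketch of this save-and-restore pattern follows. It assumes the two methods use the PyTorch-style names `state_dict()` and `load_state_dict()`; the `Serializable` class below is a stand-in written for this example, and `StepCounter` is a hypothetical subclass.

```python
class Serializable:
    """Stand-in interface: objects that can save and restore their own state."""

    def state_dict(self):
        return {}

    def load_state_dict(self, state):
        pass


class StepCounter(Serializable):
    """Hypothetical component whose progress should survive a restart."""

    def __init__(self):
        self.steps = 0

    def state_dict(self):
        return {"steps": self.steps}

    def load_state_dict(self, state):
        self.steps = state["steps"]


counter = StepCounter()
counter.steps = 42
saved = counter.state_dict()  # a plain dict, ready to be written to a checkpoint

restored = StepCounter()
restored.load_state_dict(saved)
print(restored.steps)
```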

Utilities and Helpers

References: composer/core/time.py, composer/core/types.py

The …/time.py file provides utility classes and functions for managing time-related aspects of the training process in the Composer library.
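
To make the idea of unit-aware training time concrete, here is a small sketch that parses strings such as "10ep" into a value and a unit. The `TrainingTime` class and the suffix table are assumptions made for this example, not a copy of the classes in …/time.py.

```python
import re
from dataclasses import dataclass

# Unit suffixes assumed for this sketch.
UNITS = {"ep": "epoch", "ba": "batch", "sp": "sample", "tok": "token", "dur": "duration"}


@dataclass(frozen=True)
class TrainingTime:
    value: float
    unit: str

    @classmethod
    def from_timestring(cls, text):
        match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-z]+)\s*", text.lower())
        if match is None or match.group(2) not in UNITS:
            raise ValueError(f"invalid time string: {text!r}")
        number, suffix = match.groups()
        # Durations are fractions of the whole run; other units are whole counts.
        value = float(number) if suffix == "dur" else int(number)
        return cls(value, UNITS[suffix])


print(TrainingTime.from_timestring("10ep"))
print(TrainingTime.from_timestring("0.5dur"))
```

Keeping the unit next to the value lets code compare or convert times explicitly instead of guessing whether a bare number means epochs or batches.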

Algorithms and Techniques

References: composer/algorithms, tests/algorithms

The Composer library provides a wide range of algorithms and techniques to improve the efficiency, robustness, and performance of machine learning models. These algorithms can be applied during the training process to enhance various aspects of the models.

Data Augmentation Algorithms

References: composer/algorithms/mixup, composer/algorithms/cutmix, composer/algorithms/augmix

The Composer library provides several data augmentation algorithms that can improve the robustness and generalization performance of machine learning models. These include MixUp, CutMix, and AugMix.
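
For intuition, the core of MixUp can be sketched in a few lines of plain Python: two examples and their one-hot labels are blended with a weight drawn from a Beta distribution. The function name `mixup_pair` and the default `alpha` are assumptions for this example; the library's implementation operates on tensors and whole batches.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


rng = random.Random(0)  # fixed seed so the example is repeatable
mixed_x, mixed_y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
print(lam, mixed_x, mixed_y)
```

Because inputs and labels share the same weight, the mixed label still sums to one and reflects how much of each example is present.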

Layer Transformations

References: composer/algorithms/squeeze_excite, composer/algorithms/gated_linear_units, composer/algorithms/blurpool

The Composer library provides several algorithms that apply various layer-level transformations to PyTorch models, which can improve the efficiency and performance of the models.

Gradient Clipping and Normalization

References: composer/algorithms/gradient_clipping, composer/algorithms/ghost_batchnorm

The …/gradient_clipping directory contains the core functionality for applying gradient clipping to PyTorch models.
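
As a concrete reference point, norm-based clipping can be sketched without any framework: if the L2 norm of the gradients exceeds a threshold, every gradient is scaled down by the same factor. The helper `clip_by_norm` is a hypothetical name, and this covers only the norm-based variant on a flat list of numbers.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads)  # already within the limit
    scale = max_norm / total_norm
    return [g * scale for g in grads]


print(clip_by_norm([3.0, 4.0], max_norm=1.0))  # norm 5 is scaled down to norm 1
print(clip_by_norm([0.3, 0.4], max_norm=1.0))  # norm 0.5 is left unchanged
```

Scaling all components by one factor preserves the gradient's direction and only limits its length.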

Regularization Techniques

References: composer/algorithms/stochastic_depth, composer/algorithms/label_smoothing, composer/algorithms/ema

This subsection discusses the regularization techniques implemented in the Composer library, such as Stochastic Depth, Label Smoothing, and Exponential Moving Average (EMA), which can help improve the generalization performance of the trained models.

Position Encoding and Sequence Length Handling

References: composer/algorithms/alibi, composer/algorithms/seq_length_warmup

The …/alibi directory contains the implementation of the ALiBi (Attention with Linear Biases) algorithm, which is a technique for encoding position information in transformer-based NLP models without using position embeddings.
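
The bias itself is simple to compute. The sketch below follows the usual ALiBi formulation: each head gets a fixed slope, and the attention score between two positions is penalized in proportion to their distance. The helper names are hypothetical, the slope formula assumes the number of heads is a power of two, and the symmetric distance is a simplification chosen for this example.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... (assumes num_heads is a power of two)."""
    return [2 ** (-8 * (head + 1) / num_heads) for head in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias added to attention scores: -slope * distance between query and key."""
    return [[-slope * abs(q - k) for k in range(seq_len)] for q in range(seq_len)]


slopes = alibi_slopes(8)
print(slopes[0], slopes[-1])
for row in alibi_bias(3, slopes[0]):
    print(row)
```

Since the bias depends only on relative distance, no learned position embedding is needed, which is what lets a model be evaluated on sequences longer than those seen in training.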

Low-Precision Computation

References: composer/algorithms/low_precision_layernorm, composer/algorithms/low_precision_groupnorm

The Composer library provides two algorithms that enable low-precision computation: Low-Precision Layer Normalization and Low-Precision Group Normalization. These algorithms can speed up models on certain hardware, such as GPUs with Tensor Cores, without significantly impacting model accuracy.

Layer Freezing

References: composer/algorithms/layer_freezing

The Layer Freezing algorithm, implemented in the …/layer_freezing directory, is a technique used to improve the performance and efficiency of deep learning models by progressively freezing the earlier layers of the network during training.

Other Techniques

References: composer/algorithms/selective_backprop, composer/algorithms/progressive_resizing

The SelectiveBackprop class in the …/ directory implements the Selective Backprop (SB) algorithm, a technique that speeds up the training of deep learning models by selectively backpropagating on examples with high loss.
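
The selection step can be sketched as follows. This is a deliberately simplified, deterministic version that keeps the highest-loss fraction of a batch; the function name `select_high_loss` and the `keep_fraction` parameter are assumptions for this example, and a real implementation may choose examples probabilistically instead.

```python
def select_high_loss(losses, keep_fraction=0.5):
    """Return the indices of the highest-loss examples to backpropagate on."""
    num_keep = max(1, round(len(losses) * keep_fraction))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:num_keep])


per_example_loss = [0.1, 2.0, 0.5, 1.5]
print(select_high_loss(per_example_loss, keep_fraction=0.5))
```

The saving comes from running the backward pass only on the selected subset, at the cost of an extra forward pass to obtain the per-example losses.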

Callbacks and Logging

References: composer/callbacks, tests/callbacks, composer/loggers, tests/loggers

The Composer library provides a flexible callback system and comprehensive logging infrastructure that allows users to customize and monitor the training process.

Callback System

References: composer/callbacks

The Composer library provides a flexible callback system that allows users to customize and monitor the training process. The core Callback class defines the interface for all callbacks, and the library includes a wide range of concrete callback implementations, each with a specific purpose.

Callback Base Class

References: composer/core/callback.py

The Callback base class defines the interface for all callbacks in the Composer library. Callbacks provide a way to add custom functionality to the training pipeline without directly modifying the core training logic.

Monitoring Callbacks

References: composer/callbacks/activation_monitor.py, composer/callbacks/memory_monitor.py, composer/callbacks/system_metrics_monitor.py, composer/callbacks/optimizer_monitor.py

The Composer library provides several monitoring callbacks that track and log various aspects of the training process, including activation values, memory usage, system metrics, and optimizer behavior.

Logging Callbacks

References: composer/callbacks/eval_output_logging_callback.py, composer/callbacks/image_visualizer.py, composer/callbacks/generate.py

The Composer library provides a comprehensive logging infrastructure that allows users to log various types of data during the training and evaluation process. This includes callbacks that log metrics, images, and evaluation outputs.

Checkpoint and Early Stopping Callbacks

References: composer/callbacks/checkpoint_saver.py, composer/callbacks/early_stopper.py, composer/callbacks/threshold_stopper.py

The Composer library provides two key callbacks for handling model checkpointing and early stopping during the training process: CheckpointSaver and EarlyStopper.
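
The logic behind patience-based early stopping fits in a small class. The sketch below is a generic illustration; `PatienceStopper` and its parameters are hypothetical names, and it does not reproduce the EarlyStopper callback's actual interface.

```python
class PatienceStopper:
    """Stop when a monitored metric has not improved for `patience` evaluations."""

    def __init__(self, patience=2, min_delta=0.0, higher_is_better=True):
        self.patience = patience
        self.min_delta = min_delta
        self.higher_is_better = higher_is_better
        self.best = None
        self.bad_evals = 0

    def should_stop(self, metric):
        if self.best is None:
            improved = True  # the first measurement always sets the baseline
        elif self.higher_is_better:
            improved = metric > self.best + self.min_delta
        else:
            improved = metric < self.best - self.min_delta
        if improved:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = PatienceStopper(patience=2)
decisions = [stopper.should_stop(acc) for acc in [0.50, 0.60, 0.60, 0.59]]
print(decisions)
```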

Inference and Export Callbacks

References: composer/callbacks/export_for_inference.py

The ExportForInferenceCallback class in the …/export_for_inference.py file is responsible for exporting the trained model in various formats for inference. This callback allows users to save the model in either TorchScript or ONNX format, which can be useful for deploying the model in production environments.

Utility Callbacks

References: composer/callbacks/free_outputs.py, composer/callbacks/mlperf.py

The FreeOutputs class is a utility callback that helps reduce peak memory usage during training.

Logging Infrastructure

References: composer/loggers

The Composer library provides a flexible and extensible logging infrastructure for recording various types of data during model training.

Logger and LoggerDestination

References: composer/loggers/logger.py, composer/loggers/logger_destination.py

The Logger class and the abstract LoggerDestination base class define the logging interface and routing in the Composer library.
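
The routing idea can be sketched with two small stand-in classes: a logger that receives every call and fans it out to any number of destinations. The classes below are simplified stand-ins written for this example, and `InMemoryDestination` is a hypothetical destination.

```python
class LoggerDestination:
    """Stand-in base class: a place where logged metrics end up."""

    def log_metrics(self, metrics, step):
        raise NotImplementedError


class InMemoryDestination(LoggerDestination):
    def __init__(self):
        self.data = {}

    def log_metrics(self, metrics, step):
        for name, value in metrics.items():
            self.data.setdefault(name, []).append((step, value))


class Logger:
    """Fans each logging call out to every registered destination."""

    def __init__(self, destinations):
        self.destinations = list(destinations)

    def log_metrics(self, metrics, step):
        for destination in self.destinations:
            destination.log_metrics(metrics, step)


first, second = InMemoryDestination(), InMemoryDestination()
logger = Logger([first, second])
logger.log_metrics({"loss": 1.5}, step=0)
logger.log_metrics({"loss": 1.0}, step=1)
print(first.data)
```

Training code talks only to the logger, so adding or removing a destination does not require touching the code that produces the metrics.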

Concrete Logger Implementations

References: composer/loggers/file_logger.py, composer/loggers/in_memory_logger.py, composer/loggers/wandb_logger.py, composer/loggers/neptune_logger.py

The Composer library provides several concrete LoggerDestination implementations that handle logging to different destinations, including files, in-memory storage, and external platforms like Weights and Biases (WandB) and Neptune.ai.

Remote File Management

References: composer/loggers/remote_uploader_downloader.py

The RemoteUploaderDownloader class is responsible for handling the uploading and downloading of files to and from remote object stores, such as AWS S3 or Google Cloud Storage. This class implements the LoggerDestination interface, allowing it to be used as a destination for logging data during the training process.

Utility Loggers

References: composer/loggers/progress_bar_logger.py, composer/loggers/slack_logger.py

The ProgressBarLogger class, defined in …/progress_bar_logger.py, is responsible for logging metrics to the console and displaying a progress bar during training and evaluation. It uses the TQDM library to create and update the progress bar, which shows information such as the current epoch, batch, and relevant metrics. The logger can be configured to log to either stdout or stderr, and can also be set to log traces. It automatically handles progress bars for both training and evaluation, adjusting the display based on whether the duration is measured in epochs or other time units.

Trainer and Training Utilities

References: composer/trainer, tests/trainer

The Trainer and Training Utilities section of the Composer library focuses on the core Trainer class and related functionality, including distributed training, checkpoint management, and integration with external libraries like DeepSpeed.

Trainer Functionality

References: composer

The core functionality of the Trainer class in the Composer library is to drive the training and evaluation process for machine learning models. The Trainer class handles various aspects of the training pipeline, including initialization, training, evaluation, and checkpoint management.

Distributed Training

References: composer/trainer/dist_strategy.py, tests/trainer/test_ddp.py, tests/trainer/test_fsdp.py, tests/trainer/test_fsdp_checkpoint.py, tests/trainer/test_fsdp_param_groups.py

The Composer library provides robust support for distributed training using Distributed Data Parallelism (DDP) and Fully Sharded Data Parallelism (FSDP).

Checkpoint Management

References: tests/trainer/test_checkpoint.py

The Composer library provides robust checkpoint management functionality, allowing users to save, load, and resume training from checkpoints. This subsection covers the key aspects of this functionality, including support for different file formats, remote storage, and distributed training.

Data Specification and Prediction

References: tests/trainer/test_dataspec.py, tests/trainer/test_predict.py

The DataSpec class in the Composer library is responsible for handling different batch data formats and calculating the number of samples and tokens in a batch. It provides a consistent interface for working with various types of batch data, such as dictionaries, lists, tuples, and tensors.

Integration with External Libraries

References: composer/trainer/_deepspeed.py

The Composer library integrates with DeepSpeed, a popular deep learning optimization library, to enable efficient and scalable training of machine learning models. The integration is handled in the …/_deepspeed.py file, which includes several helper functions that ensure compatibility between the Composer trainer and the DeepSpeed configuration.

Learning Rate Schedule Scaling

References: composer/trainer/_scale_schedule.py, tests/trainer/test_scale_schedule.py

The scale_pytorch_scheduler() function in the …/_scale_schedule.py file is a utility function that allows you to scale the duration of various PyTorch learning rate scheduler objects. This is useful when you want to train a model for a shorter or longer period of time than the original learning rate schedule was designed for, while still maintaining the overall shape of the learning rate curve.
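
For a step-style schedule, scaling amounts to multiplying each milestone by the same ratio. The sketch below shows that arithmetic on a plain list of epoch milestones; `scale_milestones` is a hypothetical helper, and the rounding rule is an assumption of this example rather than the behavior of scale_pytorch_scheduler().

```python
def scale_milestones(milestones, ratio):
    """Scale schedule milestones (in epochs) by `ratio`, keeping the curve's shape."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return [max(1, round(m * ratio)) for m in milestones]


# A 90-epoch schedule with learning rate drops at epochs 30, 60, and 90.
print(scale_milestones([30, 60, 90], ratio=0.5))  # compressed into 45 epochs
print(scale_milestones([30, 60, 90], ratio=2.0))  # stretched over 180 epochs
```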

Trainer Evaluation

References: tests/trainer/test_trainer_eval.py

The Composer library's Trainer class provides evaluation functionality that lets users monitor and assess the performance of their machine learning models during the training process. This subsection covers the key aspects of Trainer evaluation, including the handling of different eval_interval and max_duration settings, non-divisible datasets, and infinite eval_dataloader objects.

Profiling and Performance Analysis

References: composer/profiler

The Composer library provides profiling functionality that allows users to collect and analyze performance data during the training process. Its core component is the Profiler class, which serves as the main entry point for configuring and managing the profiling process.

Profiler Configuration and Management

References: composer/profiler/profiler.py

The Profiler class is the main entry point for profiling in the Composer library. It is responsible for configuring and managing the profiling process during the training of machine learning models.

Trace Handlers

References: composer/profiler/trace_handler.py

The TraceHandler class is an abstract base class that defines the interface for handling profiler trace events in the Composer library. It provides methods for processing different types of trace events, such as duration events, instant events, and counter events. Subclasses of TraceHandler must implement these methods to handle the recording and saving of profiling data in different formats.

Markers and Annotations

References: composer/profiler/marker.py

The Marker class in the …/marker.py file is a key component of the Composer library's profiling system. It provides a flexible and extensible way to measure and record various events during the training process.

System Profiling

References: composer/profiler/system_profiler.py

The SystemProfiler class in the …/system_profiler.py file is responsible for collecting host-level metrics during the training process. This class is a subclass of the Callback class from the …/core.py module, which allows it to be integrated into the Composer training pipeline.

PyTorch Profiling

References: composer/profiler/torch_profiler.py

The TorchProfiler class, defined in …/torch_profiler.py, is responsible for collecting PyTorch-specific profiling data during the training process. This includes information about the execution order, latency, and attributes of PyTorch operators and GPU kernels.

Profiling Utilities

References: composer/profiler/utils.py

The composer/profiler/utils.py file provides utility functions for working with the PyTorch profiler, particularly for generating a visual report of the memory usage timeline during a profiling session.

Datasets and Models

References: composer/datasets, composer/models

The Composer library provides a comprehensive set of dataset and model-related components that enable flexible and efficient machine learning workflows. The key components in this area are the InContextLearningDataset and ComposerModel classes.

InContextLearningDataset and Subclasses

References: composer/datasets

The InContextLearningDataset class and its subclasses provide the core functionality for working with various in-context learning datasets in the Composer library. These datasets are designed to support a wide range of task types, including question-answering, language modeling, multiple choice, schema-based, and code evaluation tasks.

HuggingFaceModel

References: composer/models/huggingface.py

The HuggingFaceModel class is a wrapper around Hugging Face Transformer models, allowing them to be used within the Composer framework. This class handles the necessary adaptations and provides additional functionality for training and evaluation.

ComposerModel and Initializer

References: composer/models/base.py, composer/models/initializers.py

The ComposerModel class is the abstract base class for all models used in the Composer framework. It provides the necessary methods and attributes for a model to be used with the composer.Trainer class.

ComposerClassifier

References: composer/models/tasks/classification.py

The ComposerClassifier class is a subclass of ComposerModel that provides a convenient way to create a ComposerModel for classification tasks from a vanilla PyTorch model.

Utilities and Helpers

References: composer/utils

The Composer library provides a comprehensive set of utility functions and helper classes that cover a wide range of functionality, including file and object store management, inference and model export, and miscellaneous functionality.

Checkpoint Management

References: composer/utils/checkpoint.py

The Composer library provides comprehensive utilities for saving and loading training checkpoints, both in a monolithic format and in a sharded format for use with Fully Sharded Data Parallelism (FSDP).

Environment Reporting

References: composer/utils/collect_env.py

The collect_env.py file in the …/ directory provides a set of utility functions to gather system information for debugging and bug reporting purposes. This information includes the Composer commit hash, host processor details, and accelerator information.

File and Object Store Management

References: composer/utils/file_helpers.py

The file_helpers.py module in the Composer library provides a set of utility functions for working with files, both local and remote. The module supports various cloud storage backends, including S3, GCS, OCI, and Azure Blob Storage, as well as local file systems.
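
A common first step when supporting both local and remote files is to split a URI into its backend, bucket, and object path. The sketch below does this with the standard library; `split_uri` is a hypothetical helper written for this example, not a function quoted from file_helpers.py, and the bucket names are made up.

```python
from urllib.parse import urlparse


def split_uri(uri):
    """Split 's3://bucket/path/to/file' into (backend, bucket, path)."""
    parsed = urlparse(uri)
    backend, bucket, path = parsed.scheme, parsed.netloc, parsed.path
    if backend == "" and bucket == "":
        return backend, bucket, path  # plain local path: leave it untouched
    return backend, bucket, path.lstrip("/")


print(split_uri("s3://my-bucket/checkpoints/ep1.pt"))
print(split_uri("/tmp/checkpoints/ep1.pt"))
```

An empty backend signals a local path, so callers can branch once on the first element and hand remote URIs to the matching object store client.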

Inference and Model Export

References: composer/utils/inference.py

The composer/utils/inference.py file provides utility functions and classes for exporting PyTorch models to various formats, such as TorchScript or ONNX, with optional optimizations and transformations.

Miscellaneous Helpers

References: composer/utils/misc.py

The misc.py file in the Composer library provides a collection of miscellaneous utility functions and helpers.

Module Surgery

References: composer/utils/module_surgery.py

The key functionality in the file …/module_surgery.py is the replace_module_classes() function, which allows specific module types within a PyTorch module to be replaced with custom implementations.

Reproducibility

References: composer/utils/reproducibility.py

The reproducibility.py file in the Composer library provides a set of utilities to help ensure deterministic training and reproducibility in machine learning models.
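
The basic mechanism is seeding every random number generator before work that should be repeatable. The sketch below shows it with the standard library only; `set_seed` and `sample_batch_order` are hypothetical helpers, and a real training setup would also need to seed NumPy and PyTorch and enable deterministic kernels.

```python
import random


def set_seed(seed):
    """Seed the stdlib RNG (a full version would also seed NumPy and PyTorch)."""
    random.seed(seed)


def sample_batch_order(num_batches):
    """Shuffle batch indices, as a data loader with shuffling might."""
    order = list(range(num_batches))
    random.shuffle(order)
    return order


set_seed(17)
first_run = sample_batch_order(8)
set_seed(17)
second_run = sample_batch_order(8)
print(first_run == second_run)
```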

Retrying

References: composer/utils/retrying.py

The retry decorator provided in …/retrying.py retries a function call with backoff and jitter, which is useful for handling flaky or intermittent failures. The decorator can be applied in two ways.
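
A retry decorator of this kind can be sketched with the standard library. The version below is an illustration under assumptions: the parameter names, the exponential backoff formula, and the support for both a bare `@retry` and a parameterized `@retry(...)` form are choices made for this example, not a copy of the code in …/retrying.py.

```python
import functools
import random
import time


def retry(exc_class=Exception, num_attempts=3, initial_backoff=1.0, max_jitter=0.5):
    """Retry on `exc_class`, sleeping with exponential backoff plus random jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(num_attempts):
                try:
                    return func(*args, **kwargs)
                except exc_class:
                    if attempt == num_attempts - 1:
                        raise  # out of attempts: surface the last error
                    delay = initial_backoff * 2 ** attempt + random.random() * max_jitter
                    time.sleep(delay)

        return wrapper

    # Bare form (@retry): the decorated function arrives as the first argument.
    if callable(exc_class) and not isinstance(exc_class, type):
        func, exc_class = exc_class, Exception
        return decorator(func)
    return decorator


attempts = []


@retry(ConnectionError, num_attempts=3, initial_backoff=0.0, max_jitter=0.0)
def flaky_download():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "done"


@retry
def always_works():
    return "ok"


result = flaky_download()
print(result, len(attempts))
```

The jitter spreads out retries from many workers so they do not all hit a recovering service at the same moment.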

String Enum

References: composer/utils/string_enum.py

The StringEnum class is a base class that extends the built-in Enum class to provide a foundation for creating string-based enumerations. This class enforces consistent naming conventions and enables case-insensitive matching, making it a useful tool for managing string-based configuration options and other string-based values in a robust and maintainable way.
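
Case-insensitive matching of this kind can be built on the standard `Enum._missing_` hook, which Python calls when a value lookup fails. The sketch below is a simplified stand-in; `CaseInsensitiveEnum` and `Mode` are hypothetical names, and it leaves out the naming-convention checks that the text mentions.

```python
from enum import Enum


class CaseInsensitiveEnum(Enum):
    """Enum whose members can be looked up by value regardless of letter case."""

    @classmethod
    def _missing_(cls, value):
        # Called only when the exact value was not found.
        if isinstance(value, str):
            for member in cls:
                if member.value == value.lower():
                    return member
        return None  # lets Enum raise its usual ValueError


class Mode(CaseInsensitiveEnum):
    TRAIN = "train"
    EVAL = "eval"


print(Mode("TRAIN"), Mode("Eval"))
```

This lets configuration files spell an option as "train", "Train", or "TRAIN" while the code still works with a single enum member.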

Warnings

References: composer/utils/warnings.py

The VersionedDeprecationWarning class is a custom deprecation warning class that extends the built-in DeprecationWarning class. This class is used to provide more informative deprecation warnings to users, including the version in which a deprecated feature will be removed.
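
A warning class like this can be sketched in a few lines. The constructor signature and message format below are assumptions for illustration, and `old_api` and the version number "0.99.0" are made up for the example.

```python
import warnings


class VersionedDeprecationWarning(DeprecationWarning):
    """Deprecation warning that also names the release in which removal is planned."""

    def __init__(self, message, remove_version):
        super().__init__(f"{message} It will be removed in version {remove_version}.")
        self.remove_version = remove_version


def old_api():
    """Hypothetical deprecated function."""
    warnings.warn(VersionedDeprecationWarning("old_api() is deprecated.", "0.99.0"), stacklevel=2)
    return 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    old_api()

message = str(caught[0].message)
print(message)
```

Because the class extends DeprecationWarning, existing warning filters keep working, while the extra attribute gives tools a machine-readable removal version.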

Object Store Management

References: composer/utils/object_store

The …/object_store directory provides a set of classes and utilities for interacting with various object storage systems, such as Google Cloud Storage (GCS), Amazon S3, Oracle Cloud Infrastructure (OCI) Object Storage, and more.

Evaluation Client

References: composer/utils/eval_client

The …/eval_client directory provides utility classes and functions for evaluating code in various environments, including local machines, AWS Lambda functions, and the MosaicML platform.
