Auto-generated from fastai/fastai by Mutable.ai Auto Wiki. Apache License 2.0.
The fastai repository provides a high-level deep learning library built on top of PyTorch. It aims to make deep learning more accessible and productive through its design, abstractions, and features.
Some of the key aspects of fastai include:
- Domain-specific libraries for computer vision, natural language processing, tabular data, and collaborative filtering that handle common tasks like loading data and defining models (Computer Vision, Natural Language Processing, Tabular Data)
- A flexible callback system that allows injecting arbitrary code during model training, e.g. learning rate scheduling, mixed precision, and regularization (Callbacks)
- A training loop abstraction that handles optimization, losses, metrics, and other training mechanics in a consistent way (Model Training)
- Utilities for loading, splitting, labeling, encoding, normalizing, and transforming various types of data (Data Loading and Preprocessing)
- Distributed training functionality to train models across multiple GPUs/machines (Distributed Training)
- Mixed precision training using float16 to accelerate training on GPUs (Mixed Precision)
- Tools for model interpretation, analysis, and debugging, such as visualization and identifying top losses (Interpretability)
The key design choices are composing domain-specific libraries on top of a flexible core, providing both high-level abstractions while also allowing detailed customization via the callback system. The libraries build on PyTorch and leverage its capabilities.
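The callback-driven design described above can be sketched in a few lines of plain Python. This is a hypothetical minimal illustration of the pattern, not fastai's actual API; the class and method names here are invented for the example.

```python
# Hypothetical sketch of a callback-driven training loop, loosely modeled on
# fastai's design. Names (Callback, LRLogger, train) are illustrative only.

class Callback:
    def before_epoch(self, state): pass
    def after_batch(self, state): pass
    def after_epoch(self, state): pass

class LRLogger(Callback):
    """Records the learning rate seen at every batch."""
    def __init__(self): self.history = []
    def after_batch(self, state): self.history.append(state["lr"])

def train(n_epochs, batches_per_epoch, callbacks, base_lr=0.1):
    state = {"lr": base_lr}
    for epoch in range(n_epochs):
        for cb in callbacks: cb.before_epoch(state)
        for _ in range(batches_per_epoch):
            # ... forward pass, loss, backward pass, optimizer step go here ...
            for cb in callbacks: cb.after_batch(state)
        state["lr"] *= 0.5  # toy schedule: halve the LR each epoch
        for cb in callbacks: cb.after_epoch(state)
    return state

logger = LRLogger()
final = train(n_epochs=2, batches_per_epoch=3, callbacks=[logger])
print(len(logger.history))  # 6 batches observed
print(final["lr"])          # 0.025
```

The key design point is that the loop itself never changes; behaviors like scheduling or logging are added purely by registering callbacks.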
The core functionality provided in the fastai library for computer vision allows training common CNN models on image data with a simplified and optimized training process. The
…/vision package handles this end-to-end, from loading and preprocessing image datasets, through model definition, to a high-level training loop.
At the lowest level, image data is prepared for training by loading files from various sources and applying common transformations.
Common CNN architectures such as XResNet are implemented in
models/xresnet.py. These compose layers to define the overall model structure.
Training is further optimized using techniques like mixed precision, while callbacks apply the augmentations defined in
augment.py to images during training.
Example notebooks in
…/examples demonstrate end-to-end usage of this functionality for tasks like classification and segmentation.
The main classes for loading and preprocessing image data are defined in
…/data.py. The ImageDataLoaders class provides methods for loading image data into PyTorch DataLoaders from various sources like folders, lists, and DataFrames. It handles details like preprocessing, normalization, and splitting data into training, validation, and test sets.
Tensor subclasses represent different types of vision data, such as images and masks, and handle converting data to tensors and applying normalization.
Images are preprocessed before loading using functionality in
…/crappify.py. This loads images, resizes them using PIL while preserving the minimum dimension, overlays random text at a random position and brightness also using PIL, and saves the processed image to disk with a random quality setting.
…/models directory contains implementations of common computer vision models through well-defined classes.
…/unet.py file defines several functions and classes related to implementing U-Net models using convolution and normalization layers as building blocks.
…/all.py provides a single entry point for these models by importing functionality from other files.
…/__init__.py further exposes this functionality under one namespace without specifying submodules.
…/augment.py file contains implementations of common image augmentation techniques that can be randomly applied during training. Core functionality is provided for randomly applying transforms. Classes inherit functionality for applying transforms to images.
Transforms can be randomly applied according to their probability. Multiple transforms are efficiently combined by storing lists of functions. This allows transforms to be intelligently composed.
Images can be modified in color spaces, providing a way to adjust properties like brightness and contrast.
Cropping and padding logic handles these operations for images and other data types. It supports padding images and has options to control cropping.
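The probability-gated transform pattern described above can be sketched in plain Python. This is an illustrative sketch in the spirit of fastai's random transforms; the class and function names here are hypothetical, not fastai's API.

```python
import random

# Illustrative sketch of probability-gated transforms composed into a pipeline.
# RandTransform and compose are invented names for this example.

class RandTransform:
    def __init__(self, func, p=0.5):
        self.func, self.p = func, p
    def __call__(self, x):
        # Apply the wrapped transform only with probability p.
        return self.func(x) if random.random() < self.p else x

def compose(transforms):
    """Combine a list of transforms into one pipeline applied in order."""
    def pipeline(x):
        for t in transforms:
            x = t(x)
        return x
    return pipeline

brighten = RandTransform(lambda px: min(px + 40, 255), p=1.0)  # always applied
invert   = RandTransform(lambda px: 255 - px, p=0.0)           # never applied
pipe = compose([brighten, invert])
print(pipe(200))  # 240: brightened, not inverted
```

Because each transform carries its own probability, composing them stays cheap: the pipeline is just a list of callables, and each one decides independently whether to fire on a given sample.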
…/text directory provides utilities for common natural language processing tasks like text classification and language modeling. It contains functionality for preprocessing text data, creating data loaders and defining neural network models for NLP problems.
…/core.py module implements various text preprocessing utilities.
…/data.py module provides functionality for creating data loaders from text datasets. A learner module handles training models on text data, and an all module re-exports this functionality under a single namespace.
…/awdlstm.py contains a class which defines the core AWD-LSTM architecture. It contains an embedding layer to process input tokens, followed by multiple LSTM layers to learn contextual representations of sequences.
The file also contains default hyperparameters and layer sizes for the language modeling and text classification splits of AWD-LSTM.
…/core.py file contains classes and functions for constructing complete NLP models from encoder modules.
The training loop calculates losses using the model's predictions on batches of inputs. It applies the specified optimizer to minimize these losses over epochs. Callbacks can be added to customize training. For example, one callback implements a one-cycle learning rate schedule.
The fastai library provides a set of tools for building, training, and evaluating machine learning models on structured tabular data. The core functionality is centered around preprocessing tabular data stored in Pandas DataFrames, defining common tabular model architectures, and abstracting the training loop into a learner class tailored for tabular tasks.
…/core.py file contains utilities for preprocessing tabular data.
Several preprocessing utilities are implemented, including wrappers for DataFrames and helpers for reducing memory usage. The
…/data.py file handles loading data from sources and creating data loaders after preprocessing. It can create test loaders from additional data and apply the same preprocessing.
Common tabular model architectures like linear regression, logistic regression, and neural networks are also defined in the package. The
…/learner.py file combines a tabular model, dataset, and callbacks into a single object used to fit models on tabular data. It constructs data loaders from input data and passes batches to the model during optimization.
Training runs optimization for a number of epochs, iterating through the data loader and calculating losses at each step.
The fastai library provides extensive functionality for loading, preprocessing, and transforming various types of data for deep learning tasks. This functionality is implemented across several key modules and files in the library.
…/data module contains core data functionality. The
…/load.py file implements loading data from sources into PyTorch datasets and dataloaders. It supports batching, shuffling, and distributing work. The
…/external.py file contains utilities for downloading external datasets from URLs in a consistent manner.
…/transforms.py file implements preprocessing tasks like loading files from disk or dataframes, splitting datasets, labeling samples, mapping categorical variables, preparing regression targets, converting data types, and normalizing batches of images.
…/block.py file contains classes and functions for building reusable data pipelines by combining preprocessing transforms. Subclasses provide defaults for specific data types.
For vision data, the
…/data.py file handles loading image data: subclasses convert specific types to tensors, and helper functions preprocess batches.
The core functionality for loading data from various sources into PyTorch datasets and dataloaders is handled by code in the
…/load.py file. This file contains implementations of the main objects used for loading data.
For small datasets that don't require true batching or shuffling, the code loads data directly instead of going through the PyTorch dataloader.
Functions help assemble samples into batches and convert types when loading data. Errors are caught and raised clearly to help with debugging data loading issues.
External datasets can be loaded through utilities in the
…/external.py file. This file defines utilities for downloading external datasets. It also contains functions for retrieving configuration settings and for downloading and extracting files.
…/load.py file contains functionality for splitting data into training and validation sets when loading data. It supports passing a validation split ratio.
When a dataset is created, it can specify a validation split internally by dividing the data indices into train and validation subsets. The indices for each subset are stored in the dataset object. When iterating over the dataset, it subsets the appropriate indices based on whether it is in the train or validation phase. This allows easily loading different subsets during training and evaluation.
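The index-based train/validation split described above can be sketched as a small helper. This is a hedged illustration of the idea, not fastai's actual splitter API; the function name and defaults are assumptions.

```python
import random

# Illustrative sketch of splitting dataset indices into train and validation
# subsets. split_indices is an invented name for this example.

def split_indices(n_items, valid_pct=0.2, seed=42):
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)       # deterministic shuffle
    n_valid = int(n_items * valid_pct)
    return idxs[n_valid:], idxs[:n_valid]   # (train_idxs, valid_idxs)

train_idxs, valid_idxs = split_indices(10, valid_pct=0.2)
print(len(train_idxs), len(valid_idxs))     # 8 2
print(sorted(train_idxs + valid_idxs))      # every index appears exactly once
```

Storing the two index lists on the dataset is what lets the same underlying data serve both phases: iteration simply subsets by the appropriate index list.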
…/load.py file contains functionality for labeling and encoding targets as part of the data loading process. Errors during the labeling process are caught and informative errors are raised to help with debugging. The labeling functionality provides a consistent interface that works across different types of data and tasks.
…/transforms.py file contains utilities for loading, splitting, and transforming datasets.
…/augment.py file contains implementations of common image augmentation techniques.
…/block.py contains classes and functions for building data pipelines from a data source.
Classes provide defaults tailored for specific data types.
Transforms can be combined while avoiding duplicates.
The pipeline processes each sample by applying the full set of transforms. This provides a reusable way to preprocess data into pytorch datasets.
The main functionality for downloading external datasets is handled in
…/external.py. This file contains a constants class that centralizes dataset URLs. It also contains a function for retrieving configuration settings and constructing download paths.
The key component for downloading files is a function that handles downloading files from URLs and extracting compressed archives. Under the hood, it delegates the actual downloading. Some downloads are cached on disk for performance.
…/download_checks.py stores expected sizes and checksums used to verify downloads.
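Verifying a download against an expected size and checksum can be sketched as follows. This is a hedged illustration of how such checks typically work; the function name and check format are assumptions, not fastai's exact implementation.

```python
import hashlib

# Illustrative sketch of verifying downloaded bytes against an expected size
# and SHA-256 digest. verify_download is an invented name for this example.

def verify_download(data: bytes, expected_size: int, expected_sha256: str) -> bool:
    if len(data) != expected_size:        # cheap size check first
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256

payload = b"hello dataset"
digest = hashlib.sha256(payload).hexdigest()
print(verify_download(payload, len(payload), digest))         # True
print(verify_download(payload + b"!", len(payload), digest))  # False: size mismatch
```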
…/learner.py file provides high-level utilities for training PyTorch models. The core class is
Learner, which combines a model, data loaders (
…/load.py), loss function, and callbacks into a single object. Its main methods orchestrate the overall training loop by calling callbacks at appropriate points.
Learner handles running the full training loop and invokes callback methods at each stage. It fits models using the
fit() method, while convenience wrappers such as fit_one_cycle() add a one-cycle learning rate schedule, and separate helpers distribute training across multiple GPUs/machines. The
Metric base class defines the interface for metrics computed during training. The
AccumMetric class inherits from this and implements averaging the metric over batches to account for varying batch sizes.
The _BaseOptimizer class defines common functionality shared by the optimizer classes. The Optimizer class extends _BaseOptimizer and serves as the base class for implementing custom optimizers: callback functions define optimization steps, and common optimizers are built by composing these callbacks. The OptimWrapper class interfaces fastai training with PyTorch optimizers.
The fit() method orchestrates the overall training loop, handling optimization, losses, metrics, and other training mechanics by coordinating the callbacks and the model. The Optimizer and callback classes implement various optimization algorithms and training enhancements, giving practitioners a high-level yet customizable interface for efficient model training.
The training loop handles the overall flow of training. It contains the model, data loaders, loss function, optimizer, and callbacks. During training, it orchestrates the process by calling callbacks at each step.
Metrics monitor performance during training.
Callbacks accumulate metrics over batches.
Callbacks provide hooks for preprocessing before training.
At each step, predictions, loss, gradients, and weights are computed.
After each batch, metrics and losses are calculated and accumulated.
Validation is run at the end of each epoch to calculate final metrics/losses.
Callbacks run at different points to inject additional logic.
Callbacks allow injecting custom logic into the training loop at different points. Key callbacks customize training by running code at the start and end of epochs.
…/core.py file defines several important callback classes including the base callback class, which provides the core callback API.
Callbacks in other files allow tasks like accumulating gradients over batches, gradient clipping, freezing batch norm stats, and early stopping.
Callbacks in
…/schedule.py implement learning rate and hyperparameter scheduling.
Callbacks in
…/tracker.py extend a base class to track metrics over epochs. This allows automatically adjusting hyperparameters or saving the best model based on the monitored metric.
…/optimizer.py file provides implementations of common optimizers for updating model weights during training. Optimizers are implemented by composing callback functions that define the optimization steps.
Common optimizers are implemented by composing the necessary callback functions in an Optimizer object. The callbacks implement the specific update logic, while the base class handles common logic like parameter grouping.
OptimWrapper interfaces fastai training with PyTorch optimizers. It takes a PyTorch optimizer and exposes it through the fastai API, so PyTorch optimizers can be used seamlessly with fastai training loops.
The _BaseOptimizer class defines common functionality shared between Optimizer and OptimWrapper. Optimizer extends _BaseOptimizer and serves as the base class for implementing custom optimizers, separating the step logic from the optimizer class itself.
The Lookahead class implements lookahead optimization, which has been shown to improve model training. It works by making optimization steps on a "fast" set of weights and then updating the "slow" set of weights (which are exposed to the model) based on the "fast" weights.
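The slow/fast update rule described above can be written out in a few lines. This is a hedged sketch of the lookahead update itself, using plain Python floats in place of tensors; the function name is invented for the example.

```python
# Illustrative sketch of the lookahead update rule: after k ordinary steps on
# the "fast" weights, the "slow" weights move toward them by a factor alpha,
# and the fast weights are reset to the new slow values.

def lookahead_step(slow, fast, alpha=0.5):
    """Pull slow weights toward fast weights, then reset fast to slow."""
    new_slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return new_slow, list(new_slow)

slow = [0.0, 0.0]
fast = [1.0, 2.0]   # pretend k inner optimizer steps produced these
slow, fast = lookahead_step(slow, fast, alpha=0.5)
print(slow)          # [0.5, 1.0]
```

The effect is a smoothed trajectory: the exposed (slow) weights only move a fraction of the way toward wherever the inner optimizer wandered.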
…/schedule.py file implements various learning rate schedules for fast and effective model training. Learning rate schedules dynamically adjust the learning rate during training to improve optimization.
Some key learning rate schedules implemented include:
One-cycle policy: This implements a learning rate schedule that rapidly increases then decreases the LR over the course of training. It first reaches a maximum value then decreases, resembling one cycle. This has been shown to train models much faster than static learning rates.
Cosine annealing: Gradually reduces the LR over training by following a cosine curve from its initial value down to a minimum.
Polynomial decay: Reduces the LR according to a polynomial decay function, which slowly decreases LR as a power function of the training step.
These schedules are implemented via callbacks that modify the learning rate during different phases of training.
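The schedule shapes described above can be sketched as pure functions of training progress. This is a hedged illustration only: fastai's real schedulers are callback-based (and its one-cycle policy uses cosine interpolation in both phases), whereas this sketch uses a linear warmup; all names and default values here are assumptions.

```python
import math

# Illustrative schedule shapes as functions of progress pct in [0, 1].
# one_cycle and cosine are invented names for this example.

def cosine(pct, start, end):
    """Cosine annealing from start down to end."""
    return end + (start - end) * (1 + math.cos(math.pi * pct)) / 2

def one_cycle(pct, lr_max=1.0, div_start=25.0, div_end=1e4, pct_warmup=0.3):
    """Rise from lr_max/div_start to lr_max, then anneal to lr_max/div_end."""
    if pct < pct_warmup:                       # warmup phase (linear here)
        t = pct / pct_warmup
        return lr_max / div_start + t * (lr_max - lr_max / div_start)
    t = (pct - pct_warmup) / (1 - pct_warmup)  # annealing phase
    return cosine(t, lr_max, lr_max / div_end)

print(round(one_cycle(0.0), 3))   # 0.04   (lr_max / 25 at the start)
print(round(one_cycle(0.3), 3))   # 1.0    (peak at end of warmup)
print(round(one_cycle(1.0), 6))   # 0.0001 (lr_max / 1e4 at the end)
```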
…/losses.py file provides commonly used loss functions for training deep learning models on different types of tasks.
Some key classes implemented in this file are:
- Focal loss: Down-weights well-classified examples in cross entropy so training focuses on hard examples.
- Label smoothing cross entropy flat: Smoothes one-hot labels for regularization.
- Dice loss: Computes Dice coefficient as a loss for segmentation tasks.
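The Dice loss idea from the last bullet can be shown in plain Python. This is a hedged sketch for binary segmentation, using lists of 0/1 in place of the probability tensors a real implementation would operate on; the function name is invented.

```python
# Illustrative Dice loss: loss = 1 - 2*|pred ∩ target| / (|pred| + |target|).
# Lists of 0/1 stand in for prediction/target tensors.

def dice_loss(pred, target, eps=1e-8):
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target)
    return 1 - (2 * inter + eps) / (union + eps)  # eps avoids division by zero

perfect = dice_loss([1, 0, 1, 1], [1, 0, 1, 1])
half    = dice_loss([1, 1, 0, 0], [1, 0, 1, 0])
print(round(perfect, 6))  # 0.0: perfect overlap
print(round(half, 2))     # 0.5: half the positives overlap
```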
…/metrics.py file provides implementations of many common machine learning metrics for monitoring model training. It contains both individual metric functions and subclasses that accumulate metrics over batches.
The AccumMetric class allows accumulating predictions and targets over batches, then calculates the final metric value at the end. This is more efficient than calculating the metric on each batch. It supports preprocessing like activation functions and argmax.
Common classification metrics directly calculate the metric on each batch. But subclasses accumulate predictions and targets over batches for better performance on larger datasets. This includes metrics for multi-label classification and segmentation.
Some subclasses accumulate values for:
- Classification accuracy
- The F-beta score
- Mean squared error in regression
The AccumMetric class takes care of accumulating predictions, targets, and intermediate values over batches. When the final metric value is requested, it performs the final calculation, giving correct results even when batch sizes vary.
Metrics can have parameters like the classification threshold tuned, and they support activation functions and argmax for converting predictions to classes.
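Why accumulation matters can be shown with a tiny example. This is a hedged sketch of the idea behind AccumMetric, not its actual API; the class name below is invented.

```python
# Illustrative metric accumulation: totals are kept across batches so the
# final value is exact even when batch sizes differ (averaging per-batch
# accuracies would be biased toward small batches).

class AccumAccuracy:
    def __init__(self):
        self.correct, self.total = 0, 0
    def accumulate(self, preds, targets):
        self.correct += sum(p == t for p, t in zip(preds, targets))
        self.total += len(targets)
    @property
    def value(self):
        return self.correct / self.total

m = AccumAccuracy()
m.accumulate([0, 1, 1], [0, 1, 0])  # batch of 3: 2 correct
m.accumulate([1], [1])              # batch of 1: 1 correct
print(m.value)                       # 0.75, not the mean of 0.667 and 1.0
```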
This section covers helper functions provided in
…/torch_core.py that simplify training PyTorch models. Key functionality includes:
Functions help initialize model parameters in a consistent way.
Utilities integrate distributed training functionality.
Functions in
…/torch_core.py help load and save tensors to disk.
…/distributed.py file provides functionality for distributing model training across multiple GPUs or machines. It handles wrapping models and data for parallel computation.
Processes are initialized on each device and losses/metrics are gathered to update models. Gradients are also synchronized across devices during the backward pass.
Mixed precision training with float16 can accelerate training on GPUs by performing operations with lower precision numbers while still tracking the model parameters in float32 for better accuracy. This allows utilizing the GPU's tensor cores which provide a significant speedup for float16 operations.
…/fp16_utils.py file contains utilities for working with half precision (FP16) in PyTorch models. It provides functionality to convert a tensor to FP16 format. It also provides functionality to convert a model to FP16 in a batchnorm-safe way. This ensures batchnorm layers continue tracking the mean and variance in float32 to avoid accuracy degradation, while all other layers use FP16 operations.
…/fp16_utils.py file also contains functionality for synchronizing the FP16 model weights with the FP32 master weights stored in the optimizer. It provides functionality to retrieve the FP32 master copy of parameters from the optimizer. Functionality moves gradients from the FP16 model to the FP32 master copy after each backward pass. Functionality then synchronizes the FP16 model weights with the master copy.
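The reason a separate FP32 master copy is needed can be demonstrated numerically. This is a hedged simulation that uses Python's half-precision struct format to mimic FP16 rounding; the real code operates on CUDA tensors, and the helper name below is invented.

```python
import struct

# Simulate FP16 rounding by packing/unpacking a half-precision float.
def to_fp16(x: float) -> float:
    return struct.unpack('<e', struct.pack('<e', x))[0]

update = 1e-4   # a small gradient step

# Updating directly in FP16 loses the step: near 1.0, FP16 spacing is ~0.001,
# so 1.0 + 1e-4 rounds straight back to 1.0.
print(to_fp16(to_fp16(1.0) + update) == 1.0)  # True: the update vanishes

# Updating an FP32 (here, double) master copy preserves accumulated progress;
# the FP16 model weights are then re-synchronized from the master copy.
master = 1.0
for _ in range(10):
    master += update
print(round(master, 4))  # 1.001
```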
…/fp16_utils.py file also implements checks for overflow in the FP16 gradients, with functionality to test a tensor for overflow.
The core functionality of callbacks in fastai is to customize model training by injecting logic at different points in the training loop. Callbacks allow injecting code before, after, or during batches, epochs, and entire training runs. This provides a flexible way to implement techniques like learning rate scheduling, regularization, mixed precision training, and distributed training without modifying the core training loop code.
…/__init__.py file defines the base callback functionality. Subclasses can override callback methods.
…/core.py file defines important callback functionality.
…/schedule.py file contains classes and functions for implementing various learning rate schedules during training.
…/tracker.py contains callbacks to track metrics over multiple batches/epochs. It contains callbacks that save the best model based on a tracked metric, reduce the learning rate if no improvement occurs, and terminate training if the loss becomes invalid.
…/mixup.py file implements callbacks for data augmentation techniques during model training.
It samples lambda values and mixes examples and their targets accordingly, which helps neural networks generalize.
…/schedule.py file implements callbacks that customize optimization during training by adjusting hyperparameters like the learning rate.
It contains a basic scheduler class that takes an annealing function and partially applies it to define a scheduling function between two values.
Functions like fit_one_cycle() and fit_sgdr() combine these scheduling functions to fit the Learner directly with common schedules.
…/tracker.py file contains a base class that serves as a base for callbacks that monitor a metric over epochs. It stores the best metric seen and compares after each epoch.
A callback reduces the learning rate if the monitored metric does not improve for a number of epochs. It divides the learning rate by a factor but does not reduce below a minimum value.
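The reduce-on-plateau logic just described can be sketched as follows. This is a hedged illustration of the rule, not fastai's actual class; the name, defaults, and state handling are assumptions.

```python
# Illustrative reduce-on-plateau rule: divide the LR by a factor when the
# monitored metric fails to improve for more than `patience` epochs, never
# going below min_lr. ReduceOnPlateau is an invented name for this example.

class ReduceOnPlateau:
    def __init__(self, lr, factor=10.0, patience=2, min_lr=1e-6):
        self.lr, self.factor = lr, factor
        self.patience, self.min_lr = patience, min_lr
        self.best, self.wait = float('inf'), 0
    def step(self, metric):
        if metric < self.best:
            self.best, self.wait = metric, 0      # new best: reset counter
        else:
            self.wait += 1
            if self.wait > self.patience:         # plateau: reduce the LR
                self.lr = max(self.lr / self.factor, self.min_lr)
                self.wait = 0
        return self.lr

sched = ReduceOnPlateau(lr=0.1, patience=2)
for loss in [1.0, 0.9, 0.95, 0.92, 0.91]:  # no new best after epoch 2
    lr = sched.step(loss)
print(round(lr, 6))  # 0.01: reduced once after 3 epochs without improvement
```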
The main regularization callback implemented in fastai applies weight decay during training. This callback calculates the L2 norm of each parameter after the backward pass. The L2 norm is accumulated as a regularization loss term, which gets optimized along with the main training loss function. This helps prevent overfitting by discouraging reliance on a few strong weights.
Configuring regularization only requires setting the strength of regularization. A higher value results in stronger regularization pressure.
By overriding a single method, regularization integrates seamlessly with the existing training loop in fastai. This makes regularizing models during training very simple.
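The L2 penalty described above reduces to a one-line computation. This is a hedged sketch of the regularization term itself, with plain floats standing in for parameter tensors; the function name is invented.

```python
# Illustrative L2 regularization: the sum of squared weights, scaled by a
# strength wd, is added to the task loss so large weights are discouraged.

def l2_penalty(params, wd):
    return wd * sum(w * w for w in params)

params = [0.5, -2.0, 1.0]
task_loss = 0.8
total = task_loss + l2_penalty(params, wd=0.01)
print(round(l2_penalty(params, 0.01), 4))  # 0.0525: the -2.0 weight dominates
print(round(total, 4))                     # 0.8525
```

Raising wd increases the penalty proportionally, which is why configuring regularization strength is a single-number choice.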
…/progress.py file contains callbacks that handle logging training progress and metrics to files. These callbacks provide a consistent interface for monitoring and recording a model's training progress.
The core callbacks in this file are used for logging training progress, metrics, losses and other statistics to files as models train. The
…/progress.py contains several callbacks that serve different logging purposes:
One callback logs metrics like loss and accuracy to a CSV file at every iteration, allowing close monitoring of how the metrics change throughout training.
…/progress.py also contains callbacks for displaying training progress via console printouts or progress bars, helping users visualize training progress.
Additional callbacks in
…/progress.py handle logging hyperparameters and other debug information to help reproduce models after training.
These callbacks provide a unified interface for reporting training progress, logging results to files, and analyzing how metrics change over the course of training. The callbacks in this file are crucial for monitoring, analyzing, and debugging the training process.
This section covers callbacks that can be used during model training for interpretation, analysis, and debugging purposes. The
…/hook.py file contains utilities that allow inspecting and analyzing models.
Together, these utilities in
…/hook.py allow debugging models by inspecting activations, checking model sizes and shapes. They provide critical functionality for model analysis callbacks during training.
…/distributed.py file contains utilities for distributing training across multiple GPUs or machines. It provides functionality for wrapping models and handling distributed training.
Functions in the file set up worker processes on each device, and context managers adapt a learner object for parallel and distributed training by initializing distributed wrappers around the model. In summary, the file distributes training across devices using these wrappers and context managers.
…/fp16.py file contains callbacks that enable mixed precision training with float16. During mixed precision training, the forward and backward passes run in float16 to save memory and speed up computation, while a float32 master copy of the parameters is kept for numerical stability.
Utilities for getting FP16 and FP32 copies of the model parameters include maintaining separate FP16 and FP32 copies of the model parameters. Functions for moving gradients between the FP16 and FP32 copies during the backward and optimization passes are also provided.
Gradient overflow during backpropagation is also checked. Patch functions are provided to easily add mixed precision functionality to a learner. When applied as a callback, it converts the model to FP16 format and handles the FP16/FP32 synchronization during training.
…/imaging.py file contains specialized callbacks for preprocessing medical imaging data during model training. DICOM metadata and pixel data are represented and handled separately.
Pixel data is loaded from DICOM files. Functions find files and apply preprocessing like normalization. An interface provides loading data into PyTorch loaders.
Preprocessing such as normalization can also be applied during training through callbacks.
The fastai library provides tools for interpreting trained models.
The main interpretation class obtains predictions and losses from models and stores them, together with the model and data loader, in an object that provides an interface for interpreting models.
A driver method runs the interpretation process: it encodes inputs, computes an attribution map, and visualizes the results.
Additional interpretation is implemented in
…/captum.py. The class leverages Captum to compute attribution maps and visualize them.
Utilities in
…/hook.py allow inspecting activations.
The interpretation object exposes predictions and losses, enabling analysis of model behavior.
…/hook.py file contains utilities for analyzing models during and after training. It provides functions for inspecting models by passing dummy data through them.
This section covers identifying and debugging errors in models. The
…/hook.py file contains utilities for inspecting models.
…/interpret.py file provides classes and functions for model interpretation.
Key functionality includes:
- Evaluating models on dummy data
- Finding examples with large losses
This allows debugging models by inspecting activations, visualizing predictions on problematic examples, and identifying inputs that result in large losses or errors.
…/hook.py file contains utilities for visualizing models.
…/interpret.py file contains classes and functions for visualizing model predictions and analyzing model behavior.
Together these utilities provide programmers with tools to inspect model activations, analyze model architecture and capacity, and visualize predictions through a clean interface.
…/metrics.py file contains implementations of many common machine learning metrics. It provides individual metric functions as well as the class for accumulating metrics over batches during training.
The AccumMetric class allows accumulating predictions and targets over batches, then calculates the final metric value at the end. This is more performant than calculating the metric on each batch individually, and the class supports preprocessing predictions and targets through transforms.
Classification metrics directly calculate the metric on each batch. Subclasses accumulate predictions and targets over batches for better performance on larger datasets.
The file also contains many scikit-learn metrics wrapped for fastai's framework, supporting both single-label and multi-label classification. Metric parameters such as thresholds can be tuned, and specialized metrics are provided for semantic segmentation tasks.
…/losses.py file provides loss functions that can be used for model analysis during and after training. It contains common losses that are useful for training deep learning models. Additionally, it implements losses designed for semantic segmentation tasks.
Some losses apply parameters that down-weight easy examples, and others one-hot encode targets before computing the loss function.
All losses in this file provide a flattened, easy-to-use interface on top of PyTorch losses. They can be used both for training models as well as analyzing trained models. Losses in particular are useful for specialized tasks like segmentation.
…/medical directory contains functionality for medical imaging and text data. It provides utilities for loading, preprocessing, and analyzing medical image and text data.
…/imaging.py file contains interfaces and functions for medical images. Functions implement common preprocessing techniques.
For medical text, the
…/text.py file implements main functionality. It contains classes and functions for clinical notes. Preprocessing functions implement cleaning of raw medical notes.