
faceswap

Auto-generated from deepfakes/faceswap by Mutable.ai Auto Wiki

GitHub Repository
Developer: deepfakes
Written in: Python
Stars: 48k
Watchers: 1.5k
Created: 2017-12-19
Last updated: 2023-12-28
License: GNU General Public License v3.0
Homepage: www.faceswap.dev
Repository: deepfakes/faceswap

Auto Wiki
Generated at: 2023-12-28
Generated from: Commit a62a85
Version: 0.0.4

Faceswap is an open source tool for swapping faces in images and videos using deep learning. It provides a full pipeline for extracting faces from input media, training neural network models to generate face swaps, and running inference to convert new media.

The key components of Faceswap include:

  • Extraction pipeline plugins in …/extract that handle detecting, aligning, and masking faces using computer vision techniques. The Extractor class orchestrates the pipeline.

  • Model architectures in …/model like autoencoders and GANs that learn to generate fake face swaps. The ModelBase class provides a common interface.

  • Training workflows in …/trainer that implement training loops for models using Keras and TensorFlow. The TrainerBase class defines the core training API.

  • Conversion pipeline plugins in …/convert that handle blending predicted faces into target frames and videos after inference. This includes mask blending, color/scaling adjustment, and output writing plugins.

  • Tools in tools like the manual annotator, alignments editor, preview tool, and model loader that provide reusable functionality.

  • The GUI in …/gui implemented with Tkinter that enables an intuitive workflow for users.

  • The command line interface in …/cli powered by argparse for automation.

  • Shared library code in lib for alignments, models, training, and system information gathering.

  • Automated installation scripts in .install for setting up dependencies.

  • Documentation generated from docstrings using Sphinx in docs.

  • Unit and integration tests in tests using Pytest.

The key design choices are the plugin model, which enables swapping components in and out; inheritance from base classes, which promotes code reuse; and separation of concerns between the pipeline stages. Together these allow rapid experimentation and extension of Faceswap's deepfake capabilities.

Extraction Pipeline

References: plugins/extract, lib/align

The extraction pipeline handles extracting faces from input media using detection, alignment, masking, and recognition plugins. It is implemented primarily in the …/pipeline.py file.

The core Extractor class defined in this file orchestrates running the full extraction pipeline. It initializes the necessary plugins for each task (detection, alignment, masking, recognition) by calling methods like _load_detect().

It determines the optimal processing order and batching of plugins to fit within GPU memory constraints. It may split the pipeline into multiple phases if needed. The input_queue and detected_faces() properties provide the interface to feed images in and retrieve results.

The pipeline leverages several classes to pass data between stages. The ExtractMedia class encapsulates the image, detected faces, and other metadata being processed at each stage.

Plugins are loaded using the PluginLoader class. Plugins then inherit from base classes defined in subdirectories like …/_base.py which standardize the interface.

The overall goal of the extraction pipeline is to efficiently run the full sequence of detection, alignment, masking, and recognition tasks within the constraints of the system hardware. It aims to maximize performance while minimizing complexity for plugin developers.
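The queue-based hand-off between stages can be sketched with standard-library primitives. This is a simplified stand-in, not Faceswap's actual implementation: the stage logic and the ExtractMedia fields here are illustrative only.

```python
import queue
import threading
from dataclasses import dataclass, field

@dataclass
class ExtractMedia:
    """Simplified stand-in for the object passed between pipeline stages."""
    filename: str
    image: list
    detected_faces: list = field(default_factory=list)
    landmarks: list = field(default_factory=list)

def detect_stage(in_q, out_q):
    """Toy detector: pretend every image contains one face box."""
    while True:
        media = in_q.get()
        if media is None:          # sentinel: propagate shutdown downstream
            out_q.put(None)
            break
        media.detected_faces.append((10, 10, 50, 50))
        out_q.put(media)

def align_stage(in_q, out_q):
    """Toy aligner: attach dummy 68-point landmarks per detected face."""
    while True:
        media = in_q.get()
        if media is None:
            out_q.put(None)
            break
        media.landmarks = [[(0, 0)] * 68 for _ in media.detected_faces]
        out_q.put(media)

def run_pipeline(images):
    input_queue, mid_queue, output_queue = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=detect_stage, args=(input_queue, mid_queue)),
        threading.Thread(target=align_stage, args=(mid_queue, output_queue)),
    ]
    for t in threads:
        t.start()
    for name, img in images:
        input_queue.put(ExtractMedia(name, img))
    input_queue.put(None)
    results = []
    while (item := output_queue.get()) is not None:
        results.append(item)
    for t in threads:
        t.join()
    return results

results = run_pipeline([("a.png", [[0]]), ("b.png", [[1]])])
```

Each stage runs on its own thread and blocks on its input queue, so detection and alignment overlap in time for different frames, mirroring how the real pipeline keeps the GPU busy.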

Face Detection

References: plugins/extract/detect

The core face detection functionality in Faceswap relies on detector plugins that inherit from the abstract Detector base class defined in …/_base.py. This class provides a common interface for loading models, preprocessing input images, applying rotations, and postprocessing detections. Concrete detector implementations then inherit from this base class while focusing on their specialized model loading and prediction logic.

Some key detector implementations include Cv2_Dnn, which uses OpenCV's DNN module and a pre-trained ResNet SSD model. It handles preprocessing images to the expected input size, normalizing values, and running inference via the model's forward() method. The MTCNN detector loads Keras models for the PNet, RNet, and ONet subnetworks and runs them sequentially to generate initial proposals, refine the candidates, and output final detections. Another implementation is S3fd, which loads a pretrained Keras model from an .h5 file. It defines the model architecture in model_definition() and handles preprocessing inputs via prepare_batch() before prediction.

All detectors must implement the extractor interface defined in Detector to receive input batches via get_batch() and return final detections through finalize(). Key preprocessing steps include resizing images, normalization, and optionally applying rotations. Postprocessing involves filtering detections by confidence thresholds, removing small faces, and converting to standard DetectedFace objects. Detector options like these thresholds are configured via plugin-specific defaults files to integrate them into the GUI.
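A minimal sketch of this contract, with shared postprocessing in the base class and only raw prediction left to the plugin. The class names, method names, and thresholds here are illustrative, not Faceswap's real API:

```python
from abc import ABC, abstractmethod

class Detector(ABC):
    """Sketch of the detector plugin contract: subclasses only supply
    model loading and raw prediction; filtering is shared."""

    def __init__(self, confidence=0.5, min_size=20):
        self.confidence = confidence
        self.min_size = min_size

    @abstractmethod
    def predict(self, image):
        """Return raw detections as (left, top, right, bottom, score)."""

    def finalize(self, image):
        """Shared postprocessing: drop low-confidence and tiny faces."""
        faces = []
        for left, top, right, bottom, score in self.predict(image):
            if score < self.confidence:
                continue
            if min(right - left, bottom - top) < self.min_size:
                continue
            faces.append((left, top, right, bottom))
        return faces

class DummyDetector(Detector):
    """Stand-in for a concrete plugin such as S3fd or MTCNN."""
    def predict(self, image):
        return [(0, 0, 100, 100, 0.9),   # kept
                (0, 0, 10, 10, 0.9),     # dropped: too small
                (0, 0, 100, 100, 0.3)]   # dropped: low confidence

faces = DummyDetector().finalize(image=None)
```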

Face Alignment

References: plugins/extract/align, lib/align

The core functionality of aligning detected faces and finding facial landmarks is implemented in the …/align directory. This directory contains different implementations of face alignment algorithms, with the goal of accurately detecting facial landmarks for faces detected in input images or video frames.

The main implementations are:

  • …/cv2_dnn.py uses OpenCV's DNN module and a pre-trained TensorFlow model to detect 68 facial landmarks. The Align class handles initializing the model, preprocessing input faces, predicting landmarks, and postprocessing results.

  • …/fan.py loads a Keras implementation of the FAN (Face Alignment Network) algorithm for landmark detection. The Align class inherits from Aligner and overrides methods like init_model(), process_input(), predict(), and process_output() to load the FAN model and perform alignment, cropping, prediction, and post-processing of landmarks.

These specific aligner implementations leverage different models but share a common base interface and processing pipeline defined in …/_base. The Aligner base class in …/aligner.py handles preprocessing faces via methods like _normalize_faces(), running predictions via _predict(), and postprocessing results with _process_output(), which also provides robustness by averaging predictions from multiple model runs.

The …/processing.py file contains classes like AlignedFilter and ReAlign that are used by the Aligner class. AlignedFilter implements filtering of poorly aligned faces, while ReAlign handles re-aligning faces through the aligner again via methods that queue and retrieve batches while transforming results back to the original coordinates.

Face Masking

References: plugins/extract/mask

The key classes for generating masks from aligned faces are Masker and subclasses that inherit from it like Mask. Masker establishes the common interface and workflow that all mask generation plugins must implement.

The core steps are:

  1. Loading the mask generation model via init_model(). Plugins may load a neural network via KSession, or initialize without a model for direct landmark processing.

  2. Preprocessing input faces via process_input(). This typically compiles faces into a normalized batch, like resizing and mean subtraction.

  3. Generating predictions via predict(). Neural network plugins simply call the model, while others directly process landmarks. For example, Mask in components.py splits landmarks into components with parse_parts() and draws convex hull masks.

  4. Postprocessing outputs via process_output(). This typically decompiles the batch result back into individual face masks.
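As a rough illustration of the landmark-based approach in step 3, the following pure-Python sketch builds a convex hull from landmark points and rasterises it into a binary mask. The real plugin splits landmarks into facial components first; this sketch treats all landmarks as one set.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_mask(landmarks, width, height):
    """Rasterise the hull into a binary mask via point-in-polygon tests."""
    hull = convex_hull(landmarks)
    def inside(x, y):
        sign = 0
        for i in range(len(hull)):
            ax, ay = hull[i]
            bx, by = hull[(i+1) % len(hull)]
            c = (bx-ax)*(y-ay) - (by-ay)*(x-ax)
            if c != 0:
                if sign == 0:
                    sign = 1 if c > 0 else -1
                elif (c > 0) != (sign > 0):
                    return False
        return True
    return [[1 if inside(x, y) else 0 for x in range(width)]
            for y in range(height)]

mask = hull_mask([(1, 1), (6, 1), (6, 6), (1, 6), (3, 3)], 8, 8)
```

Note how the interior landmark (3, 3) is swallowed by the hull: the mask covers the whole face region, not just the landmark points.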


Face Recognition

References: plugins/extract/recognition

The main face recognition functionality in Faceswap is implemented in the …/recognition directory. This directory contains plugins for extracting face embeddings and associating faces with identities.

The core plugin uses the VGGFace2 model for recognition. The Recognition class in …/vgg_face2.py handles initializing the pre-trained VGGFace2 Keras model using KSession. It defines functions for preprocessing input images by subtracting the mean pixel value with process_input().

Face embeddings are extracted for a batch of faces using predict(), which simply calls the model's own predict() method. These embeddings are high-dimensional vectors that encode identity features from the faces.

The Cluster class performs clustering on the embeddings to associate them with identities. It is initialized with properties like the linkage algorithm to use. The main clustering logic occurs in _do_linkage(), which calls the appropriate linkage function. This returns a linkage matrix encoding the relationships between samples.

_seriation() then recursively sorts the embeddings based on the linkage tree. It also handles optionally binning indices if an identity distance threshold is provided. The sorted indices or binned tuples are returned by __call__(), which kicks off the sorting process.

This clustering associates each embedding with a predicted identity, allowing faces from the same person to be grouped together for further processing. The identities are then paired back to each detected face by the base Identity class in …/_base.py. This provides the core face recognition functionality in Faceswap.
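A greatly simplified stand-in for this grouping step: greedy distance-threshold clustering over embedding vectors. The real Cluster class builds a full linkage matrix and seriates it; this sketch only conveys the idea that nearby embeddings share an identity.

```python
import numpy as np

def cluster_embeddings(embeddings, threshold=0.5):
    """Assign each embedding the identity of the first earlier embedding
    within `threshold` Euclidean distance, or a fresh identity."""
    labels = [-1] * len(embeddings)
    next_id = 0
    for i, emb in enumerate(embeddings):
        for j in range(i):
            if np.linalg.norm(emb - embeddings[j]) < threshold:
                labels[i] = labels[j]
                break
        if labels[i] == -1:
            labels[i] = next_id
            next_id += 1
    return labels

# Two tight groups of embeddings -> two predicted identities
embeds = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = cluster_embeddings(embeds, threshold=1.0)
```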

Pipeline Orchestration

References: faceswap

The core class that orchestrates the extraction pipeline is the Extractor class defined in …/pipeline.py. The Extractor initializes and launches the necessary detector, aligner, and other extractor plugins to run the full pipeline. It determines the optimal processing order and batching of inputs to efficiently fit within available GPU memory constraints.

The Extractor initializes plugins by calling methods like _load_detect() and _load_align(). It determines the processing flow order via _set_flow() and batch sizes for each phase with _set_phases() based on available GPU memory retrieved from _get_vram_stats(). The Extractor provides properties like input_queue to interface with the pipeline, and leverages classes like ExtractMedia to pass data between pipeline stages. Its goal is to run the full extraction workflow as quickly as possible within hardware limitations.

The _get_vram_stats() method queries the GPUStats class to retrieve statistics on available GPU memory. It returns a dictionary of devices and their free memory. The Extractor uses this information to dynamically determine the optimal batching and processing order. For example, it may run detection on smaller batches if GPU memory is constrained, while using larger batches for alignment which is less memory intensive.

The ExtractMedia class handles passing data like images and detections between different pipeline stages like detection and alignment. It initializes MultiThread to run plugin processing in parallel threads to optimize throughput. The MultiThread class encapsulates running plugins on their own CPU threads to avoid blocking the main extraction process.

Base Classes

References: faceswap

The base classes provide common interfaces and logic that are inherited by extraction plugins. This includes standardizing the plugin lifecycle and API.

The abstract Extractor base class defined in …/_base.py is central to the extraction pipeline. It handles common initialization logic in methods like __init__() and init_model(). Crucially, it establishes a standardized plugin API through abstract methods like process_input(), _predict(), and process_output(). This API promotes extensibility by requiring plugins to implement these methods, while also maximizing performance by providing shared preprocessing/postprocessing pipelines.

The Aligner base class in …/aligner.py plays a similar role for alignment plugins. It handles tasks like preprocessing faces via _normalize_faces(), making predictions via _predict(), and postprocessing landmarks in _process_output(). The Aligner base class provides robustness techniques like refeeds via _get_adjusted_boxes() and averaging predictions via _process_output() that are inherited by aligner plugins.

The _base modules also contain important classes like AlignedFilter and ReAlign that are shared across aligners. AlignedFilter implements filtering of poorly aligned faces. ReAlign handles re-aligning faces through the aligner again via methods that queue and retrieve batches while transforming results back to the original coordinates. These classes encapsulate logic that is reused by multiple plugins.

The base classes define common lifecycles, interfaces, and preprocessing/postprocessing pipelines that extraction plugins inherit. This promotes extensibility, standardization, and code reuse across the plugin implementations.

Training Models

References: plugins/train, lib/model

Defining model architectures and training workflows to learn face swapping involves constructing neural network models and optimizing their parameters to perform tasks like face swapping. The core functionality covered includes defining model architectures using neural network layers and blocks, implementing training loops and workflows to optimize model parameters, and configuring options for the training process.

Some key aspects covered are:

  • The ModelBase class in …/__init__.py defines the base interface all model plugins must implement via abstract methods like load(), train(), and predict(). This promotes standardization across models.

  • Model classes like Model in …/original.py inherit from ModelBase and override methods to define model-specific architectures. The build_model() method constructs the full model by assembling encoder, decoder, and other sub-components.

  • Encoder and decoder networks are defined using neural network layers and blocks. For example, …/original.py implements the encoder using Conv2DBlock layers, and decoders use UpscaleBlock layers.

  • The TrainerBase class in …/_base.py defines the core training loop functionality. Its train_one_step() method runs batches through the model, logs metrics, saves checkpoints, and updates previews.

  • Trainer plugins like Trainer in …/original.py inherit from TrainerBase to leverage this functionality while implementing model-specific workflows.

  • Configuration options for models and trainers are defined using metadata dictionaries in files like …/original_defaults.py. This drives the GUI and validation.
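The inheritance pattern in the list above can be sketched as follows. The train_batch() hook and the stand-in loss are hypothetical, not Faceswap's real API; the point is the division of labour between base class and plugin.

```python
from abc import ABC, abstractmethod

class ModelBase(ABC):
    """Sketch of the plugin contract: concrete models define the
    architecture while the base class owns the shared lifecycle."""
    name = "base"

    @abstractmethod
    def build_model(self):
        """Assemble encoder, decoder, and other sub-components."""

class TrainerBase:
    """Owns the shared training-loop bookkeeping."""
    def __init__(self, model):
        self.model = model
        self.iterations = 0

    def train_one_step(self, batch):
        loss = self.model.train_batch(batch)   # hypothetical model hook
        self.iterations += 1                   # would also log/checkpoint here
        return loss

class OriginalModel(ModelBase):
    name = "original"

    def build_model(self):
        return {"encoder": "enc", "decoder_a": "dec_a", "decoder_b": "dec_b"}

    def train_batch(self, batch):
        return sum(batch) / len(batch)         # stand-in loss value

class OriginalTrainer(TrainerBase):
    """Model-specific trainers inherit the loop unchanged."""

trainer = OriginalTrainer(OriginalModel())
loss = trainer.train_one_step([1.0, 2.0, 3.0])
```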

Model Architectures

References: plugins/train/model

The core functionality of this section is to define the neural network architectures that can be used for various types of models in Faceswap, such as AutoEncoders, GANs, etc. Key classes and functions handle constructing the model architectures from code.

The Model classes defined in files like …/dfaker.py, …/iae.py, and …/dfl_sae.py are responsible for building the actual model architectures. These classes inherit from ModelBase and override methods like build_model(), encoder(), and decoder() to define the network layouts.

The encoder() methods focus on encoding input data like images into latent representations. Different implementations use techniques such as convolutional blocks with increasing filters to downsample inputs. The decoder() methods mirror the encoders to decode latent vectors back to the original data dimensions, using techniques like upsampling blocks.

Some common network components across models include Conv2DBlock functions for convolutional layers, UpscaleBlock functions for upsampling layers, and Conv2DOutput layers for final outputs. The Model classes assemble encoder, decoder, and other sub-networks into complete autoencoder or GAN architectures.

Key files also define common neural network building blocks. For example, …/nn_blocks.py contains functions like Conv2DBlock() and UpscaleBlock() that are reused across multiple model implementations. This promotes code reuse.

The Model classes play the primary role in defining model architectures programmatically. Their methods assemble pre-existing network components into complete trainable models. Common blocks abstract away layer implementations, while the class interfaces ensure a consistent way to define new model types.
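The shared-encoder/per-identity-decoder layout that underpins the swap can be illustrated with a toy linear model. This is purely schematic; the real models are deep convolutional networks assembled from the blocks described above.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearAE:
    """One shared encoder, two identity-specific decoders."""
    def __init__(self, dim=16, latent=4):
        self.enc = rng.normal(size=(latent, dim))
        self.dec_a = rng.normal(size=(dim, latent))
        self.dec_b = rng.normal(size=(dim, latent))

    def encode(self, face):
        return self.enc @ face          # face -> shared latent code

    def swap(self, face_a):
        # The swap trick: encode identity A, decode with B's decoder
        return self.dec_b @ self.encode(face_a)

ae = LinearAE()
face = rng.normal(size=16)
swapped = ae.swap(face)
```

Because both decoders are trained against the same latent space, decoding A's code with B's decoder renders B's identity with A's pose and expression.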

Training Workflows

References: plugins/train/trainer

The TrainerBase class in …/_base.py defines the core training workflow functionality that is inherited by trainer plugins. It implements a single training loop via the train_one_step() method. This method runs a batch of training data through the model, calculates the loss, logs metrics to Tensorboard, saves checkpoints if needed, and generates preview images.

The _Samples class handles compiling preview images from model predictions during training. It uses cv2.resize() to resize samples to the model input size, calls model.predict() to get predictions from the model, overlays the samples on backgrounds to compile the previews, and optionally applies a mask overlay using cv2.addWeighted().

The _Timelapse class generates a timelapse of training by periodically calling _Samples to generate preview samples from a subset of training data. It saves these timelapse previews to disk over the course of training.

The _get_config() method loads and merges the configuration for the trainer, allowing options to be overridden on a per-model basis. This provides flexibility in how each model is trained.

Trainer plugins like …/original.py inherit from TrainerBase to implement specific models. The core training loop logic is contained within TrainerBase, while plugins focus on model-specific aspects.

Configuration

References: faceswap

The core configuration functionality in Faceswap is managed through the Config class defined in …/_config.py. This class handles initializing all default configuration options for training models with Faceswap. It loads configuration values from the plugin folder it is initialized with, ensuring each plugin has an isolated set of configurable options.

The Config class provides an important method:

  • get_config() retrieves the entire configuration as a dictionary.

Configuration values like "batch-size" control aspects of training like batch processing. The Config class then provides a consistent interface for reading and updating these options throughout training via get_config().

This centralized approach to managing configuration promotes modularity - each plugin can define its own isolated set of options loaded from its folder, while code interacts with a standardized configuration interface. The Config class encapsulates validation, default values, and help text generation for a plugin's options.
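A minimal sketch of this pattern, assuming hypothetical add_item() and set() helpers; the real Config class also handles validation and help text generation.

```python
class Config:
    """Sketch of per-plugin configuration: defaults are registered with
    help text, and get_config() exposes the current values as a dict."""
    def __init__(self, plugin):
        self.plugin = plugin
        self._items = {}

    def add_item(self, key, default, info=""):
        self._items[key] = {"default": default, "value": default, "info": info}

    def set(self, key, value):
        self._items[key]["value"] = value

    def get_config(self):
        return {key: item["value"] for key, item in self._items.items()}

cfg = Config("train.original")
cfg.add_item("batch-size", 16, info="Samples fed to the model per step")
cfg.set("batch-size", 32)          # user override
conf = cfg.get_config()
```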

Utilities

References: faceswap

The core training utilities implemented in Faceswap include functionality for preview generation, logging metrics, and other tasks important for monitoring and visualizing model training progress.

Some key aspects covered by utilities include:

The Preview classes initialize a PreviewBuffer in __init__() to store samples. The update() method retrieves the latest batch and displays it, calling _process_batch() to update the buffer.

The LearningRateFinder class implements an algorithm to automatically find optimal learning rates. It trains the model with increasing rates in find() to plot the loss curve and find rates where loss decreases fastest.
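A simplified variant of the idea: sweep candidate rates on a known loss surface and keep the rate that reduces the loss the most. The real finder trains the actual model while increasing the rate continuously and inspects the loss curve; this sketch just makes the selection criterion concrete.

```python
def lr_finder(candidates, w0=0.0, target=3.0):
    """One gradient step on f(w) = (w - target)^2 per candidate rate;
    keep the rate that yields the lowest post-step loss."""
    def loss(w):
        return (w - target) ** 2
    best_lr, best_loss = None, float("inf")
    for lr in candidates:
        grad = 2 * (w0 - target)          # df/dw at the start point
        new_loss = loss(w0 - lr * grad)   # loss after one step at this rate
        if new_loss < best_loss:
            best_lr, best_loss = lr, new_loss
    return best_lr

# Geometric sweep from 1e-6 upward, as LR finders typically do
rates = [10 ** (-6 + 0.5 * i) for i in range(14)]
best = lr_finder(rates)
```

For this quadratic the ideal one-step rate is 0.5, so the sweep settles on the nearest candidate; rates past it overshoot and the loss climbs again, which is exactly the inflection the finder looks for.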

The Cache class handles efficient shared access to detected face data via thread-safe caching. The private _Cache class uses a dictionary and lock to cache faces in memory based on their hash for quick retrieval.
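The hash-keyed, lock-guarded caching pattern can be sketched with the standard library; class and method names here are illustrative.

```python
import hashlib
import threading

class FaceCache:
    """Minimal thread-safe cache keyed by content hash (a simplified
    sketch of the _Cache pattern described above)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}

    @staticmethod
    def key(data: bytes) -> str:
        return hashlib.sha1(data).hexdigest()

    def get_or_add(self, data: bytes, loader):
        """Return the cached entry for this content, loading it at most once."""
        k = self.key(data)
        with self._lock:
            if k not in self._store:
                self._store[k] = loader(data)
            return self._store[k]

cache = FaceCache()
calls = []
def load(data):
    calls.append(data)      # record how often the expensive load runs
    return len(data)

size = cache.get_or_add(b"face-bytes", load)
again = cache.get_or_add(b"face-bytes", load)   # served from cache
```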

Base Classes

References: faceswap

The base classes provide common interfaces and logic that plugins inherit from to implement core Faceswap functionality in a standardized way. This includes abstract base classes for models and trainers.

The ModelBase class acts as the primary interface that model plugins must inherit from. It defines crucial lifecycle methods like __init__() for initialization and train() to encapsulate the core training loop. The ModelBase implementation is important because it enforces a common interface for all model plugins to integrate as reusable components.

Similarly, the TrainerBase class defines the standard interface for training loop plugins. Subclasses implement methods to define custom training workflows.

Some key implementation details:

  • The ModelBase class defines crucial properties like the model name, input/output shapes. It also contains methods like load(), train(), save() that plugins must implement to integrate model loading/saving functionality.

  • The TrainerBase class defines the overall training loop structure. Its train_one_step() method encapsulates one iteration of: data preprocessing, model training on a batch, logging metrics, and saving checkpoints if needed. Subclasses implement this method to define the actual training logic.

Core Model Components

References: lib/model

The core model components in Faceswap provide fundamental building blocks for constructing neural networks used in tasks like face swapping, generation, and other computer vision applications. Key elements include layers, blocks, losses, and initialization routines.


Losses

References: lib/model/losses

The …/losses directory contains several custom loss functions that can be used for training neural networks in Faceswap. These losses aim to match both pixel-level errors as well as high-level perceptual qualities between generated and target images.

Key loss functions implemented include LPIPSLoss for perceptual loss based on deep neural network features, FocalFrequencyLoss to prioritize harder frequencies in the Fourier domain, GeneralizedLoss to smoothly interpolate between L1 and L2 loss, GradientLoss to compare image gradients, and LaplacianPyramidLoss to match multi-scale image features.

The LossWrapper class allows combining multiple loss functions with masking and weighting. This provides flexibility to mix pixel-level and perceptual losses.

Some important classes include DSSIMObjective, which computes the DSSIM loss via gaussian convolutions. It extracts luminance/contrast measures between images via operations like depthwise convolutions. The LDRFLIPLoss class implements the more complex LDR-FLIP loss, filtering images with _SpatialFilters and extracting color/feature differences. The MSSIMLoss class recursively downsamples images to compute the MS-SSIM loss across scales.
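The masking-and-weighting idea behind LossWrapper can be sketched in NumPy; the real implementation operates on TensorFlow tensors inside the training graph.

```python
import numpy as np

def l1(pred, true):
    return np.abs(pred - true)

def l2(pred, true):
    return (pred - true) ** 2

class LossWrapper:
    """Combine several per-pixel losses with an optional mask and
    per-loss weights, in the spirit of the class described above."""
    def __init__(self):
        self._losses = []   # list of (loss_fn, weight)

    def add_loss(self, fn, weight=1.0):
        self._losses.append((fn, weight))

    def __call__(self, pred, true, mask=None):
        total = 0.0
        for fn, weight in self._losses:
            per_pixel = fn(pred, true)
            if mask is not None:
                per_pixel = per_pixel * mask   # zero out unmasked pixels
            total += weight * per_pixel.mean()
        return total

wrapper = LossWrapper()
wrapper.add_loss(l1, weight=1.0)
wrapper.add_loss(l2, weight=0.5)
pred = np.array([[1.0, 2.0]])
true = np.array([[0.0, 2.0]])
mask = np.array([[1.0, 0.0]])
value = wrapper(pred, true, mask)
```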

Networks

References: lib/model/networks

The …/networks directory contains implementations of common neural network architectures that can be used for modeling tasks in faceswap. It provides pre-built network definitions that can be instantiated and leveraged without needing to define new architectures.

The …/__init__.py file acts as an entry point, importing common network classes so they can be easily accessed for modeling. It collects the AlexNet and SqueezeNet classes from …/simple_nets.py which define those CNN architectures as TensorFlow models using reusable blocks. The file also imports the ViT classes from …/clip.py which provide Vision Transformer models.

The …/simple_nets.py file contains the AlexNet and SqueezeNet classes. AlexNet implements the standard AlexNet CNN topology by chaining convolutional blocks. SqueezeNet similarly constructs its architecture from reusable fire modules.

The …/clip.py file provides components for vision Transformers and ResNet models. The Transformer class handles self-attention blocks using multi-head attention. The VisualTransformer takes images, extracts patches, and learns embeddings which are then processed by a Transformer encoder.

Conversion Pipeline

References: plugins/convert, scripts

The conversion pipeline handles running inference on faces using the trained Faceswap model, and then blending the predicted faces back into the original frames. This involves several key steps:

The DiskIO class, defined in …/fsmedia.py, is responsible for loading input frames and detected faces from disk into queues. It uses background threads via the MultiThread class to efficiently load files from disk in parallel.

The Predictor class, defined in …/convert.py, loads the trained Faceswap model. It reads faces from the input queue populated by DiskIO and passes them through the model in batches.

The Converter class also defined in …/convert.py then processes the predicted faces. It applies any selected post-processing plugins from …/convert to the faces before patching them back onto the original frames. Plugins handle tasks like color adjustments and mask blending to composite the predicted face naturally into the frame.

The core Convert class orchestrates the overall conversion process.

The …/convert directory contains plugins that implement key parts of the conversion pipeline like color adjustments, mask blending, and output writing. The Adjustment base classes in files like _base.py define common interfaces that plugins inherit from. Classes like Color and Mask contain algorithms to implement functionality.

The …/fsmedia.py file contains classes that help load input media and alignments. The Images class handles loading frames, while Alignments handles loading and saving alignment data with options like skipping existing frames.

Mask Blending

References: plugins/convert/mask

The Mask class in …/mask_blend.py handles blending predicted masks onto target faces during conversion. It supports different mask types like predicted, stored, or dummy masks. The run() method takes detected face data and applies various processing steps to produce a final blended mask.

The key steps are:

  • The _get_mask() method retrieves the correct raw mask based on the configured type. This could be a dummy mask, predicted mask processed via _process_predicted_mask(), or a stored mask from the detected face via _get_stored_mask().

  • _get_stored_mask() retrieves the stored mask from the detected face data, resizes it if needed, and applies any blurring specified in the config.

  • _process_predicted_mask() applies any post-processing like blurring to a predicted mask tensor, as defined in the config.

  • The _get_box() method creates a base "box" mask, applying Gaussian blur to the edges based on config options like kernel_size.

  • The raw mask and box mask are combined to produce the final blended mask, which is then returned.

Configurations for the Mask class are defined in …/mask_blend_defaults.py. This specifies options like the blending type, kernel size, thresholds, and their metadata.
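The box/raw-mask combination described above can be sketched in NumPy, using a box blur in place of the plugin's Gaussian blur; sizes and borders are arbitrary.

```python
import numpy as np

def blur_rows(mask, kernel_size):
    """Box-blur each row by convolving with a normalised kernel."""
    kernel = np.ones(kernel_size) / kernel_size
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, mask)

def box_mask(size, border, kernel_size=5):
    """Base 'box' mask with softened edges."""
    mask = np.zeros((size, size))
    mask[border:-border, border:-border] = 1.0
    mask = blur_rows(mask, kernel_size)            # blur rows
    mask = blur_rows(mask.T, kernel_size).T        # blur columns
    return mask

def blend(raw_mask, box):
    """Final mask: the raw mask limited by the softened box."""
    return raw_mask * box

box = box_mask(32, border=4)
final = blend(np.ones((32, 32)), box)
```

The blurred edge gives the mask a soft falloff from 1 in the face interior to 0 outside, which is what lets the predicted face fade into the frame without a hard seam.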

Color Adjustment

References: plugins/convert/color

The core plugins for color adjustment in Faceswap are located in …/color. Several plugins are provided that implement different algorithms for adjusting the colors of the new face to better match the lighting and skin tone of the original face. Key plugins include:

  • Color_Transfer: Implements the color transfer algorithm described in Reinhard et al. It converts faces to Lab color space, calculates statistics for each channel, and scales the new face channels based on the statistics to transfer colors between faces.

  • Match_Hist: Matches the histograms of each color channel between the old and new faces. It constructs histograms over the masked regions, finds the cumulative distribution functions, and uses linear interpolation between CDFs to match the distributions between channels.

The plugins inherit from the _base.Adjustment base class in …/_base.py. This class provides common configuration loading and defines the interface plugins must implement via the process() method, which takes the old/new faces and mask and returns the adjusted new face. It also handles mask insertion/removal via run().

The key implementation for Color_Transfer is in the Color class defined in …/color_transfer.py. The process() method converts the faces to Lab color space using cv2.cvtColor(), calls image_stats() to calculate channel statistics, scales the new face channels based on these statistics, and converts back to BGR for the color transfer.

Match_Hist implemented in …/match_hist.py constructs histograms over the masked regions of each channel, finds the CDFs, and performs linear interpolation between CDFs on each channel value to match the distributions between faces.
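The statistics-matching core of the Reinhard approach can be sketched per channel in NumPy. Note that the real plugin first converts faces to Lab space with cv2.cvtColor; this sketch omits the color-space conversion and works directly on float channels.

```python
import numpy as np

def transfer_stats(new_face, old_face):
    """Match each channel's mean and standard deviation to the target."""
    result = np.empty_like(new_face)
    for c in range(new_face.shape[-1]):
        src, tgt = new_face[..., c], old_face[..., c]
        # Standardise the source channel, then rescale to target stats
        scaled = (src - src.mean()) / (src.std() + 1e-8)
        result[..., c] = scaled * tgt.std() + tgt.mean()
    return result

rng = np.random.default_rng(1)
new = rng.normal(0.2, 0.05, size=(8, 8, 3))   # darker swapped face
old = rng.normal(0.6, 0.10, size=(8, 8, 3))   # brighter original face
matched = transfer_stats(new, old)
```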

Scaling Adjustment

References: plugins/convert/scaling

The Faceswap plugin system includes scaling adjustments that can be applied to faces after they have been warped to the target frame during conversion. The …/scaling directory contains implementations of these adjustments.

The core abstraction for scaling adjustments is the Adjustment base class defined in …/_base.py. This class provides a common interface and handles tasks like configuration loading and preprocessing/postprocessing of faces. All scaling adjustment plugins must inherit from Adjustment and implement the process() method.

An important scaling adjustment is sharpening, which can enhance fine details in swapped faces. The Scaling class in …/sharpen.py handles sharpening. It defines methods like box(), gaussian(), and unsharp_mask() that implement different sharpening techniques. The process() method applies one of these methods based on the configured method, along with the amount and radius settings to control strength.

The get_kernel_size() static method calculates the kernel size and center point for a given radius percentage. Sharpening techniques convolve the face with a kernel, so this method ensures the kernel is sized appropriately.

Default configuration options for sharpening are defined in …/sharpen_defaults.py, including settings like method, amount, radius, and threshold. This file ensures these options are properly configured in Faceswap's user interface and configuration system.
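The unsharp-mask technique can be sketched in NumPy, with a separable box blur standing in for the configurable Gaussian kernel.

```python
import numpy as np

def box_blur(image, kernel_size=3):
    """Separable box blur used as the smoothing step."""
    kernel = np.ones(kernel_size) / kernel_size
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)

def unsharp_mask(image, amount=0.5):
    """Sharpen by adding back the high-frequency residual:
    sharpened = image + amount * (image - blurred)."""
    return image + amount * (image - box_blur(image))

img = np.zeros((9, 9))
img[4, 4] = 1.0                      # a single bright pixel
sharp = unsharp_mask(img, amount=1.0)
```

The amount setting scales the residual, controlling sharpening strength, while the blur radius controls which spatial frequencies count as "detail".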

Output Writing

References: plugins/convert/writer

The faceswap conversion plugins handle writing the final output of the converted frames to common formats like video, images, and GIF. The …/writer directory contains several plugins that each implement a Writer class to handle writing to a specific format.

The FFmpeg writer in …/ffmpeg.py caches converted frames and uses imageio_ffmpeg to write them to a video file. It initializes the writer with options like codec, tune, and audio handling. The write() method caches incoming frames, while _save_from_cache() flushes the cache to the video periodically.

The GIF writer in …/gif.py inherits from _base.Output and handles writing frames to an animated GIF. It resizes frames to a common dimension and caches them in order, then uses imageio to continuously write cached frames to the GIF.

The OpenCV writer in …/opencv.py pre-encodes frames to bytes for faster writing than Pillow or ImageIO. The Writer class handles configuration, pre-encoding images via _encode_image(), and writing image and optional mask files using OpenCV encoding functions.

The Pillow and imageio writers similarly implement Writer classes that override methods like write(), pre_encode(), and close() to support the specific encoding and writing needs of each library.

All writer plugins inherit from and leverage functionality in the _base.Output class for common initialization, filename handling, frame caching, and optional pre-encoding of images to speed up writing. Configuration options for each plugin specify parameters like format, quality, and other settings.
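The shared cache-then-flush pattern can be sketched as follows. This is a simplified stand-in for _base.Output: the sink here is just a list rather than a video encoder, and the method names are illustrative.

```python
class Writer:
    """Frames may arrive out of order from parallel conversion, so they
    are cached by index and flushed to the sink contiguously."""
    def __init__(self, sink):
        self.sink = sink          # e.g. the video encoder or file handle
        self._cache = {}
        self._next = 0

    def write(self, index, frame):
        self._cache[index] = frame
        self._flush()

    def _flush(self):
        # Emit every frame we can without leaving a gap in the sequence
        while self._next in self._cache:
            self.sink.append(self._cache.pop(self._next))
            self._next += 1

    def close(self):
        assert not self._cache, "missing frames left in cache"

written = []
writer = Writer(written)
for idx, frame in [(1, "f1"), (0, "f0"), (2, "f2")]:   # out of order
    writer.write(idx, frame)
writer.close()
```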

Patch Extraction

References: faceswap

Extracting face patches and transformation matrices is handled by the Patch class defined in …/alignments.py. The Patch class encapsulates the logic for extracting face patches from detected faces stored in the alignments file, as well as calculating the transformation matrices needed to align faces to a normalized reference frame.

Some key responsibilities of the Patch class include:

  • Loading detected faces and their associated alignment data from the Alignments class
  • Iterating through each detected face and extracting the face patch image
  • Calculating the transformation matrix needed to align each extracted face patch to the reference frame
  • Storing the extracted face patches and transformation matrices in a dictionary that is returned by the Patch class

In summary, the Patch class provides a clean interface for extracting face patches and transformation matrices from the detected faces stored in the alignments file. It encapsulates the iteration over each face and the storage of results. This data is then used downstream during conversion to properly align and blend faces back into the original frames.
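As a rough illustration of the matrix calculation, a least-squares affine fit maps detected landmark coordinates onto normalized reference coordinates. The function name and the plain least-squares approach here are a simplified stand-in for what the real code computes per face:

```python
import numpy as np

def align_matrix(landmarks, reference):
    """Hypothetical sketch: solve for the 2x3 affine matrix that maps
    detected landmarks (Nx2) onto reference coordinates (Nx2) in a
    least-squares sense."""
    # Augment landmarks with a column of ones for the translation term
    src = np.hstack([landmarks, np.ones((len(landmarks), 1))])  # N x 3
    matrix, *_ = np.linalg.lstsq(src, reference, rcond=None)    # 3 x 2
    return matrix.T                                             # 2 x 3
```

When the landmarks already coincide with the reference, the fit recovers the identity transform, which makes the function easy to sanity-check.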

Pipeline Orchestration

References: faceswap

The …/convert.py script orchestrates the full conversion pipeline. It initializes important objects like the DiskIO class, which handles loading images and detected faces from disk into queues via its load() method. The DiskIO uses background threads for efficient I/O via MultiThread to load data asynchronously without blocking the main thread.

The Predictor class loads the trained model and handles predicting face embeddings via its predict() method. It feeds faces into the model in batches for efficiency using multiple GPUs if available. The Converter class applies any selected post-processing plugins to the predicted faces via its process() method before patching them back onto the original frames.

The classes work together, passing data between their queues to parallelize the processing workflow. The …/convert.py script coordinates launching these classes to optimize GPU usage. It initializes the GPUStats class from …/gpu_stats to retrieve statistics on available GPU devices. Based on these stats like free memory, it determines the optimal batch size and processing order when initializing the DiskIO, Predictor, and Converter classes. This allows efficiently utilizing all available GPUs and balancing load between them to maximize throughput.

The key DiskIO class handles loading data efficiently from disk. It initializes the MultiThread class to launch background loading threads. The load() method queues paths to load, while the threads populate the output queues by calling load functions.

The Predictor class manages feeding batches of faces to the model via its predict() method. It handles batching faces and passing them to the model. Based on GPU stats, it determines the optimal batch size that fits in GPU memory.

The Converter class applies selected plugins to process predictions. It initializes plugins via their plugin loader classes. It feeds predictions through the plugin pipelines which call plugin methods internally to complete each processing step.
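The queue-and-thread hand-off between these classes can be sketched roughly as follows. The `run_pipeline` helper and the `None` sentinel convention are hypothetical simplifications of the DiskIO/Predictor/Converter flow:

```python
import queue
import threading

def run_pipeline(frames, predict, patch):
    """Sketch of a three-stage pipeline: each stage runs on its own
    thread, reading from an input queue and feeding the next stage."""
    load_q, pred_q, results = queue.Queue(), queue.Queue(), []

    def loader():
        for frame in frames:          # stands in for DiskIO.load()
            load_q.put(frame)
        load_q.put(None)              # sentinel: end of stream

    def predictor():
        while (item := load_q.get()) is not None:
            pred_q.put(predict(item)) # stands in for Predictor.predict()
        pred_q.put(None)

    def converter():
        while (item := pred_q.get()) is not None:
            results.append(patch(item))  # stands in for Converter.process()

    threads = [threading.Thread(target=fn)
               for fn in (loader, predictor, converter)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each stage blocks only on its own queue, slow disk I/O does not stall GPU prediction, which is the point of the background-thread design described above.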

Configuration

References: faceswap

The …/_config.py file handles configuration for the convert plugins. It initializes default settings that will be used by the converter plugins unless overridden in the main Faceswap configuration file. This allows each plugin to have its own isolated set of configurable options.

The main class is the Config class defined in …/_config.py. Its initialize() method recursively loads the default configuration values from files in the plugin's subdirectories.

Each plugin's configuration file defines the options for that plugin through the Config class. Options have help text, types, ranges, and default values assigned. This provides validation and documentation for GUI generation.

The Config class centralizes loading and accessing these options through getter properties. Plugins also have access to the configuration for determining preprocessing logic based on settings.

This configuration system allows each plugin to have its own set of isolated, validated, and documented options defined in its subdirectory. The Config class loads and exposes these options through a consistent interface to both the GUI and converter plugins.
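A minimal sketch of the defaults-with-validation pattern might look like the following; `PluginConfig` and its method names are hypothetical, but the idea of each option carrying a default, a type, and help text matches the description:

```python
class PluginConfig:
    """Hypothetical sketch: each option stores a default, a type, and
    help text; setting a value casts it to the declared type."""

    def __init__(self):
        self._defaults = {}
        self._values = {}

    def add_item(self, name, default, datatype, info=""):
        """Register an option with its default, type, and help text."""
        self._defaults[name] = {"default": default,
                                "type": datatype,
                                "info": info}
        self._values[name] = default

    def set(self, name, value):
        """Store a value, casting it to the declared type (raises on
        invalid input, which acts as validation)."""
        self._values[name] = self._defaults[name]["type"](value)

    def get(self, name):
        return self._values[name]
```

The declared type and help text are what allow a GUI to auto-generate an appropriate widget and tooltip for each option.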

Base Classes

References: faceswap

The abstract base classes defined in Faceswap provide common interfaces that standardize implementation across different algorithms and ensure a consistent plugin API. This promotes extensibility and allows new plugins to integrate smoothly.

In the convert plugins directory …/convert, the _base.py files define important base classes that all adjustment plugins must inherit from. For example, in the color adjustment plugins directory …/color, the file _base.py contains the Adjustment base class. This class handles common initialization logic via __init__().

The process() method takes the detected faces as input. Each color adjustment plugin performs the necessary preprocessing on the faces, then applies its core color adjustment algorithm, whether that is average color matching, histogram matching, or another technique. Optional postprocessing may follow, and the adjusted faces are returned from process().

A similar approach is taken in the scaling adjustment plugins directory …/scaling, where the base class Adjustment in _base.py again provides a common interface. It requires child classes to implement the core scaling adjustment logic within process(). The plugin class in …/sharpen.py provides one implementation, applying sharpening techniques like box blurring, gaussian blurring, or unsharp masking based on configurations.

By defining these base classes, the plugins are standardized on a common interface while allowing flexible implementation of different algorithms. New adjustment techniques can be added by subclassing Adjustment and implementing the abstract process() method accordingly. This promotes extensibility of the overall convert pipeline.
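The pattern can be illustrated with a toy subclass. The `Adjustment` base and an average-colour match are sketched below; the class names follow the description, but the arithmetic is a deliberately simplified stand-in for the real plugin logic:

```python
from abc import ABC, abstractmethod
import numpy as np

class Adjustment(ABC):
    """Sketch of the shared base: subclasses implement process()."""

    @abstractmethod
    def process(self, old_face, new_face):
        """Return the adjusted new_face."""

class AvgColor(Adjustment):
    """Toy average-colour match: shift new_face so its per-channel
    mean equals old_face's (illustrative, not the real plugin)."""

    def process(self, old_face, new_face):
        diff = old_face.mean(axis=(0, 1)) - new_face.mean(axis=(0, 1))
        return new_face + diff
```

A new adjustment technique is added by writing another subclass with its own `process()` body; callers never need to know which algorithm is behind the interface.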

Tools and Utilities

References: tools

The tools directory contains utilities for manipulating key assets like alignments, models, and previews outside of the main Faceswap pipeline. This allows batch processing tasks and inspection of data outside the core tools.

The main subdirectories provide functionality for:

  • …/alignments: Tools for working with facial alignment data, including checking frames/faces, sorting alignments, and applying spatial smoothing over time. The Alignments class handles batch processing and launching individual jobs like Check, Sort, Spatial.

  • …/effmpeg: Provides the Effmpeg class which offers an object-oriented wrapper for common FFmpeg operations through a consistent interface. This allows tasks like extracting frames via extract(), generating videos via gen_vid(), and resizing media to be run from the CLI.

  • …/model: Contains classes like Model and NaNScan for loading models, inspecting weights for issues with NaNScan, and restoring models from backups.

  • …/preview: Implements the live preview tool GUI classes like Preview, ConfigTools, Samples to test converter settings in real-time. Samples loads preview faces and Patch applies settings on a background thread via MultiThread, decoupling loading from real-time updates.

The key implementation details are:

  • Alignments handles argument parsing and batch processing individual jobs in parallel processes or threads.

  • Effmpeg maps use cases to FFmpeg commands via action methods. It validates inputs and handles errors.

  • Classes generally separate concerns into distinct responsibilities like I/O, processing logic, and interfaces to provide flexibility and code reuse.

Alignments

References: tools/alignments

The …/alignments directory contains tools for manipulating facial alignment data. The main functionality is provided by the Alignments class in …/alignments.py.

Alignments handles running individual alignment tasks or batch processing of multiple jobs. It validates arguments and initializes jobs by finding the alignments file with _find_alignments() and instantiating classes like Check to run the selected task. These job classes perform the core alignment operations and are implemented in files under …/jobs.py.

The MediaLoader base class in …/media.py provides common functionality for loading different media types from disk efficiently. The Faces class inherits from this and handles iterating through faces, reading embedded alignment data using read_image_meta_batch(), and detecting duplicate faces. It also updates legacy faces lacking data with update_legacy_png_header().

The Frames and Faces classes efficiently load frames and faces, and handle updating legacy data formats. This provides the tools with a clean interface for manipulating facial alignment data from various sources.

Preview

References: tools/preview

The core functionality of the Preview tool is to allow live previewing of face swaps by obtaining sample faces, running predictions, applying converter settings, and dynamically updating the preview display in real-time as the user tweaks settings. This is achieved through several key classes that coordinate processing faces and updating the GUI.

The Samples class in …/preview.py is responsible for obtaining semi-random faces from input media using the …/alignments library. It runs these faces through the predictor model to get initial predictions.

The Patch class in …/preview.py then takes the predicted faces from Samples and runs them through the full conversion pipeline by applying the selected settings from the GUI. It does this processing on a background thread to not block the GUI.

The ConfigTools class in …/control_panels.py handles loading the converter configuration file format and provides a unified interface to read and write settings between the GUI and file.

When the user interacts with widgets to change settings like color adjustment or mask type, callbacks registered with patch_callback trigger the Patch class to re-apply the conversion pipeline to dynamically update the preview faces.

The FacesDisplay class in …/viewer.py is responsible for compiling the original and converted faces into a single image grid for unified display. It extracts faces from frames, applies alignments, and arranges them into rows.

The ImagesCanvas class then displays this grid image within a resizable Tkinter canvas, scaling the image to fit the canvas dimensions on window resizes.

Together, these classes implement the key aspects of live previewing: obtaining sample faces, applying predictions, and tweaking settings to see immediate previews of the results in a user-friendly GUI.

Manual Editor

References: tools/manual

The Faceswap manual editor tool allows users to manually annotate faces in images and videos. This includes tasks like drawing bounding boxes around faces, placing landmarks on facial features, editing face masks, and adjusting extraction regions. The tool provides a graphical interface for these tasks.

The core classes for implementing the manual editor functionality are Manual, Aligner, and DetectedFaces. The Manual class acts as the main GUI window, initializing Tkinter and loading the frames. It manages different GUI elements like the video display canvas. The Aligner class handles running face alignment models like FAN in background threads to generate landmarks and masks on demand. DetectedFaces acts as the central manager of all face detection data, loading data from alignments files and saving edits back.

Key functionality includes interactively editing different annotation types on frames. The FrameViewer canvas displays the background video and annotations. Editors like BoundingBox, Landmarks, ExtractBox, and Mask inherit from the base Editor class and allow adjusting their respective annotations on faces. Mouse handling methods allow dragging boxes, points, or painting masks directly on the canvas.

Coordinate transformations between the displayed view and original frame values are important when faces are scaled or offset. Methods like _scale_to_display() and scale_from_display() in the base Editor handle this. The AlignedFace class also ensures landmarks scale correctly during zooming.

On each edit, the underlying DetectedFaces data is directly updated. This synchronizes changes to the GUI and saves results back to alignments files. The manual editor thus provides a full interactive workflow for annotating faces in images and video.
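The view-to-frame coordinate mapping can be sketched as a pair of inverse helpers. The function names mirror the Editor methods described above, but the exact math (uniform scale plus offset) is illustrative:

```python
def scale_to_display(point, scale, offset):
    """Map an original-frame coordinate into canvas space when the
    frame is zoomed by `scale` and shifted by `offset` (sketch)."""
    return (point[0] * scale + offset[0],
            point[1] * scale + offset[1])

def scale_from_display(point, scale, offset):
    """Inverse mapping: canvas coordinate back to frame coordinate."""
    return ((point[0] - offset[0]) / scale,
            (point[1] - offset[1]) / scale)
```

Keeping the two directions as exact inverses is what lets a drag on the canvas be written straight back into the underlying alignment data without drift.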

Effmpeg

References: tools/effmpeg

The …/effmpeg directory provides an object-oriented Python wrapper for common FFmpeg video operations like extraction and muxing. The Effmpeg class acts as the main interface, taking command line arguments and processing them to run various FFmpeg commands.

The Effmpeg class handles parsing input from the EffmpegArgs class in …/cli.py. It sets up DataItem objects from …/effmpeg.py to store media paths and metadata for inputs, outputs, and references. Methods like extract(), gen_vid(), and mux_audio() define FFmpeg arguments and call __run_ffmpeg() to execute the commands.

__run_ffmpeg() runs FFmpeg via the FFmpeg class from ffmpy. It takes the defined inputs/outputs and handles errors. The DataItem class represents media items, setting attributes like path and name from the file extension using set_type_ext() and set_name().

EffmpegArgs provides the CLI interface, returning supported arguments as dictionaries. It handles argument parsing, validation, and help documentation. Special parsing functions handle types like time formats.

In summary, this wrapper provides object-oriented access to FFmpeg functionality with validation, metadata handling, and error reporting for common video operations through the Effmpeg interface and CLI arguments. The separation of concerns enables mapping use cases to FFmpeg commands.
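The action-to-command mapping can be sketched as a function that builds an FFmpeg argument list. This helper and its signature are hypothetical (the real tool delegates execution to ffmpy rather than building raw command lists like this):

```python
def build_extract_cmd(input_path, output_pattern, fps=None):
    """Hypothetical sketch of mapping a frame-extraction use case to
    an FFmpeg argument list."""
    cmd = ["ffmpeg", "-i", input_path]
    if fps is not None:
        # Apply an fps filter when a target frame rate is requested
        cmd += ["-vf", f"fps={fps}"]
    cmd.append(output_pattern)
    return cmd
```

Building the argument list separately from executing it keeps the mapping testable and makes error reporting on bad inputs straightforward.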

Sort

References: tools/sort

The …/sort directory contains tools for sorting faces based on different attributes like blurriness, pose, color histograms, etc. The main classes used for sorting are SortMethod subclasses defined in …/sort_methods.py and …/sort_methods_aligned.py. These subclasses implement specific scoring and sorting logic for attributes like blur, color, pose angles, size, etc.

The Sort class in …/sort.py handles initializing and running the overall sorting process. It handles parsing arguments from the SortArgs class in …/cli.py, which defines the command line interface. Sort uses the appropriate SortMethod subclass based on the arguments to score, sort and bin faces. It supports batch processing of folders and logging changes.

The main steps are:

  1. A SortMethod subclass like SortBlur is instantiated
  2. score_image() scores each face's attribute as it's loaded by InfoLoader
  3. Faces are sorted in-place by score using methods like sort()
  4. Sorted faces can be grouped into bins via binning()

Sort processes images, copying or moving them to new locations. It handles argument parsing and delegates attribute extraction to reusable SortMethod subclasses. These provide a common interface for implementing different sorting criteria.
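The blur-scoring step can be illustrated with the variance of a discrete Laplacian, a common sharpness proxy (lower variance suggests a blurrier face). This is a simplified numpy-only sketch, not the actual SortBlur implementation:

```python
import numpy as np

def blur_score(gray):
    """Variance of a 4-neighbour discrete Laplacian over the image
    interior; higher means sharper (sketch only)."""
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def sort_by_blur(faces):
    """Sort (name, image) pairs sharpest-first, mirroring the
    score-then-sort steps described above."""
    return sorted(faces, key=lambda item: blur_score(item[1]), reverse=True)
```

Binning would then group the sorted faces into score ranges rather than leaving them as one ordered list.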

Mask

References: tools/mask

Masks can be generated from existing alignment data or directly from input faces using the Mask tool. The Mask class in …/mask.py orchestrates the overall masking process. It handles argument parsing from MaskArgs in …/cli.py to configure inputs, outputs, and the masker model.

Faces and frames are loaded lazily from disk using FacesLoader and ImagesLoader classes defined in …/mask.py. This optimizes memory usage for large datasets. The selected Extractor plugin, such as Masker in …/mask, is initialized to generate the actual masks.

The Extractor is fed inputs in a background thread using MultiThread from …/mask.py to avoid blocking. It processes each input separately to prevent memory leaks. Processed masks are used to update the alignments file managed by Alignments from …/alignments.py or face image headers.

Mask previews can be saved by ImagesSaver in …/mask.py as full frames, individual faces, or composites. Legacy faces without alignments are also supported. The Mask tool handles all processing, input/output, and runs the selected masker model.

Model

References: tools/model

The …/model directory contains functionality for loading, inspecting, and restoring models. The key aspects are the NaNScan and Model classes in …/model.py.

The NaNScan class loads a model using load_model() and then uses get_weights() to extract the weights from each layer and submodel. It checks for NaN or infinite values using these weights, and returns any errors in a nested dictionary structure. This allows inspecting a model for invalid values.
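The invalid-value scan can be sketched with numpy. The flat name-to-counts dictionary below is an illustrative simplification (the real NaNScan nests errors by submodel and layer):

```python
import numpy as np

def scan_weights(named_weights):
    """Sketch of a NaN/Inf scan: map each layer name to counts of
    invalid values, reporting only layers with problems."""
    errors = {}
    for name, weights in named_weights.items():
        nans = int(np.isnan(weights).sum())
        infs = int(np.isinf(weights).sum())
        if nans or infs:
            errors[name] = {"nans": nans, "infs": infs}
    return errors
```

An empty result means the model's weights are clean; any entry pinpoints which layer went bad, which is the useful part when debugging a diverged training run.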

The Model class initializes jobs based on the command line arguments passed to …/cli.py. It standardizes the TensorFlow environment setup before initializing the appropriate job class. The main job classes are Inference for creating an inference-only model copy, and Restore for restoring from a backup. Restore performs the actual restore operation.

User Interface

References: lib/gui, lib/cli

The Faceswap GUI and command line interface are implemented through cooperation between several key classes and modules. The main CommandNotebook class in …/command.py handles displaying the command notebook tab, which contains the console output and allows running commands. It subclasses ttk.Notebook and manages adding and removing pages for running processes.

Individual command tabs are built using the CommandTab class. CommandTab builds each tab by pulling options from the config and displaying them using a ControlPanel.

The ControlPanel class defined in …/control_helper.py is used to display GUI options for commands.

Argument parsers for each command are defined by extending the FaceSwapArgs base class in …/args.py. This class parses the arguments and if validation passes, the arguments are passed to the ScriptExecutor to run the associated script module.

The ControlPanel class provides a consistent interface for building option controls. Each ControlPanelOption encapsulates properties of an individual control like its variable and formatting. ControlPanel stores the built GUI widgets and exposes methods to retrieve them.

GUI Implementation

References: lib/gui

The …/__init__.py file contains the main GUI functionality for Faceswap. It imports several important classes used to build the GUI, including CommandNotebook for the command notebook, DisplayNotebook for displaying images/video, CliOptions for command line options, MainMenuBar and TaskBar for menu and taskbar functionality, LastSession for saving/loading session data, and various utility functions like get_config, get_images, initialize_config, initialize_images, and preview_trigger. It also imports the ProcessWrapper class which is used to run processes in the background.

The CommandNotebook class (…/command.py) is used to display the command notebook tab, which contains the console output and allows running commands. It subclasses ttk.Notebook and manages adding/removing pages for running processes.

The DisplayNotebook class (…/display.py) subclasses ttk.Notebook and is used to display the images/video notebook tab. It manages adding/removing pages for preview images or video output.

The CliOptions class (…/options.py) parses command line arguments and configurations. It provides methods for getting options values.

The MainMenuBar class (…/menu.py) creates the main menu bar for the GUI.

The TaskBar class (…/menu.py) creates the task bar docked at the bottom and allows starting/stopping tasks.

The LastSession class (…/project.py) handles saving GUI state on close and loading the last session.

The ProcessWrapper class (…/wrapper.py) is used to run long-running processes like training or conversion in the background.

The utility functions like get_images() and preview_trigger() manage common tasks like loading preview images and triggering previews on changes.

The main functionality is built from these key classes like CommandNotebook, DisplayNotebook, and ProcessWrapper along with utilities for options, menus, images and session handling.

The Config class (…/_config.py) centralizes GUI configuration. Its set_globals() method defines all configuration sections, items and defaults. This provides a single source of truth for GUI settings.

CLI Implementation

References: lib/cli

The …/cli directory contains code related to implementing the command line interface and argument parsing for Faceswap. The main components are the ScriptExecutor class in …/launcher.py, which is responsible for loading and executing the appropriate script module based on the command passed to Faceswap, and the classes in …/args.py that define parsers for each command's arguments.

The ScriptExecutor class handles the core business logic of the CLI. Its _import_script() method imports the relevant script module from scripts based on the command. The execute_script() method then performs important setup tasks like setting environment variables and configuring GPUs if excluded, runs the imported script, and handles exceptions and cleanup.

Argument parsers for each command are defined by classes that extend FaceSwapArgs in …/args.py. For example, ExtractArgs defines arguments for the extract command by overriding get_argument_list() to return a list of argument dictionaries. These parsers are constructed when commands run.

The file also contains utilities like the SmartFormatter class that customizes help formatting. Argument validation and suppression utilities ensure consistent argument processing.

The FullHelpArgumentParser extends argparse and is used to always show full help on errors, providing helpful usage information to the user.

This implementation centralizes argument definition and parsing, loads scripts dynamically, and abstracts away execution details. It provides a unified and extensible interface for the Faceswap CLI.
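The parser pattern can be sketched with argparse. The class names echo the description (a base class wiring up dictionaries returned by get_argument_list()), but the `-i/--input-dir` option and all details are illustrative:

```python
import argparse

class ArgsSketch:
    """Sketch of the base-class pattern: child classes return argument
    dictionaries and the base wires them into argparse."""

    def __init__(self):
        self.parser = argparse.ArgumentParser(prog="faceswap")
        for arg in self.get_argument_list():
            opts = arg.pop("opts")          # flag strings, e.g. ("-i", "--input-dir")
            self.parser.add_argument(*opts, **arg)

    def get_argument_list(self):
        """Overridden by child classes to define their arguments."""
        return []

class ExtractArgsSketch(ArgsSketch):
    """Toy stand-in for a command-specific parser class."""

    def get_argument_list(self):
        return [dict(opts=("-i", "--input-dir"),
                     dest="input_dir",
                     default=".",
                     help="Input directory or video")]
```

Because each command only supplies data (the argument dictionaries), new commands can be added without touching the parsing machinery.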

Core Library

References: lib

The core functionality of the Faceswap library code includes common logic shared across different parts of the application. This includes key components like:

  • The …/serializer.py file contains classes for serializing and deserializing data to different formats like JSON, YAML, pickle, and NumPy arrays. The Serializer base class defines a common interface, while subclasses like _JSONSerializer and _PickleSerializer implement specific serialization formats.

  • The …/config.py file handles loading and accessing configuration values. The Config class manages default values, types, and validation for options. Child classes can extend this to add configuration sections for tools and plugins.

  • Logging and error handling is implemented in …/logger.py. The FaceswapLogger class adds custom logging levels and formats messages. Log files, streams, and crash reports are set up.

  • Common utilities are provided in …/utils.py. This includes functions for downloading required models, determining which GPU backend to use, debugging with timers, and setting verbosity levels.

  • The …/git.py file contains the Git class which provides an object-oriented wrapper for common Git operations like checking the status, branch, and remotes.

  • Multithreading primitives are defined in …/multithreading.py. Classes like FSThread and MultiThread make it easy to run functions in parallel threads while catching errors. The BackgroundGenerator runs generators asynchronously.

Some of the important classes and functions are discussed in more detail below.

The Serializer base class in …/serializer.py defines the common interface for serialization with methods like _marshal() and _unmarshal(). Subclasses like _JSONSerializer and _PickleSerializer override these to implement specific serialization formats. _JSONSerializer uses json.dumps() and json.loads() while _PickleSerializer uses pickle.dumps() and pickle.loads(). _NPYSerializer handles NumPy arrays by saving to BytesIO first before writing, and loading directly from BytesIO.

The Config class in …/config.py centralizes configuration. Its constructor initializes options from plugins and loads/validates values. The config_dict property accesses values, casting them to correct types. Child classes can extend this class to add their own configuration sections.

The FaceswapLogger class in …/logger.py adds the VERBOSE and TRACE levels via logging.addLevelName(). It defines methods like verbose() and trace() to log at these levels. The _file_handler(), _stream_handler() and _crash_handler() methods set up file, stream and crash log handlers. The RollingBuffer buffers debug lines for crash reporting.
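Adding a custom level via logging.addLevelName() plus a convenience method looks roughly like this. The class name and the level value 15 are illustrative, not necessarily what Faceswap uses:

```python
import logging

VERBOSE = 15  # illustrative: a level between DEBUG (10) and INFO (20)
logging.addLevelName(VERBOSE, "VERBOSE")

class FaceswapStyleLogger(logging.Logger):
    """Sketch of a logger subclass exposing a verbose() method for
    the custom level."""

    def verbose(self, msg, *args, **kwargs):
        if self.isEnabledFor(VERBOSE):
            self._log(VERBOSE, msg, args, **kwargs)

# Make subsequent getLogger() calls return the subclass
logging.setLoggerClass(FaceswapStyleLogger)
```

Once registered, the custom level participates in normal level filtering, so handlers and formatters need no special handling for it.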

Alignments

References: lib/align

The …/align directory contains code for loading and manipulating facial alignment data. The core data structures used are AlignedFace, Alignments, and DetectedFace.

AlignedFace represents an aligned face, storing attributes like the image, landmarks, and pose estimate. It provides methods like transform_image() for applying affine transformations.

Alignments acts as the main interface for alignment data. It loads and saves data from serialized ".fsa" files using the _IO class. Properties like data and methods like get_faces_in_frame() provide access to the alignment information.

DetectedFace stores attributes of a detected but unaligned face like the bounding box and optional mask. It can load an aligned face using AlignedFace.

The _IO class handles loading and saving alignments to/from the ".fsa" file. It checks for legacy formats on initialization. The Thumbnails class manages low-res thumbnail faces stored in the file.

Several updater classes check if the alignments file requires a data format update. For example, _FileStructure checks landmark format changes. This ensures backwards compatibility when loading older files.

AlignedFace is central to the alignment process. It takes landmarks and optionally a face image. The _FaceCache class is used to efficiently cache properties like landmarks and pose estimates between method calls to avoid recomputing. Methods like transform_image() apply affine transformations based on the cached properties.
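The compute-once caching pattern can be sketched with functools.cached_property. The class and attribute names are illustrative stand-ins for the real _FaceCache mechanism:

```python
from functools import cached_property

class AlignedFaceSketch:
    """Illustrative only: an expensive derived property (here, the
    alignment matrix) is computed once and reused between calls."""

    def __init__(self, landmarks):
        self.landmarks = landmarks
        self.compute_count = 0  # track how often the matrix is built

    @cached_property
    def matrix(self):
        self.compute_count += 1
        # Stand-in for the real matrix computation from landmarks
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

Repeated accesses such as `face.matrix` hit the cache, so methods like transform_image() can reference derived properties freely without recomputing them.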

Models

References: lib/model

Neural network models, sessions, backups and snapshots are handled through the KSession class defined in …/session.py. KSession acts as a wrapper for Keras models, initializing with details like the model name and path. It handles loading models with load_model() within the proper GPU/CPU context using _set_session(). Methods like predict() and define_model() ensure the model is used consistently within this context when making predictions or defining a new model.

The Backup class in …/backup_restore.py handles backing up models to .bk files on each save using backup_model(). It takes snapshots of the full model folder periodically with snapshot_models() by copying it to a new folder named with the iteration count. Backup also restores models from their backups with restore(). It first moves existing model files to an "archived" folder, then restores each file matching a .bk backup by renaming it. Logs are also restored up to the last backup. Other methods help with tasks like checking filename validity and retrieving necessary metadata for restoring.

The KSession class provides the main interface for models. Upon initialization, it stores model details and sets the computation context with _set_session(). This method handles GPU/CPU configuration using TensorFlow. The load_model() method loads the model within this context. For TensorFlow, it also calls make_predict_function() to make prediction thread-safe. predict() provides predictions by calling the model's predict() method within the context. It supports both single examples and batches. define_model() defines a new model within the context by calling a function that returns the model.

The Backup class centralizes backup/restore/snapshot logic through careful file management and validation of filenames, paths and metadata. snapshot_models() copies the entire model folder to a new "snapshot" folder named with the iteration count, ensuring only relevant files are copied. restore() first moves existing model files to an "archived" folder, then restores each file matching a .bk backup by renaming it. It also restores logs up to the last backup. Methods like _check_valid() and _get_session_names() provide necessary validation and metadata retrieval.
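The snapshot step can be sketched as a folder copy tagged with the iteration count. The function name and the `_snapshot_` naming scheme here are illustrative, not the exact convention Backup uses:

```python
import shutil
from pathlib import Path

def snapshot_models(model_dir, iteration):
    """Sketch: copy the model folder to a sibling folder whose name
    embeds the iteration count, and return the new path."""
    src = Path(model_dir)
    dst = src.parent / f"{src.name}_snapshot_{iteration}"
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Restoring is then largely the reverse: move the live files aside and copy a chosen snapshot (or the `.bk` backups) back into place.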

Training

References: lib/training

Training handles key aspects of model training in Faceswap like data augmentation, caching faces for efficient access, generating training batches, and previewing progress. The core classes that power these functions are defined in …/training.

The ImageAugmentation class in …/augmentation.py is responsible for performing various augmentations on batches of training images like color adjustments in the LAB color space, random transformations using OpenCV warping, random horizontal flipping, and landmark-based warping. It initializes constants for the augmentations and caches them for efficiency.

The _Cache class in …/cache.py provides a thread-safe mechanism for caching face detection and alignment data during training. It stores cached data for each side in private dictionaries, handling issues like different extraction versions. Methods like get_items() allow read-only access to the cached data.

The Feeder class coordinates the entire data generation process. It loads TrainingDataGenerator and PreviewDataGenerator instances from …/generator.py to handle fetching training batches and generating previews. The TrainingDataGenerator class performs key steps like color augmentation, random transformations, flipping, cropping, and random warping on batches during training.

The PreviewBuffer in …/__init__.py buffers training stats in a thread-safe manner. Classes like PreviewTk and PreviewCV then display the buffered previews to visualize training progress.

GPU Stats

References: lib/gpu_stats

The …/gpu_stats directory contains Python modules for collecting GPU statistics and information in a backend-agnostic way. It defines a common interface for GPU stats classes via the _GPUStats base class, which is subclassed differently for each backend.

The _GPUStats class provides initialization logic, properties to access stats like device_count and sys_info, and empty method stubs. Subclasses like NvidiaStats fill these out by initializing the relevant GPU API (PyNVML) and calling it to retrieve stats. CPUStats provides dummy CPU stats when no GPU is detected.

…/__init__.py dynamically imports the right backend subclass based on platform. This allows getting stats uniformly across backends without conditional logic. For example, it imports NvidiaAppleStats on macOS with Nvidia GPUs.

The main subclasses are:

  • NvidiaStats initializes PyNVML and uses it to get the device count, names, driver version, and VRAM info by calling APIs like _get_device_count().

  • AppleSiliconStats initializes Metal and queries TensorFlow to get stats for the Apple SoC, handling unified memory. Methods like _get_vram() split RAM evenly among SoCs.

  • The ROCm implementation in …/rocm.py searches the sysfs filesystem to find AMD GPU paths, then queries files to get stats like VRAM usage.

  • DirectML handles DirectX querying via the Device class defined in …/directml.py. Device represents a GPU.

GPUInfo and its subclasses represent the collected stats, with get_card_most_free() finding the GPU with the most available memory. set_exclude_devices() allows excluding specific devices, and logging is configurable.

CLI

References: lib/cli

The …/cli directory contains code related to command line argument parsing and script loading for Faceswap. The key components are the ScriptExecutor class in …/launcher.py and classes that extend FaceSwapArgs in …/args.py.

ScriptExecutor is responsible for loading and executing the appropriate script module based on the command passed to Faceswap. Its __init__ method stores the command, while _import_script() imports the relevant script module from scripts folder. execute_script() performs setup like configuring the GPUs and environment, runs the imported script, and handles cleanup.

Argument parsers for each command are defined by classes like ExtractArgs that extend FaceSwapArgs in args.py. FaceSwapArgs contains methods like get_argument_list() that child classes override to define their arguments. Each child class returns a list of dictionaries defining command-specific arguments.

When a command runs, an instance of the corresponding parser class like ExtractArgs is created. This parses sys.argv and if validation passes, calls ScriptExecutor to run the script. ScriptExecutor abstracts away loading specific scripts so new ones can be added easily. It has thorough error handling and checks dependencies.
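A minimal sketch of this dynamic-loading pattern is below. It is illustrative only: the real ScriptExecutor also configures GPUs, checks dependencies, and handles errors and cleanup.

```python
import importlib


class ScriptExecutor:
    """Sketch of command-to-module dispatch; error handling is pared down."""

    def __init__(self, command: str) -> None:
        self._command = command.lower()

    def _import_script(self):
        # e.g. the "extract" command maps to scripts/extract.py, class Extract
        module = importlib.import_module(f"scripts.{self._command}")
        return getattr(module, self._command.title())

    def execute_script(self, arguments) -> None:
        script_class = self._import_script()
        process = script_class(arguments)
        process.process()
```

Adding a new command then only requires dropping a matching module into the scripts package; no dispatch table needs editing.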

GUI

References: lib/gui

The core GUI functionality is implemented through several key classes defined in …/gui. The MainMenuBar class handles building the main menu bar and adding standard menus like File, Edit, View. It leverages functions in …/menu.py to populate menu items and assign commands.

The TaskBar class dynamically builds project and task buttons along the bottom of the GUI by calling functions like _loader_and_kwargs() and _set_help() to retrieve the correct command callbacks and help text for each button type. This allows new button types to be added in the future with minimal code changes.

The CommandNotebook class subclasses ttk.Notebook and manages adding/removing pages for running processes. It is used to display the command notebook tab, which contains the console output and allows running commands.

The DisplayNotebook class also subclasses ttk.Notebook and handles displaying the images/video notebook tab. It manages adding/removing pages for preview images or video output dynamically as tasks change.

The Config class centralizes important configuration objects and state needed across the GUI, including the Tkinter root object and GUI styling. Its constructor initializes these objects, and getter properties provide centralized access.

The GlobalVariables class defines global Tkinter variables like those for display text and running tasks via properties. These can then be accessed and modified from any part of the GUI.

The ProcessWrapper class bridges the GUI and command line interfaces by building CLI arguments, launching processes in the background, and updating the GUI based on process state. It uses functions like get_config() and get_images() to retrieve shared data.

Overall, these classes work cooperatively to provide the core GUI functionality through reusable abstractions. The menus, buttons, notebooks, configuration, variables, and process handling functionality are all implemented through these key classes and shared utilities.
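The shared-state pattern behind Config and get_config() can be sketched as a module-level singleton. The names and fields below are hypothetical simplifications; Faceswap's actual Config also holds the Tkinter root object and styling.

```python
_CONFIG = None  # module-level singleton instance


class Config:
    """Sketch of a centralized GUI configuration object."""

    def __init__(self, root, style=None) -> None:
        self._root = root              # in the real GUI: the Tkinter root
        self._style = style or {}      # in the real GUI: styling objects

    @property
    def root(self):
        return self._root


def initialize_config(root, style=None):
    """Create the singleton once, at GUI startup."""
    global _CONFIG
    if _CONFIG is None:
        _CONFIG = Config(root, style)
    return _CONFIG


def get_config():
    """Called from anywhere in the GUI to reach shared state."""
    return _CONFIG
```

Any widget can then call get_config() instead of threading the root object through every constructor.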

Automated Installation

References: .install

The .install directory contains scripts that automate installing dependencies and setting up Faceswap on Linux and macOS systems. This streamlines the process and ensures Faceswap and its dependencies are correctly configured.

The …/linux and …/macos directories each contain a Bash script that handles the installation process for the respective operating system. On Linux, …/faceswap_setup_x64.sh downloads and sets up Miniconda to manage environments. It then creates a Conda environment, activates it, installs Git, clones Faceswap from GitHub, and runs the setup script. Several helper functions modularize this process.

On macOS, …/faceswap_setup_macos.sh checks for dependencies like cURL and Xcode. If needed, it installs XQuartz. It then finds or installs Miniconda, creates an environment, gets user options via prompts, reviews the options, and installs Faceswap. Functions like check_file_exists(), download_file(), and ask() modularize the process.

These scripts automate common tasks, prompt for options, validate inputs, download necessary files, remove existing folders, and activate the correct environment/packages. This provides guided, automated installation of Faceswap and all dependencies for Linux and macOS.

Linux Installation

References: .install/linux

The …/linux directory contains scripts to automate the installation of Faceswap on Linux systems. The main script is …/faceswap_setup_x64.sh, which handles the entire installation process from start to finish.

The script begins by prompting the user to select installation options via the user_input() function. This includes directories for Conda, Faceswap code, and whether to create a desktop launcher. The faceswap_opts() and review() functions then handle validating and reviewing the selected options.

It proceeds to download and set up Miniconda for managing Python environments and dependencies. The create_env() function is used to create a Conda environment with the specified name and version of Python. This environment is activated using activate_env().

The clone_faceswap() function clones the latest Faceswap code from the GitHub repository into the selected directory. Any remaining requirements are installed by running the Faceswap setup script with setup_faceswap().

For hardware support, the user selects either CPU or GPU during input collection. The setup script configures Faceswap accordingly.

Finally, an optional desktop launcher can be created which runs activate_env() and launches the Faceswap GUI using create_gui_launcher().

By abstracting the complex installation steps, handling inputs and options, downloading necessary files, and activating the correct environment, this script provides an automated, one-command process to install Faceswap and all its dependencies on Linux.

macOS Installation

References: .install/macos

The …/macos directory contains scripts and functions to automate the installation of Faceswap on macOS systems. The main script is …/faceswap_setup_macos.sh, which handles all steps of the installation process.

This script first checks for required dependencies like cURL and Xcode using functions like check_file_exists() and check_folder_exists(). If any dependencies are missing, the user is prompted to install them. The script then checks for an existing Conda installation, and if none is found, it downloads and installs Miniconda using download_file().

The find_conda_install() function searches common locations to discover existing Conda installations. If none are found, Miniconda is installed and set_conda_dir_from_bin() is used to parse the path to the Miniconda binary and set the CONDA_DIR environment variable.

A Conda environment is created with conda create and activated. The Faceswap repository is then cloned, and the setup script is run to complete the installation. User input is gathered throughout using prompts created by ask() and ask_yesno(). At the end, review() displays the selected options to the user for confirmation before proceeding.

System commands such as mkdir, git clone, and conda create are executed by the shell script to perform the installation steps.

Documentation

References: docs

The docs directory contains documentation for the Faceswap project. It uses Sphinx to generate API documentation and user guides from code docstrings and markdown files. Sphinx parses docstrings and automatically generates summaries of classes and functions for the API reference. It also processes markdown files to build user guides on key topics.

The …/conf.py file contains the Sphinx configuration. It sets the FACESWAP_BACKEND environment variable and inserts the project path so modules can be imported. The MOCK_MODULES list mocks dependencies that are not needed for documentation. Project metadata like the name, author, and copyright are specified. Extensions are configured for features like Napoleon docstring parsing and autosummary generation. Templates, exclude patterns, and the HTML theme are also set.

Sphinx parses docstrings using the Napoleon extension. It generates automatic summaries for classes and functions referenced in code with the autosummary extension. Markdown files define user guides that are processed by Sphinx. When documentation is built with make (or make.bat on Windows) in the docs directory, Sphinx processes all source files and generates the final API reference and guides in HTML format.

The conf.py file handles all necessary configuration for Sphinx. It focuses the documentation on explaining code rather than implementation details by mocking unneeded dependencies and automatically summarizing classes and functions. Markdown guides provide documentation on key topics at a higher level. Together, these tools and files allow Sphinx to generate comprehensive yet concise API and user documentation from Faceswap's codebase and supplemental guides.
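The dependency-mocking technique follows a standard Sphinx conf.py pattern, sketched below. The module names listed are illustrative; Faceswap's actual MOCK_MODULES list differs.

```python
import sys
from unittest.mock import MagicMock

# Heavy runtime dependencies the docs build does not need to really import
MOCK_MODULES = ["cv2", "pynvml", "tensorflow_probability"]

# Register a mock under each name so "import cv2" etc. succeeds during
# the Sphinx build even when the real package is not installed.
sys.modules.update((module_name, MagicMock()) for module_name in MOCK_MODULES)

import cv2  # resolves to the mock, not the real OpenCV
```

Because Python checks sys.modules before searching for a package, any docstring module that imports a mocked dependency can still be loaded and documented.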

Testing

References: tests

The tests directory contains unit and integration tests for validating components in Faceswap. At the core, it uses the Pytest framework for writing and executing tests against the codebase.

Some key aspects of the testing approach include:

  • Unit testing individual classes and functions to validate specific pieces of logic in isolation. This includes testing classes like MediaLoader which handles loading media files.

  • Integration testing how components interact by testing end-to-end functionality. An example is …/simple_tests.py which runs a full extraction, training, and conversion pipeline on sample data.

  • Leveraging Pytest fixtures and parametrization to run the same tests with different arguments, inputs, and configurations. This helps cover more code paths.

  • Isolating components using mocking to remove external dependencies during unit testing. For example mocking out GPU calls for testing _GPUStats.

  • Validating expected outputs along with error conditions and edge cases. Ensuring failures are detected and handled appropriately.

Some important directories for testing include:

  • …/lib contains tests for core library modules like models, alignments, GUI functionality.

  • …/tools tests utilities and standalone tools used in Faceswap.


Unit Testing

References: tests/lib, tests/tools

The tests directory contains extensive unit tests for individual software components such as functions, classes, and modules. Unit testing is important for validating core functionality and catching bugs early in development.

A primary goal of the unit tests is to achieve complete code coverage by testing each part of the system in isolation. Tests are separated by concern into subdirectories like …/lib for core library modules and …/tools for utilities. Comprehensive testing helps ensure all new code meets expectations and existing code continues functioning as intended with changes.


Overall, the unit tests provide thorough validation of individual components across the system. Well-designed tests help prevent regressions and ensure high code quality as Faceswap continues to evolve.

Integration Testing

References: tests/simple_tests.py, tests/startup_test.py

Faceswap uses integration tests to validate interactions between components and end-to-end functionality. Integration tests are located in tests and leverage the Pytest framework.

Key integration tests include those in …/simple_tests.py and …/startup_test.py. The tests in simple_tests.py perform end-to-end extraction, sorting, training and conversion tests on sample data to check for crashes or hangs. This validates the overall pipeline. startup_test.py contains tests that run on startup to validate the Keras backend and version.

The run_test() function in simple_tests.py is used to execute individual steps in the pipeline as subprocesses. It takes a name and command list, runs the command and prints pass/fail results. Any errors are caught and failures counted. Sample data is downloaded using the download_file() function if needed. Configuration is updated using sed via the set_train_config() function.

startup_test.py contains two key integration tests - test_backend() and test_keras(). test_backend() creates a Keras variable and checks that the module it comes from matches the backend returned by get_backend(). test_keras() validates the Keras version matches expectations for the given backend. Both tests are parameterized to run for each backend using @pytest.mark.parametrize.
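A stripped-down sketch of this subprocess-driven testing style is below. It is illustrative only; the real run_test() in simple_tests.py differs in detail.

```python
import subprocess
import sys


def run_test(name: str, cmd: list) -> bool:
    """Run one pipeline step as a subprocess and report pass/fail."""
    print(f"Running: {name}", flush=True)
    result = subprocess.run(cmd, capture_output=True, text=True)
    passed = result.returncode == 0
    print("PASS" if passed else f"FAIL\n{result.stderr}")
    return passed


# Example: a trivial "step" that just prints and exits cleanly
run_test("sanity check", [sys.executable, "-c", "print('ok')"])
```

Running each stage as a subprocess means a crash or non-zero exit in one stage is caught and counted without bringing down the test harness itself.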

Test Framework

References: tests

The Pytest framework is used for writing and executing unit tests in Faceswap. Pytest makes it easy to write tests and run them across different Python environments and system configurations.

The main features used from Pytest include:

  • pytest.mark.parametrize - This decorator allows running the same test function multiple times with different arguments. This avoids duplicating test logic and allows testing classes and functions under different configurations.

  • pytest.raises - Used to assert that a code block or function call raises a specific exception. This validates error handling behavior.

Key implementation details:

  • Test functions are defined with descriptive names like test_backend() which makes it clear what each test is validating.

  • Functions contain the core logic for validating functionality by isolating classes and mocking dependencies.

  • Mocking is used extensively to isolate classes and functions from external dependencies.

  • Tests are data-driven where possible to cover different configurations.

  • Assertions clearly validate the expected outputs match reality to catch any regressions.
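As a generic illustration of these two Pytest features (not taken from the Faceswap test suite):

```python
import pytest


@pytest.mark.parametrize("value, expected", [(2, 4), (3, 9), (-4, 16)])
def test_square(value, expected):
    # The same assertion runs once per (value, expected) pair
    assert value ** 2 == expected


def test_rejects_zero_division():
    # Validates that the expected exception is actually raised
    with pytest.raises(ZeroDivisionError):
        1 / 0
```

Parametrization keeps one copy of the test logic while exercising several inputs, and pytest.raises makes error-handling behavior an explicit, checked expectation.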

Core Functionality

References: tests/lib

The core library modules tested include models, alignments, and the GUI. Testing these modules is critical to ensure the foundational components of Faceswap function as intended.

The …/model directory contains extensive tests for the deep learning components powering Faceswap's models. Key elements tested include neural network layers like QuickGELU and ReflectionPadding2D, losses such as GMSDLoss and LInfNorm, and optimizers like AdaBelief. Tests validate layers integrate properly into Keras, losses guide models as expected, and optimizers successfully optimize models. The layer_test function builds small models to test layers, while losses are tested by training models on random data and ensuring targets are reached.

Tests in …/stats focus on the GUI's ability to display training performance from TensorBoard logs. The central _Cache stores parsed events from the _EventParser. The _CacheData class handles updates, while the TensorBoardLogs class manages interactions. Tests create mock data and inputs to validate classes cache, retrieve, and update metrics as expected both from stored logs and live updates. This ensures accurate real-time monitoring of model training.

Models

References: tests/lib/model

This section covers testing of key components used in Faceswap's deep learning models, including layers, losses, blocks, and optimizers. Thoroughly testing these elements is crucial to validate the core model functionality and prevent regressions.

The tests for layers in …/layers_test.py utilize the layer_test() function to evaluate custom layers like GlobalMinPooling2D and QuickGELU. This function generates sample inputs, builds models using the layer, makes predictions, and ensures outputs and serialization are as expected.

Loss functions in …/losses_test.py are tested using test_loss_output() to verify shapes and data types. The LossWrapper combines losses and is tested with test_loss_wrapper(). Losses include GeneralizedLoss, GMSDLoss and standard Keras functions.

Neural network blocks from NNBlocks like Conv2DBlock and UpscaleBlock are rigorously tested in …/nn_blocks_test.py with the block_test() function. This generates inputs, runs blocks in Keras models, and checks outputs match actual values. Block configurations are controlled by parameters.

Initializer classes ICNR and ConvolutionAware are tested in …/initializers_test.py to ensure weights are set correctly. The _runner() function checks statistics match expected values.

The AdaBelief optimizer is tested for training performance in …/optimizers_test.py using _test_optimizer(). Serialization and constraints are also validated.
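The shape-and-dtype checking idea behind test_loss_output() can be sketched with NumPy alone. The loss below is a simple stand-in, not one of Faceswap's actual loss functions, and the check_loss_output() helper is hypothetical.

```python
import numpy as np


def mean_abs_error(y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """Stand-in loss: per-sample mean absolute error over H, W, C axes."""
    return np.mean(np.abs(y_true - y_pred), axis=(1, 2, 3))


def check_loss_output(loss_fn, batch=4, size=16, channels=3):
    """Verify the loss returns one float32 value per sample."""
    rng = np.random.default_rng(seed=0)
    y_true = rng.random((batch, size, size, channels), dtype=np.float32)
    y_pred = rng.random((batch, size, size, channels), dtype=np.float32)
    output = loss_fn(y_true, y_pred)
    assert output.shape == (batch,)   # one loss value per sample
    assert output.dtype == np.float32
    return output
```

Checking shapes and dtypes on random inputs catches broadcasting and reduction-axis mistakes before a loss is ever wired into a training run.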

Alignments

References: faceswap

The …/layers_test.py file contains tests for custom Keras layers defined in Faceswap like QuickGELU and ReflectionPadding2D. The layer_test() function validates layer configuration, call behavior, and serialization by building small models with the layers. This ensures layers integrate properly into Keras and train as expected.

The …/losses_test.py file tests loss functions. It contains tests like test_loss_wrapper() which validates the LossWrapper class for combining multiple losses works correctly during training. Loss tests are important as losses guide the model toward the desired output.

The _Cache class provides a thread-safe mechanism for caching face data using private dictionaries and locks. It handles issues like extraction version mixing. The tests initialize the _Cache class with different data and validate data can be correctly set, retrieved, and updated from the cache as expected.

The MediaLoader class is responsible for loading media like images and videos from disk. Its main methods include stream(), which iterates over the loaded media, and load_image(), which loads individual images or frames. The tests validate that it handles initialization correctly based on the given path, and outputs data as expected from its loading methods.

The tests provide thorough validation of these important classes by initializing them with different input data, making changes, and asserting the expected output is produced. This helps ensure the core functionality works reliably across different use cases.
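The locking pattern such a cache relies on can be sketched as follows. The names here are hypothetical; Faceswap's _Cache additionally tracks extraction versions and per-face metadata.

```python
import threading


class ThreadSafeCache:
    """Minimal thread-safe cache guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict = {}

    def cache_item(self, key, value) -> None:
        with self._lock:
            self._data[key] = value

    def get_item(self, key):
        with self._lock:
            return self._data.get(key)
```

Holding the lock around every read and write keeps the dictionary consistent when training worker threads populate and consume the cache concurrently.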

GUI

References: tests/lib/gui

The TensorBoardLogs class ties together the key classes for displaying training metrics from TensorBoard event logs in the Faceswap GUI. It manages the _LogFiles class, which handles locating and tracking TensorBoard log files in a given directory. _LogFiles provides methods to retrieve specific log files.

TensorBoardLogs also manages the central _Cache class. _Cache stores all parsed event data retrieved from the _EventParser. It caches this data and provides methods like get_loss() and get_timestamps() that are called by the GUI to display metrics over time.

Under the hood, _Cache relies on _CacheData objects to store the actual cached values for each unique training session. _CacheData handles updating its data whenever new events are processed.

All event parsing and metric extraction is handled by the _EventParser class. It handles live updating of metrics by continuously checking for new log files from ongoing training runs. _EventParser uses its _process_event() method to iterate over the raw events in each file and extract values like loss, timestamps, and other metadata. These extracted values are then cached by _Cache for retrieval by the GUI.

The unit tests in …/event_reader_test.py validate that this entire process works correctly. This includes tests for each individual class - _LogFiles, _CacheData, _Cache - as well as end-to-end tests of the full TensorBoardLogs workflow. Mock event data and training logs are used to simulate real inputs and validate that the expected metrics can be retrieved from the cached data. Tests also confirm live updating functionality when new events are available.

Tools

References: tests/tools

The key utilities tested under Tools are those related to media loading, alignments extraction, and the preview viewer.

The …/alignments directory contains tests for classes that comprise the core media loading and alignments extraction pipeline. The MediaLoader class handles loading images and videos from disk, making frames accessible via its stream() and load_image() methods. The Frames class extracts frames from loaded media using process_frames(). Face alignment data is loaded using the Faces class and its read_image_meta_batch() method. Finally, the ExtractedFaces class extracts face thumbnails for a given frame using the alignments data and its get_faces() method. These classes are crucial for preprocessing media before training or conversion.

Tests in …/preview validate the classes used to build the live face preview viewer. The _Faces dataclass stores metadata for each face, including the filename, alignment matrix, and cropped image. The FacesDisplay class is responsible for compositing the faces into a single preview image. It handles setting properties, cropping faces from frames, drawing borders and labels, and scaling faces via methods like _build_faces_image() and _get_scale_size(). The ImagesCanvas displays this composite image and ensures it is updated when the canvas resizes. These tests help ensure the preview functionality works reliably.

The tests covering these utilities provide thorough validation of key components. They help maintain expected behavior as the code evolves and catch bugs early.

Alignments

References: tests/tools/alignments

The …/media_test.py file contains unit tests for validating media loading and face extraction functionality. The main classes tested are MediaLoader, Frames, Faces, and ExtractedFaces.

MediaLoader handles loading images and videos from disk. Its stream() method iterates over media, while load_image() loads individual frames. The tests validate initialization, extension checking, and output of loading methods.

Frames extracts frames from images and videos. Its process_frames() method extracts frames from images, and process_video() extracts frames from videos. The tests validate the expected frame data structures.

Faces loads face alignment data from image metadata using read_image_meta_batch(). It handles legacy data formats and duplicate detection. The tests validate loading of valid and invalid metadata, as well as sorting of face data.

ExtractedFaces extracts face thumbnails based on alignments data. Its get_faces() method extracts thumbnails for a given frame. The tests validate initialization, extraction for different input cases, and ROI sizing.

Mocking is used extensively to isolate classes from dependencies. Tests validate expected outputs and error conditions. The tests provide thorough validation of the media loading and face extraction pipeline through integration and unit testing of key classes.
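The mocking approach can be illustrated with a simplified stand-in; MediaLoaderSketch below is hypothetical, not the real MediaLoader.

```python
from unittest.mock import MagicMock


class MediaLoaderSketch:
    """Illustrative loader that delegates decoding to an injected reader."""

    def __init__(self, reader) -> None:
        self._reader = reader  # in real code: e.g. an OpenCV-based function

    def load_image(self, path: str):
        return self._reader(path)


def test_load_image_isolated():
    # The reader is mocked, so no disk access or OpenCV is needed
    reader = MagicMock(return_value="decoded-image")
    loader = MediaLoaderSketch(reader)
    assert loader.load_image("face_0001.png") == "decoded-image"
    reader.assert_called_once_with("face_0001.png")
```

The mock both supplies a canned return value and records the call, so the test can assert the loader passed the right path without touching real media.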

Preview

References: tests/tools/preview

The key classes tested for preview functionality are _Faces, FacesDisplay, and ImagesCanvas.

_Faces is a dataclass that stores metadata about the faces being displayed, such as filenames, alignment matrices, and cropped face images. The test__faces function confirms this dataclass initializes its attributes correctly.

FacesDisplay is responsible for building the composite image of faces to display. It handles setting properties like column count and face size with methods like set_display_dimensions() and set_centering(). The _build_faces_image() method combines the source and destination faces into a single image, calling _header_text() to add labels and _draw_rect() to draw borders. _crop_source_faces() and _crop_destination_faces() extract the aligned faces from the full frames. _faces_from_frames() updates the faces data from source/destination frames. _get_scale_size() scales the faces to fit the display dimensions. update_tk_image() builds the Tkinter photo image for display.

ImagesCanvas displays the composite face image. It ensures the face display is updated when the canvas size changes by reloading itself, so the preview always reflects the latest data.

The tests pass in different column counts and face sizes to ensure these classes work under various configurations. Mock objects are used in place of Tkinter and OpenCV. The tests confirm the expected behavior, such as updating only the destination faces. This helps ensure the preview viewer functions reliably.
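The scale-to-fit calculation behind a method like _get_scale_size() can be sketched as below; the exact logic is an assumption, shown only to make the behavior concrete.

```python
def get_scale_size(image_dims, display_dims):
    """Scale (width, height) to fit inside the display, preserving aspect ratio."""
    image_width, image_height = image_dims
    display_width, display_height = display_dims
    scale = min(display_width / image_width, display_height / image_height)
    return int(round(image_width * scale)), int(round(image_height * scale))
```

For example, a 200x100 composite shown on a 100x100 canvas scales to 100x50: the limiting dimension determines the scale factor for both axes.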

Coverage

References: tests

The tests aim to provide complete coverage of all code in Faceswap through thorough unit and integration testing. Key testing functionality includes:

The tests directory contains tests that validate core functionality and components without external dependencies. Tests in …/lib focus on important modules like models, GUI, and system information.

The test_backend() and test_keras() functions in …/startup_test.py perform sanity checks during startup by validating the Keras backend and version.

The _test_optimizer() function in files like …/optimizers_test.py trains a simple model using the optimizer being tested to ensure it reaches the expected accuracy. This validates optimizers work correctly.

Files such as …/layers_test.py use the layer_test() function to build models with layers, train on data, and check outputs match expectations. This thoroughly tests layers.

The block_test() function in …/nn_blocks_test.py generates input, passes it through blocks using the Keras functional API, and ensures outputs match expected values. It also serializes and deserializes models to test this functionality.

The test_loss_output() and test_loss_wrapper() functions in …/losses_test.py check loss functions return the proper output shape and that the LossWrapper class combines losses correctly.

The run_test() function in files like …/simple_tests.py runs commands and prints pass/fail results to detect crashes and hangs. Integration tests validate end-to-end functionality.