Auto-generated from karpathy/micrograd by Auto Wiki

GitHub Repository
Written in Jupyter Notebook
Watchers: 130
Last updated: 2024-01-06
Auto Wiki
Generated at: 2024-01-06
Generated from commit c91140

Micrograd is a library for building and training neural networks with automatic differentiation. It lets users define computational graphs made up of scalar `Value` objects, then efficiently compute gradients via backpropagation.

At the core is an autograd engine that handles building computation graphs and running backpropagation. The `Value` class represents a scalar value that participates in a graph. Performing arithmetic on `Value` instances returns new `Value` objects while also inserting nodes into the graph. After the forward pass, calling `backward()` visits each node in reverse topological order and invokes its local gradient function, computing gradients via the chain rule.
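The flow just described can be condensed into a minimal sketch. This is illustrative code in the spirit of micrograd's engine, not its exact source: only addition and multiplication are shown, but the pattern is the same — operator overloading records child nodes and a local gradient closure, and `backward()` replays those closures in reverse topological order.

```python
class Value:
    """Minimal scalar autograd node (a sketch of micrograd's approach)."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None   # local gradient rule, set by each op
        self._prev = set(children)      # the nodes that produced this one

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():                # d(out)/d(self) = d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():                # product rule: scale by the other factor
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Depth-first topological sort, then the chain rule in reverse order.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a = Value(2.0)
b = Value(-3.0)
c = a * b + a          # c = ab + a
c.backward()
print(a.grad, b.grad)  # dc/da = b + 1 = -2.0, dc/db = a = 2.0
```

Note how each operation closes over its operands; the graph is never stored explicitly anywhere except in these `_prev` links and closures.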

Built on top of this engine, Micrograd provides a Neural Network Building API for easily constructing neural network layers and models. The `Neuron`, `Layer`, and `MLP` classes in …/ allow users to define models by stacking layers of neurons. The forward pass applies each layer in sequence.

Overall, Micrograd allows imperative-style Python code to implicitly construct computational graphs behind the scenes. It handles all the graph manipulation and gradient calculations automatically via operator overloading and topological sorting. This enables an easy-to-use API for building and training neural networks.

Autograd Engine

References: micrograd

The file …/ defines the core functionality for automatic differentiation. It implements the backward pass: nodes are topologically sorted using depth-first search, then each node's `_backward` callback is invoked in reverse order to backpropagate gradients through the entire computational graph. This allows gradients to be calculated efficiently via the chain rule.
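The ordering step can be shown in isolation. The sketch below is generic (the `Node` class here is a stand-in for illustration, not micrograd's): a post-order depth-first search lists children before parents, so iterating the reversed list fires each gradient callback only after its node's own gradient is already known.

```python
class Node:
    """Stand-in graph node: just a name and its child dependencies."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def topo_order(root):
    """Post-order DFS: every child appears before its parent."""
    order, visited = [], set()
    def dfs(node):
        if node in visited:
            return
        visited.add(node)
        for child in node.children:
            dfs(child)
        order.append(node)
    dfs(root)
    return order

# Diamond graph: out depends on l and r, both of which depend on x.
x = Node("x")
l = Node("l", [x])
r = Node("r", [x])
out = Node("out", [l, r])

# Reversing the post-order gives the backpropagation order:
# parents are processed before any of their children.
names = [n.name for n in reversed(topo_order(out))]
print(names)
```

The diamond shape matters: `x` is reachable through both `l` and `r`, and the `visited` set guarantees its callback would fire exactly once, after both paths have contributed their gradient.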

Neural Network Building

References: micrograd

The …/ file defines classes for building neural networks in a high-level, object-oriented way.

The `MLP` class builds multi-layer perceptrons by stacking `Layer` objects. It initializes each layer with the appropriate input and output sizes, then applies the layers sequentially via function composition. This makes it easy to build neural networks with multiple hidden layers in a modular way.

The `Layer` class represents a single fully connected layer. It is composed of `Neuron` objects and aggregates their outputs. In its constructor it creates the neurons, passing each one the input size. Calling the layer returns the collected neuron outputs, which together perform a linear transformation of the input.

The `Neuron` class represents an individual unit in a layer. In its constructor, it initializes random weights and a bias. When called, it computes the weighted sum of its inputs, adds the bias, and optionally applies a ReLU activation.

By separating concepts into distinct classes, the file provides a reusable way to define architectures without low-level operations. The modular design allows composing networks by stacking pre-defined types.
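The composition described above can be sketched with plain floats, leaving out the autograd `Value` objects for brevity. The names and defaults below mirror the pattern rather than copy micrograd's source:

```python
import random

class Neuron:
    """One unit: w·x + b, with optional ReLU (plain floats, no autograd)."""
    def __init__(self, nin, nonlin=True):
        self.w = [random.uniform(-1, 1) for _ in range(nin)]
        self.b = 0.0
        self.nonlin = nonlin
    def __call__(self, x):
        act = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return max(0.0, act) if self.nonlin else act

class Layer:
    """A fully connected layer is just a list of neurons over the same input."""
    def __init__(self, nin, nout, **kw):
        self.neurons = [Neuron(nin, **kw) for _ in range(nout)]
    def __call__(self, x):
        out = [n(x) for n in self.neurons]
        return out[0] if len(out) == 1 else out

class MLP:
    """Stacks layers; the final layer stays linear (no ReLU)."""
    def __init__(self, nin, nouts):
        sizes = [nin] + nouts
        self.layers = [Layer(sizes[i], sizes[i + 1], nonlin=(i != len(nouts) - 1))
                       for i in range(len(nouts))]
    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

model = MLP(3, [4, 4, 1])    # 3 inputs, two hidden layers of 4, one output
y = model([1.0, -2.0, 3.0])  # forward pass returns a single float
print(y)
```

Each class only knows about the one below it, which is what makes stacking arbitrary architectures possible.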

Testing Autograd Engine

References: test

The test suite in test contains cases that validate the gradient calculations performed by micrograd's autograd functionality. These tests are critical for ensuring the correctness of the automatic differentiation at the core of building and training neural networks with micrograd.

The main test file is …/ which implements unit tests for the `Value` operations. These tests construct equivalent computational graphs using the `Value` class and PyTorch tensors. They run the forward pass to compute outputs from both implementations, then call `backward()` on each and assert that the outputs and gradients match.

A variety of operations are tested, including addition, multiplication, and the ReLU activation, covering the range of functionality exercised by the autograd engine. Together these tests validate the core algorithms used to automatically compute gradients in `micrograd`.
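The same validation strategy can be illustrated without the PyTorch dependency by checking an analytic, chain-rule gradient against a numerical finite-difference estimate. This is a generic gradient-checking sketch, not micrograd's actual test code:

```python
def f(x, w, b):
    """ReLU(w*x + b) — the kind of expression the tests exercise."""
    z = w * x + b
    return z if z > 0 else 0.0

def df_dw(x, w, b):
    """Analytic gradient of f with respect to w, via the chain rule."""
    z = w * x + b
    return x if z > 0 else 0.0

def numeric_df_dw(x, w, b, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    return (f(x, w + eps, b) - f(x, w - eps, b)) / (2 * eps)

x, w, b = 2.0, 3.0, -1.0
analytic = df_dw(x, w, b)
numeric = numeric_df_dw(x, w, b)
assert abs(analytic - numeric) < 1e-4   # the two must agree closely
print(analytic, numeric)
```

Replacing `numeric_df_dw` with PyTorch's `backward()` gives exactly the structure of micrograd's tests: two independent implementations of the same derivative, compared for agreement.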


References: micrograd

The micrograd directory contains the code for building and training neural networks. The main class for representing graph nodes, `Value`, is defined in …/

The `Value` class stores the scalar data of a node as well as its gradient. Operations are implemented via operator-overloading methods such as `__add__` and `__mul__`, which return new `Value` instances representing the results. These methods also set the `_backward` attribute to define how gradients propagate through that node.

The `backward()` method implements backpropagation by topologically sorting the graph nodes using depth-first search, then calling each node's `_backward` function in reverse order to apply the chain rule. This computes the gradient of that node with respect to every value that feeds into it.

The …/ file defines classes for building neural networks. The `Neuron` class represents a single neuron; its constructor takes the number of inputs and initializes random weights and a bias. Calling a neuron performs the weighted sum and optionally applies a ReLU activation.

The `Layer` class represents a fully connected layer composed of `Neuron` instances, aggregating the outputs of its neurons. The `MLP` class builds a multi-layer perceptron by stacking `Layer` objects and applying each one sequentially.

In summary, the `Value` class forms the fundamental unit for constructing computational graphs and performing automatic differentiation, while the neural network classes provide a high-level API for defining and composing complex models out of these basic building blocks.



Micrograd is installed via pip. The core functionality involves defining computational graphs and calculating gradients using the autograd engine, which dynamically builds the computation graph behind the scenes as operations are applied to `Value` objects.

The computational graph is a directed acyclic graph of `Value` nodes, where each operation records its operands as children. When the `backward()` method is called on a node, backpropagation efficiently calculates gradients for every value that feeds into it.

A high-level neural network API is provided in the …/ module. Models are built by composing `Neuron` and `Layer` objects into a sequential `MLP`, which defines the forward pass. Gradients can then be calculated and used for optimization by defining a loss (the demo uses a max-margin loss) and applying gradient descent updates to the parameters; micrograd implements this training loop by hand rather than shipping optimizer classes.

Key functionality:

  • Operations on `Value` objects insert nodes into the computational graph
  • The `Value.backward()` method implements backpropagation
  • Neural network models are built from `Neuron`, `Layer`, and `MLP` objects
  • A loss such as max-margin or mean-squared error provides the training signal
  • Gradient descent over `parameters()` minimizes the loss


References: micrograd

The file …/ contains tests for the core autograd functionality. It uses the `Value` class defined in …/ to represent scalar values that implicitly build computational graphs through operations.

The tests construct equivalent graphs using `Value` objects and PyTorch tensors, run the forward pass to compute outputs, then perform backpropagation in both. Key operations tested include addition and multiplication. This validates the core differentiation logic for elementary operations on scalar values.

The …/ file defines the `Value` class. Its arithmetic methods return new `Value` instances, implicitly constructing the computational graph. The `backward()` method propagates gradients through the graph via reverse-mode automatic differentiation.

The …/ file builds upon the engine to define classes for neural networks. The `Neuron` class represents a single neuron, combining inputs and weights to compute its output. The `Layer` class aggregates neuron outputs, and `MLP` stacks layers, allowing construction of multi-layer models. Together these provide a high-level interface for defining neural networks.

Neural Network Training

References: micrograd/

The file …/ defines classes for building neural networks.

A `Module` base class defines shared functionality, including a `zero_grad()` method to zero the gradients of all parameters and a `parameters()` method to retrieve them.

A `Neuron` class represents a single neuron. It takes the number of inputs and an optional nonlinearity flag in its initializer. Weights are initialized randomly, and the bias is initialized to zero.

A `Layer` class represents a fully connected layer composed of `Neuron` objects. It creates the neurons and aggregates their outputs.

An `MLP` class builds a multi-layer perceptron by stacking `Layer` objects. It initializes the layers and applies each one sequentially, allowing networks of multiple layers to be built.

Neural networks can be trained by:

  1. Constructing an `MLP` instance with the desired architecture
  2. Calling the model on inputs to make predictions
  3. Calculating a loss between predictions and targets
  4. Calling `backward()` on the loss to calculate gradients
  5. Updating each parameter using its gradient (and calling `zero_grad()` before the next pass)
  6. Repeating steps 2-5 over many batches of training data
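The steps above can be sketched as a concrete loop. For brevity this example fits a single linear weight with a hand-derived mean-squared-error gradient; in micrograd the gradient would instead come from calling `backward()` on the loss:

```python
# Fit y = 2x with one weight; the gradient is derived by hand for this sketch.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, lr = 0.0, 0.05

for epoch in range(200):
    grad = 0.0
    for x, y in data:
        pred = w * x                    # step 2: forward pass
        grad += 2 * (pred - y) * x      # steps 3-4: d(MSE)/dw, accumulated
    w -= lr * grad / len(data)          # step 5: gradient descent update

print(w)  # converges near 2.0
```

Resetting `grad` to zero at the top of each epoch plays the role of `zero_grad()`: without it, gradients from earlier passes would keep accumulating.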

Package Metadata

References: micrograd

The setup file defines the metadata needed to distribute Micrograd, including attributes like the package name and version.

It uses standard setuptools functionality to declare this metadata. Following the packaging conventions allows tasks like installing from PyPI and provides reference information to users.


The file contains the code that defines the metadata and functionality for packaging and distributing the Micrograd library as a reusable Python package. It uses the `setuptools.setup()` function to declare important metadata such as the package name, version, author, description, and classifiers.

The `setup()` call accepts keyword arguments defining the core metadata needed to distribute the package: the package name, version number, author and author email, a short description of what the package does, and a long description with more detail, typically read from a README file.

It also specifies which modules and packages are included in the distribution; for Micrograd this includes the micrograd package containing the core functionality.

By defining this configuration metadata, the file enables tasks like installing Micrograd and gives users key reference information about the package's author, purpose, and requirements. Following these conventions ensures Micrograd can be distributed and installed as a standard Python package.
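A setup script of the kind described typically looks like the following sketch; the field values here are illustrative placeholders, not copied from micrograd's actual configuration:

```python
# Illustrative packaging sketch; field values are placeholders.
from setuptools import setup, find_packages

try:
    with open("README.md", encoding="utf-8") as fh:
        long_description = fh.read()
except FileNotFoundError:
    long_description = ""

setup(
    name="micrograd",
    version="0.1.0",                           # placeholder version
    author="Andrej Karpathy",
    description="A tiny scalar-valued autograd engine",
    long_description=long_description,
    long_description_content_type="text/markdown",
    packages=find_packages(),                  # picks up the micrograd package
    python_requires=">=3.6",
)
```

With this in place, `pip install .` builds and installs the package, and the same metadata is what PyPI displays on the project page.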


References: micrograd

The README file documents the Micrograd package and includes a getting-started section.

The documentation demonstrates how to define computational graphs and calculate gradients. Examples show training a classifier by initializing a model, defining a loss, and optimizing the parameters.

By centralizing documentation in the README, the project gives users an introduction to common tasks like defining and training basic neural networks.