Lucid² 💎¶


Lucid is a minimalist deep learning framework built entirely from scratch in Python. It provides a pedagogically rich environment to explore the foundations of modern deep learning systems—including autodiff, neural network modules, and GPU acceleration—while remaining lightweight, readable, and free of complex dependencies.

Whether you’re a student, educator, or an advanced researcher seeking to demystify deep learning internals, Lucid delivers a transparent and highly introspectable API that faithfully replicates key behaviors of major frameworks like PyTorch, yet in a form simple enough to study line by line.

How to Install¶

Basic Installation¶

Install via PyPI:

pip install lucid-dl

Alternatively, install the latest development version from GitHub:

pip install git+https://github.com/ChanLumerico/lucid.git

Enable GPU (Metal / MLX Acceleration)¶

If you are using a Mac with Apple Silicon (M1, M2, M3), Lucid supports GPU execution via the MLX library.

To enable Metal acceleration:

  1. Install MLX:

    pip install mlx
    
  2. Confirm you have a compatible device (Apple Silicon).

  3. Run any computation with device="gpu".

Verification¶

Check whether GPU acceleration is functioning:

import lucid
x = lucid.ones((1024, 1024), device="gpu")
print(x.device)  # Should print: 'gpu'

Tensor: The Core Abstraction¶

At the heart of Lucid is the Tensor class—a generalization of NumPy arrays that supports advanced operations such as gradient tracking, device placement, and computation graph construction.

Each Tensor encapsulates:

  • A data array (ndarray or mlx.array)

  • A gradient buffer (grad)

  • The operation that produced it

  • A list of parent tensors from which it was derived

  • A flag indicating whether it participates in the computation graph (requires_grad)

Construction and Configuration Example:

from lucid import Tensor

x = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True, device="gpu")

  • Setting requires_grad=True adds the tensor to the autodiff graph.

  • Specifying device="gpu" allocates the tensor using the Metal backend.
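
Attribute Inspection (an illustrative sketch; it uses only the attributes listed above, and the state of grad before backward() runs is an assumption):

from lucid import Tensor

x = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
print(x.requires_grad)  # True: the tensor is tracked by the autodiff graph
print(x.device)         # 'cpu' unless device="gpu" was requested
print(x.grad)           # gradient buffer; not populated until backward() is called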

Switching Between Devices¶

Tensors can be moved between CPU and GPU at any time using the .to() method:

x = x.to("gpu")  # Now uses MLX arrays for accelerated computation
y = x.to("cpu")  # Moves data back to NumPy

Inspect the device of a tensor with:

print(x.device)  # Either 'cpu' or 'gpu'

Automatic Differentiation (Autodiff)¶

Lucid implements reverse-mode automatic differentiation, which is especially efficient for computing gradients of scalar-valued loss functions.

It builds a dynamic graph during the forward pass, capturing every operation involving tensors that require gradients. Each node in the graph stores a custom backward function that computes local gradients and propagates them upstream using the chain rule.

Computation Graph Internals:

  • Each Tensor acts as a node in a Directed Acyclic Graph (DAG).

  • Operations create edges between inputs and outputs.

  • Each tensor’s _backward_op defines how to compute gradients with respect to its parent tensors.

The backward method:

  1. Topologically sorts the computation graph.

  2. Initializes the output gradient (typically 1.0).

  3. Executes all backward operations in reverse order.
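
Conceptual Sketch (an illustrative mock of the three steps above, not Lucid's actual implementation; the parents attribute name is assumed here, while _backward_op follows the description above):

def backward(output):
    # 1. Topologically sort the graph reachable from the output node.
    order, visited = [], set()

    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(output)

    # 2. Initialize the output gradient (typically 1.0).
    output.grad = 1.0

    # 3. Execute backward operations in reverse topological order.
    for node in reversed(order):
        if node._backward_op is not None:
            node._backward_op()  # accumulates local gradients into parent tensors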

Example:

import lucid

x = lucid.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2 + 1
z = y.sum()
z.backward()
print(x.grad)  # Output: [2.0, 2.0, 2.0]

This chain-rule application computes the gradient \(\frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial x} = [2, 2, 2]\): since \(z = \sum_i (2x_i + 1)\), each component of the gradient is simply \(\frac{\partial z}{\partial x_i} = 2\).

Hooks & Shape Alignment¶

Lucid supports:

  • Hooks for inspecting or modifying gradients.

  • Shape broadcasting and matching to handle nonconforming tensor shapes, as sketched below.
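
To see why shape matching is needed, consider a framework-agnostic NumPy sketch: when a small tensor is broadcast against a larger one in the forward pass, its gradient must be summed back down to the original shape in the backward pass. This is illustrative only and does not use Lucid's internal helpers:

import numpy as np

# Forward: a bias of shape (3,) is broadcast against a (4, 3) matrix.
x = np.ones((4, 3))
b = np.array([1.0, 2.0, 3.0])
y = x + b                    # y has shape (4, 3)

# Backward: the upstream gradient has y's shape, so the gradient w.r.t. b
# must be reduced over the broadcast axis to recover b's shape (3,).
grad_y = np.ones_like(y)
grad_b = grad_y.sum(axis=0)  # shape (3,)
print(grad_b)                # [4. 4. 4.]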

Metal Acceleration (MLX Backend)¶

Lucid supports Metal acceleration on Apple Silicon devices using the MLX library. This integration enables tensor operations, neural network layers, and gradient computations to run efficiently on the GPU by leveraging Apple’s unified memory architecture.

Key Features:

  • Tensors with device="gpu" are allocated as mlx.core.array.

  • Core mathematical operations, matrix multiplications, and backward passes leverage MLX APIs.

  • The API remains unchanged; simply use .to("gpu") or pass device="gpu" to tensor constructors.

Basic Acceleration Example:

import lucid

x = lucid.randn(1024, 1024, device="gpu", requires_grad=True)
y = x @ x.T
z = y.sum()
z.backward()
print(x.grad.device)  # 'gpu'
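
Because the API is identical on both devices, a rough CPU-versus-GPU comparison takes only a few lines. The sketch below is illustrative: absolute timings depend on your hardware, and calling .eval() on the GPU path follows the evaluation warning later in this section:

import time
import lucid

def bench(device, size=1024, iters=10):
    x = lucid.randn(size, size, device=device)
    start = time.perf_counter()
    for _ in range(iters):
        y = (x @ x.T).sum()
        if device == "gpu":
            y.eval()  # force MLX to materialize the result
    return time.perf_counter() - start

print("cpu:", bench("cpu"))
print("gpu:", bench("gpu"))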

GPU-Based Model Example:

import lucid
import lucid.nn as nn
import lucid.nn.functional as F

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(100, 10)

    def forward(self, x):
        return F.relu(self.fc(x))

model = TinyNet().to("gpu")
data = lucid.randn(32, 100, device="gpu", requires_grad=True)
output = model(data)
loss = output.sum()
loss.backward()

Warning

When training models on GPU using MLX, you must explicitly evaluate the loss tensor after each forward pass to prevent the MLX computation graph from growing uncontrollably. MLX defers evaluation until necessary; if evaluation is not forced (e.g. by calling .eval()), the graph may grow too deep, leading to performance issues or memory errors.

Recommended GPU Training Pattern:

loss = model(input).sum()
loss.eval()  # Force evaluation on GPU
loss.backward()

Neural Networks with lucid.nn¶

Lucid provides a modular, PyTorch-style interface for building neural networks via the nn.Module class. Users define model classes by subclassing nn.Module and assigning parameters and layers as attributes. Each module automatically registers its parameters, supports device migration via .to(), and integrates with Lucid’s autodiff system.

Custom Module Definition Example:

import lucid.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = nn.functional.relu(x)
        x = self.fc2(x)
        return x

Parameter Registration:

model = MLP()
print(model.parameters())
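
The returned parameters can also be iterated, for example to count trainable weights. A small sketch, assuming each parameter exposes a NumPy-style shape tuple (not stated above):

import math

model = MLP()
total = sum(math.prod(p.shape) for p in model.parameters())
print(f"Trainable parameters: {total}")  # 101,770 if both Linear layers carry biases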

Moving to GPU:

model = model.to("gpu")

Training & Evaluation¶

Lucid supports training neural networks using standard loops, customized optimizers, and tracking gradients across batches of data.

Full Training Loop Example:

import lucid
from lucid.nn.functional import mse_loss

# Placeholder data for illustration; substitute your own dataset
x_train = lucid.randn(64, 784, device="gpu")
y_train = lucid.randn(64, 10, device="gpu")

model = MLP().to("gpu")
optimizer = lucid.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    preds = model(x_train)
    loss = mse_loss(preds, y_train)
    loss.eval()  # Force evaluation on the MLX backend

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print(f"Epoch {epoch}, Loss: {loss.item()}")

Evaluation without Gradients:

with lucid.no_grad():
    out = model(x_test)  # x_test: held-out inputs (placeholder)

Educational by Design¶

Lucid isn’t a black box—it’s built to be explored. Every class, function, and line of code is crafted to be readable and hackable.

  • Build intuition for backpropagation.

  • Modify internal operations to experiment with custom autograd.

  • Benchmark CPU vs GPU behavior with your own models.

  • Debug layer by layer, shape by shape, and gradient by gradient.

Conclusion¶

Lucid serves as a powerful educational resource and a minimalist experimental sandbox. By exposing the internals of tensors, gradients, and models—and integrating GPU acceleration—Lucid invites users to see, touch, and understand how deep learning truly works.

Others¶

Dependencies: NumPy, MLX, openml, pandas

Inspired By:

PyTorch, TensorFlow, StackOverflow