yolo_v1_tiny¶
ConvNet · One-Stage Detector · Object Detection
The yolo_v1_tiny function constructs a lightweight variant of the YOLO-v1 object detector. It reduces the depth and width of the convolutional backbone to speed up inference while retaining the single-stage detection strategy.
Total Parameters: 236,720,462 (ConvNet + FC)
Function Signature¶
@register_model
def yolo_v1_tiny(num_classes: int = 20, **kwargs) -> YOLO_V1
Parameters¶
num_classes (int, optional): Number of object classes to detect. Default is 20 (PASCAL VOC).
kwargs (dict, optional): Additional arguments to override defaults in YOLO_V1, such as:
split_size (int): Grid size for dividing the input image (default: 7).
num_boxes (int): Number of bounding boxes per grid cell (default: 2).
lambda_coord (float): Weight for coordinate loss (default: 5.0).
lambda_noobj (float): Weight for no-object confidence loss (default: 0.5).
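For instance, a model for a hypothetical 3-class dataset; by the output formula under Training Notes, the final depth becomes 5 * 2 + 3 = 13 with the default grid and box count:
import lucid
from lucid.models import yolo_v1_tiny
# Hypothetical 3-class detection task
model = yolo_v1_tiny(num_classes=3)
preds = model(lucid.rand(1, 3, 448, 448))
print(preds.shape)  # (1, 7, 7, 13)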
Returns¶
YOLO_V1: An instance of the YOLO-v1-tiny model ready for training or inference.
Examples¶
Basic Usage
import lucid
from lucid.models import yolo_v1_tiny
# Create YOLO-v1-tiny model with 20 target classes
model = yolo_v1_tiny(num_classes=20)
# Input: batch of images with shape (N, 3, 448, 448)
x = lucid.rand(8, 3, 448, 448)
# Output: tensor of shape (N, 7, 7, 30) for VOC (20 classes, 2 boxes)
preds = model(x)
print(preds.shape) # (8, 7, 7, 30)
Training Notes¶
The output shape of the model is:
(N, S, S, 5 * B + C)
Where:
S is the grid size (split_size, default: 7),
B is the number of boxes per cell (num_boxes, default: 2),
C is the number of object classes (num_classes, e.g., 20 for VOC).
Per grid cell, this last dimension packs (a slicing sketch follows this list):
B bounding boxes (x, y, w, h, conf),
C class probabilities.
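As a concrete sketch of unpacking the last dimension. The ordering of the box tuples and class scores within the 5 * B + C channels is an assumption here (implementations differ; some place the C class scores first), as is NumPy-style slicing on Lucid tensors:
import lucid
from lucid.models import yolo_v1_tiny
S, B, C = 7, 2, 20
model = yolo_v1_tiny(num_classes=C)
preds = model(lucid.rand(1, 3, 448, 448))  # (1, S, S, 5 * B + C)
# Assumed layout: B (x, y, w, h, conf) tuples first, then C class scores
boxes = preds[..., : 5 * B]    # (1, 7, 7, 10) -> two boxes per cell
classes = preds[..., 5 * B :]  # (1, 7, 7, 20) -> class scores per cell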
Use the get_loss method of the returned model to compute the training loss against ground-truth targets in the same format (a sketch follows the warning below).
Tip
You can override architectural options like split_size, num_boxes, or loss coefficients via **kwargs to create variants of the YOLO-v1-tiny model.
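For example (a hedged sketch; the keyword names are those listed under Parameters above):
from lucid.models import yolo_v1_tiny
# Variant that penalizes coordinate errors more heavily
model = yolo_v1_tiny(
    num_classes=20,
    split_size=7,
    num_boxes=2,
    lambda_coord=10.0,  # doubles the default coordinate-loss weight of 5.0
    lambda_noobj=0.5,
)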
Warning
Make sure the ground truth targets fed into the loss function match the required shape (N, S, S, 5 * B + C), with coordinates normalized to the grid and confidence + class vectors properly set.
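A minimal end-to-end sketch of computing the loss. The exact get_loss signature is an assumption here (predictions first, targets second), and the random targets below merely stand in for a properly encoded label tensor:
import lucid
from lucid.models import yolo_v1_tiny
N, S, B, C = 8, 7, 2, 20
model = yolo_v1_tiny(num_classes=C)
x = lucid.rand(N, 3, 448, 448)
preds = model(x)  # (N, S, S, 5 * B + C)
# Stand-in targets with the required (N, S, S, 5 * B + C) shape; a real
# pipeline would encode grid-normalized boxes, confidences, and one-hot
# class vectors here instead of random values.
targets = lucid.rand(N, S, S, 5 * B + C)
loss = model.get_loss(preds, targets)  # assumed call signature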
Architectural Differences¶
YOLO-v1 (original): Uses a deeper ConvNet with 24 convolutional layers followed by 2 fully connected layers, giving stronger feature extraction at the cost of heavier computation.
YOLO-v1-tiny: Replaces the backbone with a smaller ConvNet that has fewer convolutional layers and narrower channel sizes, reducing model size and computation while sacrificing some accuracy.
In practice, yolo_v1_tiny trades off detection performance for real-time speed on resource-limited devices.