yolo_v3

lucid.models.yolo_v3(num_classes: int = 80, **kwargs) → YOLO_V3

The yolo_v3 function returns an instance of the YOLO_V3 model, preconfigured with the original YOLO-v3 architecture and default 3-scale detection.

Total Parameters: 62,974,149 (with num_classes=80, as for MS-COCO)

Function Signature

def yolo_v3(num_classes: int = 80, **kwargs) -> YOLO_V3

Parameters

num_classes (int, default=80): Number of object categories to detect. Each anchor predicts 5 + num_classes values (4 box coordinates, 1 objectness score, and num_classes class scores).
kwargs: Additional keyword arguments passed directly to the YOLO_V3 constructor.

Common kwargs include:
- anchors (list[tuple[int, int]], optional): list of 9 anchor boxes in pixel units
- image_size (int, default=416): input resolution (typically 416x416)
- darknet (nn.Module, optional): custom Darknet-53 style backbone
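As an illustration of the anchors format, the sketch below lists the nine (width, height) anchor sizes from the original YOLOv3 MS-COCO configuration and splits them into three groups of three, one group per detection scale. COCO_ANCHORS and group_anchors_by_scale are hypothetical names used only here. Whether YOLO_V3 uses these values as its defaults, or expects the anchors in this order, is an assumption that should be checked against the YOLO_V3 constructor.

```python
# Nine anchor sizes (width, height) in pixels, as published in the original
# YOLOv3 MS-COCO configuration. Assumption: lucid's defaults may differ.
COCO_ANCHORS = [
    (10, 13), (16, 30), (33, 23),
    (30, 61), (62, 45), (59, 119),
    (116, 90), (156, 198), (373, 326),
]


def group_anchors_by_scale(anchors, per_scale=3):
    """Sort anchors by area and split them into equal groups, smallest first.

    Hypothetical helper for illustration; not part of lucid.
    """
    if len(anchors) % per_scale != 0:
        raise ValueError("number of anchors must be a multiple of per_scale")
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    return [ordered[i:i + per_scale] for i in range(0, len(ordered), per_scale)]


small, medium, large = group_anchors_by_scale(COCO_ANCHORS)
print(small)   # anchors for small objects (finest grid)
print(medium)  # anchors for medium objects
print(large)   # anchors for large objects (coarsest grid)
```

A list in this form could then be passed as yolo_v3(anchors=COCO_ANCHORS). In the original YOLOv3 design the largest anchors belong to the coarsest (stride 32) grid and the smallest to the finest (stride 8) grid.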

Returns

YOLO_V3: An initialized instance of the YOLOv3 detection model with 3 detection heads for large, medium, and small objects.

Example Usage

>>> import lucid
>>> from lucid.models import yolo_v3
>>> model = yolo_v3(num_classes=80)
>>> print(model)

>>> x = lucid.rand(1, 3, 416, 416)
>>> out = model(x)
>>> for o in out:
...     print(o.shape)
# Each output shape: (1, 3 * (5 + 80), H, W) = (1, 255, H, W)

Note

This model uses 3 detection heads at strides 32, 16, and 8. With an input size of 416, these correspond to grid sizes of 13x13, 26x26, and 52x52. Each head predicts 3 bounding boxes per cell, so its output has 3 * (5 + num_classes) channels (255 for the default 80 classes).
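The grid sizes and channel counts in this note follow from plain arithmetic, which the sketch below reproduces without importing lucid. head_output_shapes is a hypothetical helper written for this illustration. It assumes a square input whose side is divisible by every stride, and the order in which the model returns its heads is not confirmed here.

```python
def head_output_shapes(image_size=416, num_classes=80,
                       strides=(32, 16, 8), boxes_per_cell=3, batch_size=1):
    """Return the expected (N, C, H, W) shape of each detection head.

    Hypothetical helper for illustration; not part of lucid.
    """
    # Each box carries 4 coordinates, 1 objectness score, and the class scores.
    channels = boxes_per_cell * (5 + num_classes)
    shapes = []
    for stride in strides:
        if image_size % stride != 0:
            raise ValueError(f"image_size must be divisible by stride {stride}")
        grid = image_size // stride  # cells along each spatial axis
        shapes.append((batch_size, channels, grid, grid))
    return shapes


for shape in head_output_shapes():
    print(shape)
# (1, 255, 13, 13)
# (1, 255, 26, 26)
# (1, 255, 52, 52)
```

Changing num_classes only changes the channel dimension. For example, num_classes=20 gives 3 * (5 + 20) = 75 channels per head.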