yolo_v3¶
ConvNet · One-Stage Detector · Object Detection
The yolo_v3 function returns an instance of the YOLO_V3 model, preconfigured with the original YOLOv3 architecture and its default 3-scale detection.
Total Parameters: 62,974,149 (MS-COCO)
Function Signature¶
def yolo_v3(num_classes: int = 80, **kwargs) -> YOLO_V3
Parameters¶
num_classes (int, default=80): Number of object categories to detect. Each anchor predicts 5 box values (x, y, w, h, objectness) plus num_classes class scores.
kwargs: Additional keyword arguments passed directly to the YOLO_V3 constructor.
Common kwargs include:
anchors (list[tuple[int, int]], optional): List of 9 (width, height) anchor boxes in pixels, 3 per detection scale
image_size (int, default=416): Input resolution; the model expects square inputs (typically 416x416)
darknet (nn.Module, optional): Custom Darknet-53-style backbone
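For reference, the anchors published with the original YOLOv3 Darknet configuration for MS-COCO are listed below, grouped 3 per scale. Whether lucid's built-in default matches these exact values is an assumption; pass your own list via the anchors kwarg if you need different priors.

```python
# Canonical YOLOv3 MS-COCO anchors (width, height) in pixels, from the
# original Darknet config. Assumed, not confirmed, to match lucid's default.
COCO_ANCHORS = [
    (10, 13), (16, 30), (33, 23),       # stride 8  -> small objects
    (30, 61), (62, 45), (59, 119),      # stride 16 -> medium objects
    (116, 90), (156, 198), (373, 326),  # stride 32 -> large objects
]

# Group into the 3 detection scales, 3 anchors each.
per_scale = [COCO_ANCHORS[i : i + 3] for i in range(0, 9, 3)]
print(len(per_scale), [len(s) for s in per_scale])  # → 3 [3, 3, 3]
```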
Returns¶
YOLO_V3: An initialized instance of the YOLOv3 detection model with 3 detection heads for large, medium, and small objects.
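The channel dimension of each head follows directly from the anchor layout: 3 anchors per cell, each predicting 4 box offsets, 1 objectness score, and num_classes class scores. A quick sanity check with a hypothetical helper (not part of lucid's API):

```python
def head_channels(num_classes: int, anchors_per_cell: int = 3) -> int:
    """Channels in one YOLOv3 detection head: anchors * (x, y, w, h, obj + classes)."""
    return anchors_per_cell * (5 + num_classes)

print(head_channels(80))  # → 255 for the 80 MS-COCO classes
print(head_channels(20))  # → 75 for the 20 PASCAL VOC classes
```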
Example Usage¶
>>> import lucid
>>> from lucid.models import yolo_v3
>>> model = yolo_v3(num_classes=80)
>>> print(model)
>>> x = lucid.rand(1, 3, 416, 416)
>>> out = model(x)
>>> for o in out:
... print(o.shape)
# Each output shape: (1, 3 * (5 + 80), H, W)
Note
This model uses 3 detection heads at strides 32, 16, and 8. With an input size of 416, these correspond to grid sizes of 13x13, 26x26, and 52x52. Each head predicts 3 bounding boxes per cell, so its output channel dimension is 3 * (5 + num_classes).
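The grid size of each head is simply the input resolution divided by that head's stride, which is why the output shapes scale with image_size. A small sketch (plain Python, independent of lucid):

```python
def grid_sizes(image_size: int = 416, strides: tuple[int, ...] = (32, 16, 8)) -> list[int]:
    """Spatial grid size of each YOLOv3 detection head for a square input."""
    return [image_size // s for s in strides]

print(grid_sizes(416))  # → [13, 26, 52]
print(grid_sizes(608))  # → [19, 38, 76]
```

Note that the input size should be a multiple of 32 so every head's grid divides evenly.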