yolo_v2

ConvNet · One-Stage Detector · Object Detection

lucid.models.yolo_v2(num_classes: int = 20, **kwargs) -> YOLO_V2

The yolo_v2 function returns an instance of the YOLO_V2 model, preconfigured with the original YOLO-v2 architecture.

Total Parameters: 21,287,133

Function Signature

def yolo_v2(num_classes: int = 20, **kwargs) -> YOLO_V2

Parameters

  • num_classes (int, default=20): Number of object categories to detect. This sets the number of output class scores per anchor in the final detection head.

  • kwargs: Additional keyword arguments passed directly to the YOLO_V2 constructor.

    Common kwargs include:

    • darknet (nn.Module, optional): custom backbone

    • num_anchors (int, default=5): number of anchor boxes per grid cell

    • image_size (int, default=416): input image resolution
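The detection head's channel count follows directly from these parameters: each anchor predicts 4 box offsets, 1 objectness score, and num_classes class scores. A minimal sketch of that arithmetic in plain Python (independent of lucid; the helper name is illustrative, not part of the library):

```python
def head_channels(num_classes: int = 20, num_anchors: int = 5) -> int:
    """Output channels of a YOLO-v2-style detection head:
    each anchor predicts 4 box offsets + 1 objectness + num_classes scores."""
    return num_anchors * (5 + num_classes)

print(head_channels(20))  # Pascal-VOC defaults: 5 * (5 + 20) = 125
```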

Returns

  • YOLO_V2: An initialized instance of the YOLO_V2 detection model.

Example Usage

>>> from lucid.models import yolo_v2
>>> model = yolo_v2(num_classes=20)
>>> print(model)

>>> import lucid
>>> x = lucid.rand(1, 3, 416, 416)
>>> out = model(x)
>>> print(out.shape)  # shape: (1, 125, 13, 13) for Pascal-VOC (20 classes, 5 anchors)

Note

The model follows the original YOLO-v2 design with 5 anchor boxes and a grid output shape of \(S \times S \times (B \times (5 + C))\), where \(C\) is the number of classes, \(B=5\) is the number of anchors, and \(S=13\) for a 416×416 input, since the backbone downsamples by a factor of 32.
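The shape formula above can be checked without instantiating the model; a small sketch (plain Python, assuming the stride-32 downsampling noted above; the helper name is illustrative):

```python
def output_shape(batch: int, image_size: int = 416,
                 num_classes: int = 20, num_anchors: int = 5) -> tuple:
    """Detection-head output shape (N, B*(5+C), S, S).
    The backbone downsamples by a factor of 32, so S = image_size // 32."""
    s = image_size // 32
    return (batch, num_anchors * (5 + num_classes), s, s)

print(output_shape(1))  # (1, 125, 13, 13) for Pascal-VOC defaults
```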