8.1.1.7.1.3.1.3. blueoil.networks.object_detection.yolo_v2¶
8.1.1.7.1.3.1.3.1. Module Contents¶
8.1.1.7.1.3.1.3.1.1. Classes¶
YoloV2 | YOLO version 2. |
YoloV2Loss | YOLO v2 loss function. |
8.1.1.7.1.3.1.3.1.2. Functions¶
summary_boxes | Draw images with bounding boxes on TensorBoard. |
format_XYWH_to_CXCYWH | Format from (x, y, w, h) to (center_x, center_y, w, h) along specific dimension. |
format_CXCYWH_to_XYWH | Format from (center_x, center_y, w, h) to (x, y, w, h) along specific dimension. |
format_CXCYWH_to_YX | Format from (center_x, center_y, w, h) to (y1, x1, y2, x2) boxes along specific dimension. |
format_XYWH_to_YX | Format from (x, y, w, h) to (y1, x1, y2, x2) boxes along specific dimension. |
-
class
blueoil.networks.object_detection.yolo_v2.
YoloV2
(num_max_boxes=5, anchors=[(0.25, 0.25), (0.5, 0.5), (1.0, 1.0)], leaky_relu_scale=0.1, object_scale=5.0, no_object_scale=1.0, class_scale=1.0, coordinate_scale=1.0, loss_iou_threshold=0.6, weight_decay_rate=0.0005, score_threshold=0.05, nms_iou_threshold=0.5, nms_max_output_size=100, nms_per_class=True, loss_warmup_steps=200, is_dynamic_image_size=False, use_cross_entropy_loss=True, change_base_output=False, *args, **kwargs)¶ Bases:
blueoil.networks.base.BaseNetwork
YOLO version 2.
Paper: https://arxiv.org/abs/1612.08242
-
placeholders
(self)¶ placeholders
-
summary
(self, output, labels=None)¶ Summary.
- Parameters
output – tensor from inference.
labels – labels tensor.
-
metrics
(self, output, labels, thresholds=[0.3, 0.5, 0.7])¶ Metrics.
- Parameters
output – tensor from inference.
labels – labels tensor.
-
convert_gt_boxes_xywh_to_cxcywh
(self, gt_boxes)¶ Convert gt_boxes format from (x, y, w, h) to (center_x, center_y, w, h).
- Parameters
gt_boxes – 3D tensor [batch_size, max_num_boxes, 5(x, y, w, h, class_id)]
-
static
py_offset_boxes
(num_cell_y, num_cell_x, batch_size, boxes_per_cell, anchors)¶ NumPy implementation of offset_boxes. Returns the YOLO-space offsets of x, y, w, and h.
- Parameters
num_cell_y – Number of cells along the y axis. The spatial dimension of the final convolutional features.
num_cell_x – Number of cells along the x axis. The spatial dimension of the final convolutional features.
batch_size – int, Batch size.
boxes_per_cell – int, number of boxes per cell.
anchors – list of tuples.
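The offset grids can be pictured with a small dependency-free sketch. It is not the blueoil implementation: the helper name `cell_offsets` is hypothetical, the batch dimension is dropped because every batch element would share the same grids, and it assumes that each anchor tuple is ordered `(w, h)` and that the x and y offsets are simply the cell indices.

```python
def cell_offsets(num_cell_y, num_cell_x, anchors):
    """Return (offset_x, offset_y, offset_w, offset_h) as nested lists.

    Each grid is indexed as [y][x][anchor]. Assumption: anchors are (w, h).
    """
    boxes_per_cell = len(anchors)
    # x offset: column index of the cell, repeated for every anchor.
    offset_x = [[[x] * boxes_per_cell for x in range(num_cell_x)]
                for _ in range(num_cell_y)]
    # y offset: row index of the cell, repeated for every anchor.
    offset_y = [[[y] * boxes_per_cell for _ in range(num_cell_x)]
                for y in range(num_cell_y)]
    # w and h offsets: the anchor sizes, identical in every cell.
    offset_w = [[[w for w, _ in anchors] for _ in range(num_cell_x)]
                for _ in range(num_cell_y)]
    offset_h = [[[h for _, h in anchors] for _ in range(num_cell_x)]
                for _ in range(num_cell_y)]
    return offset_x, offset_y, offset_w, offset_h


ox, oy, ow, oh = cell_offsets(2, 3, [(1.0, 2.0), (3.0, 4.0)])
print(ox[1][2], oy[1][2], ow[0][0], oh[0][0])
```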
-
offset_boxes
(self)¶ Return yolo space offset of x and y and w and h.
- Returns
offset_x: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]
offset_y: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]
offset_w: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]
offset_h: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]
-
convert_boxes_space_from_real_to_yolo
(self, boxes)¶ Convert boxes space size from real to yolo.
Real space boxes coordinates are in the interval [0, image_size]. Yolo space boxes x,y are in the interval [-1, 1]. w,h are in the interval [-inf, +inf].
- Parameters
boxes – 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
- Returns
resized_boxes: 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
-
convert_boxes_space_from_yolo_to_real
(self, predict_boxes)¶ Convert predict boxes space size from yolo to real.
Real space boxes coordinates are in the interval [0, image_size]. Yolo space boxes x,y are in the interval [-1, 1]. w,h are in the interval [-inf, +inf].
- Parameters
predict_boxes – 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
- Returns
resized_boxes: 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
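As an illustration of moving a single box between the two spaces, here is a minimal sketch in the style of the YOLO v2 paper's parameterization. It is an assumption, not the blueoil code: the function names are hypothetical, x and y in YOLO space are taken to be offsets from the cell's top-left corner in cell units, w and h are taken to be log-scale factors relative to an anchor given in cell units, `cell` is `(cell_y, cell_x)`, `num_cell` is `(cells_y, cells_x)`, and `image_size` is `(height, width)`.

```python
import math


def yolo_to_real(box, cell, anchor, num_cell, image_size):
    """Map (tx, ty, tw, th) in YOLO space to (center_x, center_y, w, h) in pixels."""
    tx, ty, tw, th = box
    cell_y, cell_x = cell
    anchor_w, anchor_h = anchor
    cells_y, cells_x = num_cell
    height, width = image_size
    center_x = (tx + cell_x) / cells_x * width
    center_y = (ty + cell_y) / cells_y * height
    w = math.exp(tw) * anchor_w / cells_x * width
    h = math.exp(th) * anchor_h / cells_y * height
    return center_x, center_y, w, h


def real_to_yolo(box, cell, anchor, num_cell, image_size):
    """Inverse of yolo_to_real under the same assumptions."""
    center_x, center_y, w, h = box
    cell_y, cell_x = cell
    anchor_w, anchor_h = anchor
    cells_y, cells_x = num_cell
    height, width = image_size
    tx = center_x / width * cells_x - cell_x
    ty = center_y / height * cells_y - cell_y
    tw = math.log(w / width * cells_x / anchor_w)
    th = math.log(h / height * cells_y / anchor_h)
    return tx, ty, tw, th


real = yolo_to_real((0.5, 0.25, 0.0, math.log(2.0)),
                    cell=(1, 3), anchor=(1.0, 2.0),
                    num_cell=(13, 13), image_size=(416, 416))
print(real)
```

With a 13 x 13 grid on a 416 x 416 image, each cell covers 32 pixels, so the round trip through `real_to_yolo` recovers the original YOLO-space values up to floating-point error.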
-
_predictions
(self, output)¶
-
_split_predictions
(self, output)¶ Separate combined final convolution outputs to predictions.
- Parameters
output – combined final convolution outputs 4D Tensor. shape is [batch_size, num_cell[0], num_cell[1], (num_classes + 5) * boxes_per_cell]
- Returns
predict_classes(Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, num_classes]
predict_confidence(Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, 1]
predict_boxes(Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, 4(center_x, center_y, w, h)]
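The split can be illustrated for the channel vector of a single cell. This is a sketch under an assumed channel layout (all class scores first, then one confidence per box, then four coordinates per box); the actual ordering used by blueoil is not stated here, and the helper name `split_cell_predictions` is hypothetical.

```python
def split_cell_predictions(channels, num_classes, boxes_per_cell):
    """Split one cell's channels into per-box classes, confidence, and boxes.

    Assumed layout: [classes for all boxes | confidences | box coordinates].
    """
    expected = (num_classes + 5) * boxes_per_cell
    if len(channels) != expected:
        raise ValueError(f"expected {expected} channels, got {len(channels)}")

    n_cls = num_classes * boxes_per_cell
    n_conf = boxes_per_cell
    flat_classes = channels[:n_cls]
    flat_conf = channels[n_cls:n_cls + n_conf]
    flat_boxes = channels[n_cls + n_conf:]

    # Regroup the flat slices so the leading index is the box within the cell.
    classes = [flat_classes[i * num_classes:(i + 1) * num_classes]
               for i in range(boxes_per_cell)]
    confidence = [[c] for c in flat_conf]
    boxes = [flat_boxes[i * 4:(i + 1) * 4] for i in range(boxes_per_cell)]
    return classes, confidence, boxes


classes, confidence, boxes = split_cell_predictions(
    list(range(14)), num_classes=2, boxes_per_cell=2)
print(classes, confidence, boxes)
```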
-
_concat_predictions
(self, predict_classes, predict_confidence, predict_boxes)¶ Concat predictions to inference output.
-
post_process
(self, output)¶
-
_format_output
(self, output)¶ Format yolov2 inference output to predict boxes list.
- Parameters
output – Tensor of inference() outputs.
- Returns
List of predict_boxes Tensor. The shape is [batch_size, num_predicted_boxes, 6(x, y, w, h, class_id, score)]. The score is calculated from each class probability and the confidence.
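A minimal sketch of how one predicted box might expand into `(x, y, w, h, class_id, score)` rows follows. It assumes that the score is the product of the class probability and the box confidence, that one row is emitted per class, and that the box is converted from centre format to top-left format; the helper name `format_box_rows` is hypothetical.

```python
def format_box_rows(box_cxcywh, class_probs, confidence):
    """Expand one box into one (x, y, w, h, class_id, score) row per class."""
    center_x, center_y, w, h = box_cxcywh
    # Convert the centre point to the top-left corner.
    x = center_x - w / 2
    y = center_y - h / 2
    return [(x, y, w, h, class_id, prob * confidence)
            for class_id, prob in enumerate(class_probs)]


rows = format_box_rows((50, 40, 20, 10), class_probs=[0.25, 0.75], confidence=0.5)
for row in rows:
    print(row)
```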
-
_exclude_low_score_box
(self, formatted_output, threshold=0.05)¶ Exclude low score boxes. The score is calculated from each class probability and the confidence.
- Parameters
formatted_output – Formatted predict_boxes Tensor. The Shape is [batch_size, num_predicted_boxes, 6(x, y, w, h, class_id, score)].
threshold – low threshold of predict score.
- Returns
python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)]. The python list length is the batch size.
-
_nms
(self, formatted_output, iou_threshold, max_output_size, per_class)¶ Non Maximum Suppression.
- Parameters
formatted_output – python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)].
iou_threshold (float) – The threshold for deciding whether boxes overlap with respect to IOU.
max_output_size (int) – The maximum number of boxes to be selected
per_class (boolean) – Whether to apply NMS separately for each class.
- Returns
python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)]. The python list length is the batch size.
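The score filtering and suppression steps can be sketched for a single image as plain greedy NMS. This is an illustrative stand-in, not the blueoil implementation: the names `iou_xywh` and `nms` are hypothetical, rows are `(x, y, w, h, class_id, score)` tuples, and `max_output_size` caps the total number of kept boxes, which may differ from how a per-class cap would behave.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (x, y, w, h, ...) with a top-left origin."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(rows, iou_threshold=0.5, max_output_size=100, per_class=True,
        score_threshold=0.05):
    """Drop low-score rows, then greedily suppress overlapping boxes."""
    candidates = sorted((r for r in rows if r[5] >= score_threshold),
                        key=lambda r: r[5], reverse=True)
    kept = []
    for row in candidates:
        # With per_class=True, only boxes of the same class suppress each other.
        suppressed = any(
            (not per_class or k[4] == row[4]) and iou_xywh(k, row) > iou_threshold
            for k in kept
        )
        if not suppressed:
            kept.append(row)
    return kept[:max_output_size]


rows = [
    (0, 0, 10, 10, 0, 0.9),
    (1, 1, 10, 10, 0, 0.8),    # overlaps the first box, same class
    (1, 1, 10, 10, 1, 0.7),    # overlaps the first box, different class
    (50, 50, 10, 10, 0, 0.01), # below the score threshold
]
print(nms(rows))
print(nms(rows, per_class=False))
```

With `per_class=True` the class-1 box survives because it is only compared against other class-1 boxes; with `per_class=False` it is suppressed by the higher-scoring class-0 box.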
-
loss
(self, output, gt_boxes)¶ Loss.
- Parameters
output – 2D tensor. shape is [batch_size, self.num_cell * self.num_cell * (self.num_classes + self.boxes_per_cell * 5)]
gt_boxes – ground truth boxes 3D tensor. [batch_size, max_num_boxes, 5(x, y, w, h, class_id)].
-
inference
(self, images, is_training)¶ Inference.
- Parameters
images – images tensor. shape is (batch_num, height, width, channel)
-
_reorg
(self, name, inputs, stride, data_format, use_space_to_depth=True, darknet_original=False)¶
-
base
(self, images, is_training)¶ Base network.
- Returns
Output. The output shape depends on data_format.
When data_format is NHWC, shape is [batch_size, num_cell[0], num_cell[1], (num_classes + 5(x, y, w, h, confidence)) * boxes_per_cell(length of anchors)].
When data_format is NCHW, shape is [batch_size, (num_classes + 5(x, y, w, h, confidence)) * boxes_per_cell(length of anchors), num_cell[0], num_cell[1]].
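The shape arithmetic can be checked with a few lines of Python. The helper name `base_output_shape` is hypothetical; it only restates the formula from the description, with boxes_per_cell taken as the number of anchors.

```python
def base_output_shape(batch_size, num_cell, num_classes, anchors, data_format="NHWC"):
    """Compute the expected output shape of the base network."""
    boxes_per_cell = len(anchors)
    # Each box predicts num_classes scores plus x, y, w, h, and confidence.
    channels = (num_classes + 5) * boxes_per_cell
    if data_format == "NHWC":
        return [batch_size, num_cell[0], num_cell[1], channels]
    if data_format == "NCHW":
        return [batch_size, channels, num_cell[0], num_cell[1]]
    raise ValueError(f"unsupported data_format: {data_format}")


anchors = [(0.25, 0.25), (0.5, 0.5), (1.0, 1.0)]
print(base_output_shape(8, (13, 13), 20, anchors))
print(base_output_shape(8, (13, 13), 20, anchors, data_format="NCHW"))
```

For 20 classes and the three default anchors, the channel dimension is (20 + 5) * 3 = 75.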
-
-
class
blueoil.networks.object_detection.yolo_v2.
YoloV2Loss
(is_debug=False, anchors=[(1.0, 1.0), (2.0, 2.0)], num_cell=[4, 4], boxes_per_cell=2, object_scale=5.0, no_object_scale=1.0, class_scale=1.0, coordinate_scale=1.0, loss_iou_threshold=0.6, weight_decay_rate=0.0005, image_size=[448, 448], batch_size=64, classes=[], yolo=None, warmup_steps=100, use_cross_entropy_loss=True)¶ YOLO v2 loss function.
-
_iou_per_gtbox
(self, boxes, box)¶ Calculate ious.
- Parameters
boxes – 4-D np.ndarray [num_cell, num_cell, boxes_per_cell, 4(x_center, y_center, w, h)]
box – 1-D np.ndarray [4(x_center, y_center, w, h)]
- Returns
iou: 3-D np.ndarray [num_cell, num_cell, boxes_per_cell]
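For reference, the IoU of two boxes in (x_center, y_center, w, h) format can be computed as in the scalar sketch below. The function name `iou_cxcywh` is hypothetical, and the method documented above operates on whole arrays of boxes rather than on one pair.

```python
def iou_cxcywh(box1, box2):
    """Intersection over union of two (x_center, y_center, w, h) boxes."""
    def corners(box):
        cx, cy, w, h = box
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    left1, top1, right1, bottom1 = corners(box1)
    left2, top2, right2, bottom2 = corners(box2)

    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(right1, right2) - max(left1, left2))
    inter_h = max(0.0, min(bottom1, bottom2) - max(top1, top2))
    inter = inter_w * inter_h

    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / union if union > 0 else 0.0


print(iou_cxcywh((5, 5, 10, 10), (10, 5, 10, 10)))
```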
-
__iou_gt_boxes
(self, boxes, gt_boxes_list, num_cell)¶
-
_iou_gt_boxes
(self, boxes, gt_boxes_list)¶ Calculate IoUs between the predicted boxes and the gt boxes, and choose the best IoU for each gt box.
- Parameters
boxes – Predicted boxes in real space coordinate. 5-D tensor [batch_size, num_cell, num_cell, boxes_per_cell, 4(x_center, y_center, w, h)].
gt_boxes_list – 3-D tensor [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)]
- Returns
iou: 4-D tensor [batch_size, num_cell, num_cell, boxes_per_cell]
-
_one_iou
(self, box1, box2)¶
-
__calculate_truth_and_masks
(self, gt_boxes_list, predict_boxes, num_cell, image_size, global_step, predict_classes=None, predict_confidence=None)¶ Calculate truth and masks for loss function from gt boxes and predict boxes.
1. When the global step is less than warmup_steps, set cell_gt_boxes and coordinate_masks to manage the coordinate loss for early training steps, which encourages predictions to match the anchors.
2. For non-dummy gt_boxes, calculate the IoU between each gt box and the anchors, and select the best anchor.
3. In the best anchor, create cell_gt_boxes from the gt_boxes, calculate truth_confidence, and set the masks to true.
- Parameters
gt_boxes_list (np.ndarray) – The ground truth boxes. Shape is [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].
predict_boxes (np.ndarray) – Predicted boxes. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
num_cell – Number of cell. num_cell[0] is y axis, num_cell[1] is x axis.
image_size – Image size(px). image_size[0] is height, image_size[1] is width.
global_step (int) – Number of current training step.
predict_classes – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]
predict_confidence – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 1]
- Returns
- cell_gt_boxes: The gt_boxes from gt_boxes_list assigned to each cell anchor. Dummy cell gt boxes are zeros.
Shape is [batch_size, num_cell, num_cell, box_per_cell, 5(center_x, center_y, w, h, class_id)].
- truth_confidence: The confidence value for each cell anchor.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
- object_masks: 1 for cell anchors that have a gt box, 0 otherwise.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
- coordinate_masks: 1 for cell anchors that have a gt box, 0 otherwise.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
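The anchor-selection step described above can be sketched for one ground truth box. This is a simplified illustration, not the blueoil code: the helper name `best_anchor_and_cell` is hypothetical, anchors are assumed to be `(w, h)` in cell units, and the box-to-anchor IoU is assumed to compare shapes only, as if the box and the anchor shared a centre.

```python
def best_anchor_and_cell(gt_box, anchors, num_cell, image_size):
    """Return (cell_y, cell_x, anchor_index) for one gt box.

    gt_box is (center_x, center_y, w, h) in pixels, num_cell is
    (cells_y, cells_x), and image_size is (height, width).
    """
    center_x, center_y, w, h = gt_box
    cell_h = image_size[0] / num_cell[0]
    cell_w = image_size[1] / num_cell[1]

    # The cell that contains the box centre, clamped to the grid.
    cell_y = min(int(center_y // cell_h), num_cell[0] - 1)
    cell_x = min(int(center_x // cell_w), num_cell[1] - 1)

    # Box size in cell units, to match the assumed anchor units.
    w_cells, h_cells = w / cell_w, h / cell_h

    def shape_iou(anchor_w, anchor_h):
        inter = min(w_cells, anchor_w) * min(h_cells, anchor_h)
        return inter / (w_cells * h_cells + anchor_w * anchor_h - inter)

    best = max(range(len(anchors)), key=lambda i: shape_iou(*anchors[i]))
    return cell_y, cell_x, best


print(best_anchor_and_cell((112, 40, 64, 64), [(1.0, 1.0), (2.0, 2.0)],
                           num_cell=(13, 13), image_size=(416, 416)))
```

In this example each cell is 32 pixels wide, so the 64 x 64 box is two cells across and matches the second anchor exactly.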
-
_calculate_truth_and_masks
(self, gt_boxes_list, predict_boxes, global_step, predict_classes=None, predict_confidence=None)¶ Calculate truth and masks for loss function from gt boxes and predict boxes.
- Parameters
gt_boxes_list – The ground truth boxes. Tensor shape is [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].
predict_boxes – Predicted boxes. Tensor shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].
global_step – Integer tensor.
predict_classes – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]
predict_confidence – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 1]
- Returns
- cell_gt_boxes: The gt_boxes from gt_boxes_list assigned to each cell anchor. Dummy cell gt boxes are zeros.
Shape is [batch_size, num_cell, num_cell, box_per_cell, 5(center_x, center_y, w, h, class_id)].
- truth_confidence: The confidence value for each cell anchor.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
- object_masks: 1 for cell anchors that have a gt box, 0 otherwise.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
- coordinate_masks: 1 for cell anchors that have a gt box, 0 otherwise.
Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].
-
_weight_decay_loss
(self)¶ L2 weight decay (regularization) loss.
-
__call__
(self, predict_classes, predict_confidence, predict_boxes, gt_boxes, global_step)¶ Loss function.
- Parameters
predict_classes – [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]
predict_confidence – [batch_size, num_cell, num_cell, boxes_per_cell, 1]
predict_boxes – [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)]
gt_boxes – ground truth boxes 3D tensor. [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].
global_step – integer tensor.
- Returns
loss: loss value scalar tensor.
-
-
blueoil.networks.object_detection.yolo_v2.
summary_boxes
(tag, images, boxes, image_size, max_outputs=3, data_format='NHWC')¶ Draw images with bounding boxes on TensorBoard.
- Parameters
tag – name of summary tag.
images – Tensor of images [batch_size, height, width, 3].
boxes – Tensor of boxes. Assumed shape is [batch_size, num_boxes, 4(y1, x1, y2, x2)].
image_size – python list image size [height, width].
-
blueoil.networks.object_detection.yolo_v2.
format_XYWH_to_CXCYWH
(boxes, axis=1)¶ Format from (x, y, w, h) to (center_x, center_y, w, h) along specific dimension.
- Parameters
boxes – a Tensor that includes boxes. [:, 4(x, y, w, h)]
axis – which dimension of the inputs Tensor is boxes.
-
blueoil.networks.object_detection.yolo_v2.
format_CXCYWH_to_XYWH
(boxes, axis=1)¶ Format from (center_x, center_y, w, h) to (x, y, w, h) along specific dimension.
- Parameters
boxes – a Tensor that includes boxes. [:, 4(center_x, center_y, w, h)]
axis – which dimension of the inputs Tensor is boxes.
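The arithmetic behind the two conversions between corner-origin and centre formats is shown below for a single box. The function names are hypothetical scalar stand-ins; the module functions operate on tensors along the given axis.

```python
def xywh_to_cxcywh(box):
    """(x, y, w, h) with a top-left origin -> (center_x, center_y, w, h)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2, w, h)


def cxcywh_to_xywh(box):
    """(center_x, center_y, w, h) -> (x, y, w, h) with a top-left origin."""
    center_x, center_y, w, h = box
    return (center_x - w / 2, center_y - h / 2, w, h)


centered = xywh_to_cxcywh((10, 20, 30, 40))
print(centered)
print(cxcywh_to_xywh(centered))
```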
-
blueoil.networks.object_detection.yolo_v2.
format_CXCYWH_to_YX
(inputs, axis=1)¶ Format from (center_x, center_y, w, h) to (y1, x1, y2, x2) boxes along specific dimension.
- Parameters
inputs – a Tensor that includes boxes.
axis – which dimension of the inputs Tensor is boxes.
-
blueoil.networks.object_detection.yolo_v2.
format_XYWH_to_YX
(inputs, axis=1)¶ Format from (x, y, w, h) to (y1, x1, y2, x2) boxes along specific dimension.
- Parameters
inputs – a Tensor that includes boxes.
axis – which dimension of the inputs Tensor is boxes.
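The two conversions to (y1, x1, y2, x2) corners reduce to the following per-box arithmetic. The function names are hypothetical scalar stand-ins for the tensor functions above, assuming (x, y) is the top-left corner in the XYWH format.

```python
def xywh_to_yx(box):
    """(x, y, w, h) with a top-left origin -> (y1, x1, y2, x2)."""
    x, y, w, h = box
    return (y, x, y + h, x + w)


def cxcywh_to_yx(box):
    """(center_x, center_y, w, h) -> (y1, x1, y2, x2)."""
    center_x, center_y, w, h = box
    return (center_y - h / 2, center_x - w / 2,
            center_y + h / 2, center_x + w / 2)


print(xywh_to_yx((10, 20, 30, 40)))
print(cxcywh_to_yx((25, 40, 30, 40)))
```

Both calls describe the same rectangle, so they yield the same corners.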