Axis-aligned Bounding Boxes
Axis-aligned rectangular bounding boxes have their sides aligned with the coordinate axes of the multidimensional Euclidean space R^D. Axis-aligned boxes have fewer parameters than arbitrarily rotated boxes, which simplifies both learning them and operating on them.
Box Format
A D-dimensional axis-aligned box in R^D is represented as an array of 2*D values: the coordinates of the bottom-left corner followed by the coordinates of the top-right corner. Throughout this module, all boxes are defined in “matrix convention”, also known as “ij-indexed”.
Some other APIs (e.g. Matplotlib) define two-dimensional boxes in the “cartesian” notation, also known as “xy”. Be careful to transpose such boxes before passing them to this API. For example, a two-dimensional box with corners (x1, y1) and (x2, y2) in the “xy” format has the following representation in this module:
y1, x1, y2, x2
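The reordering described above can be sketched with NumPy. This is only an illustration: the helper name `xy_to_ij` is hypothetical and not part of this module, and it assumes the incoming boxes are stored as x1, y1, x2, y2 along the last axis.

```python
import numpy as np

def xy_to_ij(boxes_xy):
    """Reorder [..., (x1, y1, x2, y2)] boxes into [..., (y1, x1, y2, x2)].

    Hypothetical helper; assumes the "xy" boxes store corner coordinates
    (not width/height) along the last axis.
    """
    boxes_xy = np.asarray(boxes_xy)
    # Swap the two coordinates inside each corner.
    return boxes_xy[..., [1, 0, 3, 2]]

box_xy = np.array([10.0, 20.0, 30.0, 60.0])  # x1, y1, x2, y2
box_ij = xy_to_ij(box_xy)                    # y1, x1, y2, x2
```

Because the indexing acts on the last axis only, the same helper works for a single box or for any batch of boxes.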
In a correct box, each component of the bottom-left corner is less than or equal to the corresponding component of the top-right corner. Boxes for which this does not hold are degenerate. The values returned for degenerate boxes are undefined, but no exception is raised.
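Since degenerate boxes do not raise an exception, it may be useful to check inputs before passing them on. The sketch below is one possible way to do this in NumPy; `is_valid_box` is a hypothetical name, not a function of this module.

```python
import numpy as np

def is_valid_box(boxes):
    """Return True for each box whose first corner is <= its second corner.

    Hypothetical helper; `boxes` has shape [..., 2*D].
    """
    boxes = np.asarray(boxes)
    d = boxes.shape[-1] // 2
    # Compare the two corners component by component along the last axis.
    return np.all(boxes[..., :d] <= boxes[..., d:], axis=-1)

boxes = np.array([
    [0.0, 0.0, 2.0, 3.0],  # correct box
    [5.0, 0.0, 1.0, 3.0],  # degenerate: first component decreases
])
valid = is_valid_box(boxes)
```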
API
- tensorbank.tf.axis_aligned_boxes.area(a)
Computes the area of each box in a.
This function is primarily intended for use with batched box matching.
- Parameters
a (Tensor [...dims... x 2*D]) – box coordinates, where D is the dimension of the Euclidean space. See also Box Format above.
- Returns
box volumes using the standard volume metric in R^D.
- Return type
Tensor [...dims...]
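To make the shape contract concrete, here is a minimal NumPy sketch of a box-volume computation with the same input and output shapes. It is an illustration under the Box Format above, not the module's implementation, and `area_sketch` is a hypothetical name.

```python
import numpy as np

def area_sketch(a):
    """Product of side lengths for boxes of shape [..., 2*D]."""
    a = np.asarray(a, dtype=np.float64)
    d = a.shape[-1] // 2
    # Side length along each axis: second corner minus first corner.
    sides = a[..., d:] - a[..., :d]
    return np.prod(sides, axis=-1)

boxes = np.array([[[0.0, 0.0, 2.0, 3.0],
                   [1.0, 1.0, 4.0, 5.0]]])  # shape [1, 2, 4], D = 2
areas = area_sketch(boxes)                  # shape [1, 2]
```

This sketch does not treat degenerate boxes specially, so its results for them are not meaningful.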
- tensorbank.tf.axis_aligned_boxes.evenly_spaced(box_counts, box_sizes, image_shape, offset=None, dtype=tf.float32)
Returns “anchor” boxes evenly spaced within the image.
We assume that the image is D-dimensional and give examples for 2 dimensions. Every list argument has length S, the number of scales in the detector. Boxes for each scale are appended after those of the previous scale.
Please note that box_sizes are given in tensor order, matching the “ij-indexed” box format. This differs from the common W,H order used for 2D box sizes.
Example
The following call returns 8 boxes: 2 boxes centered at each point of a 2x2 grid:
tb.axis_aligned_boxes.evenly_spaced(
    [(2, 2)],
    [[(10, 15), (15, 10)]],
    (224, 224),
).numpy()
>>> [[ -5.   -7.5   5.    7.5]
     [ -7.5  -5.    7.5   5. ]
     [ -5.  104.5   5.  119.5]
     [ -7.5 107.    7.5 117. ]
     [107.   -7.5 117.    7.5]
     [104.5  -5.  119.5   5. ]
     [107.  104.5 117.  119.5]
     [104.5 107.  119.5 117. ]]
- Parameters
box_counts (List (length S) of D-tuples) – number of grid points along each axis for the box centers. The first grid point is at the position defined by offset. Can be a tf.Tensor.
box_sizes (List (length S) of lists (length B_i) of lists (length D)) – box sizes for each scale. This should not be a Tensor, since different scales can have different numbers of anchors.
image_shape – D-tuple that defines the overall image shape. Can be a tf.Tensor.
offset – List (length S) of D-tuples that define the offset of the first grid point from 0^D. Not yet implemented.
dtype (optional) – type of the returned tensor. Defaults to tf.float32.
- Returns
list of boxes
- Return type
Tensor [number_of_boxes x 2*D]
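The layout of the returned boxes can be illustrated with a NumPy sketch. The grid rule used here (centers at multiples of image_shape / box_count, starting at 0) is an assumption inferred from the example output above, not taken from the module's source; `evenly_spaced_sketch` is a hypothetical name, and the unimplemented offset argument is omitted.

```python
import numpy as np

def evenly_spaced_sketch(box_counts, box_sizes, image_shape):
    """Anchor boxes on a regular grid, one group of sizes per grid point.

    Assumption: centers sit at i * image_shape / count along each axis.
    """
    image_shape = np.asarray(image_shape, dtype=np.float64)
    all_boxes = []
    for counts, sizes in zip(box_counts, box_sizes):
        counts = np.asarray(counts)
        d = len(counts)
        step = image_shape / counts
        # Center coordinates along each axis, then the full ij-ordered grid.
        axes = [np.arange(n) * s for n, s in zip(counts, step)]
        centers = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
        centers = centers.reshape(-1, 1, d)                      # [G, 1, D]
        half = np.asarray(sizes, dtype=np.float64)[None] / 2.0   # [1, B, D]
        # Broadcast every center against every size: [G, B, 2*D].
        boxes = np.concatenate([centers - half, centers + half], axis=-1)
        all_boxes.append(boxes.reshape(-1, 2 * d))
    # Boxes of each scale follow those of the previous scale.
    return np.concatenate(all_boxes, axis=0)

boxes = evenly_spaced_sketch([(2, 2)], [[(10, 15), (15, 10)]], (224, 224))
```

For these inputs the sketch yields the same 8 rows as the documented example, with all sizes for one grid point listed before moving to the next grid point.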
- tensorbank.tf.axis_aligned_boxes.intersection(a, b)
Computes the intersection box of each pair of boxes in a and b.
- Parameters
a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
- Returns
pairwise box intersections. If two boxes do not intersect, their corresponding entry is a degenerate box of area 0.
- Return type
Tensor [N x K x M x 2*D]
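The pairwise shape [N x K x M x 2*D] follows from broadcasting a against b. The NumPy sketch below shows one way to obtain it. It is not the module's implementation: `intersection_sketch` is a hypothetical name, and clamping the upper corner to the lower corner is an assumed way of producing the zero-area box for non-overlapping pairs.

```python
import numpy as np

def intersection_sketch(a, b):
    """Pairwise intersection boxes for a [N, K, 2*D] and b [N, M, 2*D]."""
    a = np.asarray(a, dtype=np.float64)[:, :, None, :]  # [N, K, 1, 2*D]
    b = np.asarray(b, dtype=np.float64)[:, None, :, :]  # [N, 1, M, 2*D]
    d = a.shape[-1] // 2
    # Lower corner: the larger of the two lower corners.
    lo = np.maximum(a[..., :d], b[..., :d])
    # Upper corner: the smaller of the two upper corners, clamped to lo
    # so that non-overlapping pairs collapse to a box of area 0.
    hi = np.maximum(np.minimum(a[..., d:], b[..., d:]), lo)
    return np.concatenate([lo, hi], axis=-1)

a = np.array([[[0.0, 0.0, 2.0, 2.0]]])                        # [1, 1, 4]
b = np.array([[[1.0, 1.0, 3.0, 3.0], [5.0, 5.0, 6.0, 6.0]]])  # [1, 2, 4]
inter = intersection_sketch(a, b)                             # [1, 1, 2, 4]
```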
- tensorbank.tf.axis_aligned_boxes.intersection_area(a, b)
Computes the intersection area of each pair of boxes in a and b.
This function is primarily intended for use with batched anchor matching. If the number of boxes differs between batch elements, pad the boxes with 0.0 and ignore the corresponding entries of the result.
- Parameters
a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
- Returns
pairwise box intersection areas using the standard volume metric in R^D.
- Return type
Tensor [N x K x M]
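The padding advice can be illustrated with a NumPy sketch in which the second batch element has only one real box in b and is padded with a box of zeros. This is an illustration with a hypothetical name (`intersection_area_sketch`), not the module's implementation.

```python
import numpy as np

def intersection_area_sketch(a, b):
    """Pairwise intersection areas for a [N, K, 2*D] and b [N, M, 2*D]."""
    a = np.asarray(a, dtype=np.float64)[:, :, None, :]  # [N, K, 1, 2*D]
    b = np.asarray(b, dtype=np.float64)[:, None, :, :]  # [N, 1, M, 2*D]
    d = a.shape[-1] // 2
    lo = np.maximum(a[..., :d], b[..., :d])
    hi = np.minimum(a[..., d:], b[..., d:])
    # Negative overlaps mean the boxes are disjoint along that axis.
    sides = np.maximum(hi - lo, 0.0)
    return np.prod(sides, axis=-1)                      # [N, K, M]

a = np.array([[[0.0, 0.0, 2.0, 2.0]],
              [[0.0, 0.0, 4.0, 4.0]]])                  # [2, 1, 4]
b = np.array([[[1.0, 1.0, 3.0, 3.0], [0.0, 0.0, 1.0, 1.0]],
              [[2.0, 2.0, 6.0, 6.0], [0.0, 0.0, 0.0, 0.0]]])  # last box is padding
areas = intersection_area_sketch(a, b)                  # [2, 1, 2]
```

In this sketch the padded entry produces an area of 0.0, which the caller still has to ignore or mask as the text above says.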
- tensorbank.tf.axis_aligned_boxes.iou(a, b)
Computes the intersection over union (IoU) of each pair of boxes in a and b.
This function is primarily intended for use with batched anchor matching. If the number of boxes differs between batch elements, pad the boxes with 0.0 and ignore the corresponding entries of the result.
When two boxes do not intersect, their IoU is 0.0. When a pair of boxes has a union area of 0.0 (e.g. when both boxes are empty), their IoU is NaN.
- Parameters
a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.
- Returns
pairwise box IoUs using the standard volume metric in R^D.
- Return type
Tensor [N x K x M]
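The two special cases described for iou (0.0 for disjoint boxes, NaN for a union of 0.0) can be reproduced with a small NumPy sketch. The name `iou_sketch` is hypothetical, and the code only illustrates the arithmetic; it is not the module's implementation.

```python
import numpy as np

def iou_sketch(a, b):
    """Pairwise IoU for a [N, K, 2*D] and b [N, M, 2*D]."""
    a = np.asarray(a, dtype=np.float64)[:, :, None, :]  # [N, K, 1, 2*D]
    b = np.asarray(b, dtype=np.float64)[:, None, :, :]  # [N, 1, M, 2*D]
    d = a.shape[-1] // 2
    lo = np.maximum(a[..., :d], b[..., :d])
    hi = np.minimum(a[..., d:], b[..., d:])
    inter = np.prod(np.maximum(hi - lo, 0.0), axis=-1)  # [N, K, M]
    area_a = np.prod(a[..., d:] - a[..., :d], axis=-1)  # [N, K, 1]
    area_b = np.prod(b[..., d:] - b[..., :d], axis=-1)  # [N, 1, M]
    union = area_a + area_b - inter
    # 0 / 0 yields NaN; silence the NumPy warning for that case.
    with np.errstate(divide="ignore", invalid="ignore"):
        return inter / union

a = np.array([[[0.0, 0.0, 2.0, 2.0],
               [0.0, 0.0, 0.0, 0.0]]])                  # second box is empty
b = np.array([[[1.0, 1.0, 3.0, 3.0],
               [5.0, 5.0, 6.0, 6.0],
               [0.0, 0.0, 0.0, 0.0]]])                  # third box is empty
ious = iou_sketch(a, b)                                 # [1, 2, 3]
```

Here the first pair overlaps in a unit square (IoU 1/7), the disjoint pair gives 0.0, and the pair of two empty boxes gives NaN.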