Axis-aligned Bounding Boxes

Axis-aligned rectangular bounding boxes have their sides aligned with the coordinate axes of the multidimensional Euclidean space R^D. Axis-aligned boxes have fewer parameters than arbitrarily rotated boxes, which makes them simpler to learn and to operate on.

Box Format

A D-dimensional axis-aligned box in R^D is represented as an array of 2*D values: the coordinates of the bottom-left corner (the minimum along every axis) followed by the coordinates of the top-right corner (the maximum along every axis). Throughout this module, all boxes are defined in the “matrix convention”, also known as “ij-indexed”.

Some other APIs (e.g. Matplotlib) define two-dimensional boxes in the “cartesian” notation, also known as “xy”. Be careful to transpose such boxes before passing them to this API. For example, a two-dimensional box given as x1, y1, x2, y2 in the “xy” format has the following representation in this module:

y1, x1, y2, x2
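The reordering above can be sketched with plain NumPy indexing. This is only an illustration; the helper name `xy_to_ij` is hypothetical and is not part of this module.

```python
import numpy as np

def xy_to_ij(boxes_xy):
    """Reorders [..., 4] boxes from (x1, y1, x2, y2) to (y1, x1, y2, x2).

    Hypothetical helper for illustration; not part of tensorbank.
    """
    boxes_xy = np.asarray(boxes_xy)
    # Swap the two coordinates within each corner; leading dimensions are kept.
    return boxes_xy[..., [1, 0, 3, 2]]

box_xy = [10.0, 20.0, 110.0, 70.0]   # x1, y1, x2, y2
box_ij = xy_to_ij(box_xy)            # y1, x1, y2, x2
print(box_ij.tolist())               # [20.0, 10.0, 70.0, 110.0]
```

The same indexing works for batched inputs, since only the last axis is permuted.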

In a valid box, each component of the bottom corner is less than or equal to the corresponding component of the top corner. Boxes for which this does not hold are degenerate. The values returned for degenerate boxes are undefined, but no exception is raised.
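Because degenerate boxes do not raise an exception, it may be useful to check inputs before calling the API. The following is a minimal NumPy sketch of such a check; the function name `is_valid_box` is hypothetical and is not provided by this module.

```python
import numpy as np

def is_valid_box(boxes):
    """Returns a boolean array [...dims...], True where min <= max on every axis.

    Hypothetical helper for illustration; not part of tensorbank.
    """
    boxes = np.asarray(boxes)
    d = boxes.shape[-1] // 2              # the last axis holds 2*D values
    lo, hi = boxes[..., :d], boxes[..., d:]
    return np.all(lo <= hi, axis=-1)

boxes = np.array([
    [0.0, 0.0, 2.0, 3.0],   # valid
    [5.0, 0.0, 2.0, 3.0],   # degenerate: first min exceeds first max
])
print(is_valid_box(boxes).tolist())  # [True, False]
```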

API

tensorbank.tf.axis_aligned_boxes.area(a)[source]

Computes area of each box in a.

This function is primarily intended for use with batched box matching.

Parameters

a (Tensor [...dims... x 2*D]) – box coordinates, where D is the dimension of the Euclidean space. See also Box Format above.

Returns

box volumes using the standard volume metric in R^D.

Return type

Tensor [...dims...]
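The "standard volume metric" is the product of the box extents along each axis. A minimal NumPy sketch of that computation follows; it is an illustration under that assumption rather than the library's implementation, and the name `box_area` is hypothetical.

```python
import numpy as np

def box_area(a):
    """Volume of each box in a [...dims... x 2*D] array (illustrative sketch)."""
    a = np.asarray(a, dtype=np.float64)
    d = a.shape[-1] // 2
    extents = a[..., d:] - a[..., :d]     # top corner minus bottom corner
    return np.prod(extents, axis=-1)      # shape [...dims...]

boxes_2d = [[0, 0, 2, 3], [1, 1, 4, 5]]
print(box_area(boxes_2d).tolist())        # [6.0, 12.0]

box_3d = [0, 0, 0, 1, 2, 3]
print(float(box_area(box_3d)))            # 6.0
```

This sketch makes no attempt to handle degenerate boxes, for which the library leaves the result undefined.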

tensorbank.tf.axis_aligned_boxes.evenly_spaced(box_counts, box_sizes, image_shape, offset=None, dtype=tf.float32)[source]

Returns “anchor” boxes evenly spaced within the image.

We assume that the image is D-dimensional and give examples for 2 dimensions. The length of every list is equal to the number of scales S in the detector. Boxes for each scale are appended after those of the previous scale.

Please note that box_sizes are defined in tensor order. This differs from the common convention of giving 2D box sizes in W,H order.

Example

The following call returns 8 boxes: 2 boxes centered at each point of a 2x2 grid:

tb.axis_aligned_boxes.evenly_spaced(
    [ (2,2) ],
    [ [(10, 15), (15, 10)] ],
    (224, 224),
).numpy()

>>> [[ -5.   -7.5   5.    7.5]
     [ -7.5  -5.    7.5   5. ]
     [ -5.  104.5   5.  119.5]
     [ -7.5 107.    7.5 117. ]
     [107.   -7.5 117.    7.5]
     [104.5  -5.  119.5   5. ]
     [107.  104.5 117.  119.5]
     [104.5 107.  119.5 117. ]]
Parameters
  • box_counts (List (length S) of D-tuples) – number of elements in a grid along each axis for the box centers. The first grid point is at the position defined by offset. Can be a tf.Tensor.

  • box_sizes (List (length S) of lists (length B_i) of lists (length D)) – box sizes for each scale. This shouldn’t be a Tensor, since different scales can have different numbers of anchors.

  • image_shape – D-tuple that defines the overall image shape. Can be a tf.Tensor.

  • offset – List (length S) of D-tuples that define the offset of the first grid point from 0^D. Not yet implemented.

  • dtype (optional) – type of the returned tensor. Defaults to tf.float32.

Returns

list of boxes

Return type

Tensor [number_of_boxes x 2*D]
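To make the box layout concrete, here is a NumPy sketch that reproduces the output of the example above. It rests on assumptions inferred from that output rather than from the implementation: grid centers sit at i * image_shape / count along each axis, grid points are visited in row-major order, and all sizes of a scale are emitted for each grid point. The name `evenly_spaced_sketch` is hypothetical, and offset is ignored.

```python
import itertools
import numpy as np

def evenly_spaced_sketch(box_counts, box_sizes, image_shape, dtype=np.float32):
    """Illustrative re-implementation; the centering rule is an assumption."""
    image_shape = np.asarray(image_shape, dtype=np.float64)
    boxes = []
    for counts, sizes in zip(box_counts, box_sizes):      # one entry per scale
        step = image_shape / np.asarray(counts, dtype=np.float64)
        # Visit grid points in row-major order.
        for index in itertools.product(*(range(c) for c in counts)):
            center = np.asarray(index) * step
            for size in sizes:                            # sizes in tensor order
                half = np.asarray(size, dtype=np.float64) / 2.0
                boxes.append(np.concatenate([center - half, center + half]))
    return np.asarray(boxes, dtype=dtype)                 # [number_of_boxes x 2*D]

anchors = evenly_spaced_sketch(
    [(2, 2)],
    [[(10, 15), (15, 10)]],
    (224, 224),
)
print(anchors.shape)  # (8, 4)
```

With sizes given in W,H order, each size tuple would need to be reversed before being passed in, since this sketch (like the module) expects tensor order.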

tensorbank.tf.axis_aligned_boxes.intersection(a, b)[source]

Computes intersection box of each pair of boxes in a and b.

Parameters
  • a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

  • b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

Returns

pairwise box intersections. If two boxes do not intersect, their corresponding entry is a degenerate box of area 0.

Return type

Tensor [N x K x M x 2*D]
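The pairwise shape [N x K x M x 2*D] can be obtained by broadcasting: the intersection's bottom corner is the element-wise maximum of the two bottom corners, and its top corner is the element-wise minimum of the two top corners. The NumPy sketch below illustrates this. How the library encodes non-intersecting pairs is not specified beyond "area 0", so clamping the top corner to the bottom corner here is an assumption, and the name `pairwise_intersection` is hypothetical.

```python
import numpy as np

def pairwise_intersection(a, b):
    """Intersection boxes for a [N, K, 2*D] and b [N, M, 2*D] (illustrative sketch)."""
    a = np.asarray(a, dtype=np.float64)[:, :, np.newaxis, :]   # [N, K, 1, 2*D]
    b = np.asarray(b, dtype=np.float64)[:, np.newaxis, :, :]   # [N, 1, M, 2*D]
    d = a.shape[-1] // 2
    lo = np.maximum(a[..., :d], b[..., :d])    # bottom corner of the overlap
    hi = np.minimum(a[..., d:], b[..., d:])    # top corner of the overlap
    hi = np.maximum(hi, lo)                    # assumption: collapse disjoint axes to zero extent
    return np.concatenate([lo, hi], axis=-1)   # [N, K, M, 2*D]

a = [[[0, 0, 2, 2]]]                       # N=1, K=1
b = [[[1, 1, 3, 3], [5, 5, 6, 6]]]         # N=1, M=2
result = pairwise_intersection(a, b)
print(result.shape)                        # (1, 1, 2, 4)
print(result[0, 0].tolist())               # [[1.0, 1.0, 2.0, 2.0], [5.0, 5.0, 5.0, 5.0]]
```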

tensorbank.tf.axis_aligned_boxes.intersection_area(a, b)[source]

Computes intersection area of each pair of boxes in a and b.

This function is primarily intended for use with batched anchor matching. If the number of boxes differs between batches, simply pad the boxes with 0.0 and ignore the corresponding rows.

Parameters
  • a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

  • b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

Returns

pairwise box intersection areas using the standard volume metric in R^D.

Return type

Tensor [N x K x M]
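The padding advice can be illustrated with a small NumPy sketch. It is not the library's implementation; `pairwise_intersection_area` is a hypothetical name, and clipping negative extents to zero is an assumption about how disjoint pairs are handled.

```python
import numpy as np

def pairwise_intersection_area(a, b):
    """Intersection areas for a [N, K, 2*D] and b [N, M, 2*D] (illustrative sketch)."""
    a = np.asarray(a, dtype=np.float64)[:, :, np.newaxis, :]   # [N, K, 1, 2*D]
    b = np.asarray(b, dtype=np.float64)[:, np.newaxis, :, :]   # [N, 1, M, 2*D]
    d = a.shape[-1] // 2
    lo = np.maximum(a[..., :d], b[..., :d])
    hi = np.minimum(a[..., d:], b[..., d:])
    extents = np.clip(hi - lo, 0.0, None)      # disjoint axes contribute zero
    return np.prod(extents, axis=-1)           # [N, K, M]

anchors = [[[0, 0, 4, 4]], [[0, 0, 4, 4]]]     # N=2, K=1
targets = [
    [[2, 2, 6, 6], [0, 0, 1, 1]],              # first batch entry: 2 real boxes
    [[1, 1, 3, 3], [0, 0, 0, 0]],              # second batch entry: 1 real box + zero padding
]
areas = pairwise_intersection_area(anchors, targets)
print(areas.tolist())                          # [[[4.0, 1.0]], [[4.0, 0.0]]]
```

In this sketch a zero-padded box has zero extent, so its intersection area with any valid box is 0; the padded entries should still be ignored when matching.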

tensorbank.tf.axis_aligned_boxes.iou(a, b)[source]

Computes intersection over union of each pair of boxes in a and b.

This function is primarily intended for use with batched anchor matching. If the number of boxes differs between batches, simply pad the boxes with 0.0 and ignore the corresponding rows.

When two boxes do not intersect, their IoU is 0.0. When a pair of boxes has a union area of 0.0 (e.g. when both boxes are empty), their IoU will be NaN.

Parameters
  • a (Tensor [N x K x 2*D]) – box coordinates. N is the batch size, K is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

  • b (Tensor [N x M x 2*D]) – box coordinates. N is the batch size, M is the number of boxes in a batch, D is the dimension of the Euclidean space. See also Box Format above.

Returns

pairwise box IoUs using the standard volume metric in R^D.

Return type

Tensor [N x K x M]
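The zero and NaN cases described above follow directly from the formula IoU = intersection / (area_a + area_b - intersection). The NumPy sketch below shows this under the same assumptions as the rest of this page's box format; it is an illustration, not the library's code, and `pairwise_iou` is a hypothetical name.

```python
import numpy as np

def pairwise_iou(a, b):
    """IoU for a [N, K, 2*D] and b [N, M, 2*D] (illustrative sketch)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    d = a.shape[-1] // 2
    a_exp = a[:, :, np.newaxis, :]                        # [N, K, 1, 2*D]
    b_exp = b[:, np.newaxis, :, :]                        # [N, 1, M, 2*D]
    lo = np.maximum(a_exp[..., :d], b_exp[..., :d])
    hi = np.minimum(a_exp[..., d:], b_exp[..., d:])
    inter = np.prod(np.clip(hi - lo, 0.0, None), axis=-1)  # [N, K, M]
    area_a = np.prod(a[..., d:] - a[..., :d], axis=-1)    # [N, K]
    area_b = np.prod(b[..., d:] - b[..., :d], axis=-1)    # [N, M]
    union = area_a[:, :, np.newaxis] + area_b[:, np.newaxis, :] - inter
    # 0 / 0 yields NaN; silence the NumPy warning for that case.
    with np.errstate(divide="ignore", invalid="ignore"):
        return inter / union

a = [[[0, 0, 2, 2], [0, 0, 0, 0]]]                        # second box is zero padding
b = [[[1, 1, 3, 3], [5, 5, 6, 6], [0, 0, 0, 0]]]          # third box is zero padding
iou = pairwise_iou(a, b)
print(iou.shape)        # (1, 2, 3)
print(iou[0, 0, 1])     # 0.0 (disjoint boxes)
print(iou[0, 1, 2])     # nan (both boxes empty)
```

Here the overlapping pair `iou[0, 0, 0]` has intersection 1 and union 7, giving 1/7.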