megengine.functional.nn.roi_align
- roi_align(inp, rois, output_shape, mode='average', spatial_scale=1.0, sample_points=2, aligned=True)
Applies RoI (Region of Interest) Align to the input feature map, as described in Mask R-CNN.
- Parameters
inp (Tensor) – the input feature map, a tensor of shape (n, c, h, w).
rois (Tensor) – the Regions of Interest, a tensor of shape (K, 5) holding K boxes in (idx, x1, y1, x2, y2) format, from which the regions are taken. The corner coordinates (x1, y1) and (x2, y2) must satisfy 0 <= x1 < x2 and 0 <= y1 < y2. The first column idx must contain the index of the corresponding element in the input batch, i.e. a number in [0, n - 1].
output_shape (Union[int, tuple, list]) – the (height, width) shape of the output feature for each RoI.
mode (str) – “max” or “average”; selects max or average align, analogous to max or average pooling. Default: “average”
spatial_scale (float) – factor by which the input boxes are scaled. Default: 1.0
sample_points (Union[int, tuple, list]) – number of input samples to take for each output sample; 0 takes samples densely. Default: 2
aligned (bool) – whether to align the input feature. With aligned=True, the RoI is first scaled appropriately and then shifted by -0.5. Default: True
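The constraints on rois and the effect of spatial_scale and aligned described above can be sketched in plain Python. This only illustrates the documented behaviour and is not MegEngine's implementation; the helper names check_roi and to_feature_coords are hypothetical and do not exist in the library.

```python
def check_roi(roi, batch_size):
    """Validate one (idx, x1, y1, x2, y2) row against the documented constraints."""
    idx, x1, y1, x2, y2 = roi
    if not (0 <= idx <= batch_size - 1):
        raise ValueError(f"idx must be in [0, {batch_size - 1}], got {idx}")
    if not (0 <= x1 < x2 and 0 <= y1 < y2):
        raise ValueError("box must satisfy 0 <= x1 < x2 and 0 <= y1 < y2")
    return True


def to_feature_coords(roi, spatial_scale=1.0, aligned=True):
    """Scale the box by spatial_scale, then shift it by -0.5 if aligned is True."""
    idx, x1, y1, x2, y2 = roi
    offset = 0.5 if aligned else 0.0
    return (
        idx,
        x1 * spatial_scale - offset,
        y1 * spatial_scale - offset,
        x2 * spatial_scale - offset,
        y2 * spatial_scale - offset,
    )


# A box given in image coordinates, mapped onto a feature map at 1/4 resolution.
roi = (0, 16.0, 32.0, 48.0, 64.0)
check_roi(roi, batch_size=1)
print(to_feature_coords(roi, spatial_scale=0.25))  # (0, 3.5, 7.5, 11.5, 15.5)
```

With aligned=False the same box would map to (0, 4.0, 8.0, 12.0, 16.0), i.e. scaling only, without the half-pixel shift.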
- Return type
Tensor
- Returns
output tensor.
Examples
>>> import numpy as np
>>> import megengine.functional as F
>>> from megengine import Tensor
>>> np.random.seed(42)
>>> inp = Tensor(np.random.randn(1, 1, 128, 128))
>>> rois = Tensor(np.random.random((4, 5)))
>>> y = F.vision.roi_align(inp, rois, (2, 2))
>>> y.numpy()[0].round(decimals=4)
array([[[0.175 , 0.175 ],
        [0.1359, 0.1359]]], dtype=float32)
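To make the roles of output_shape, sample_points and mode='average' more concrete, here is a minimal pure-Python sketch of average-mode RoI Align for a single channel and a single box, following the general idea from Mask R-CNN (bilinear sampling on a regular grid inside each output bin). It is not MegEngine's kernel. The function names are hypothetical, sample_points is assumed to be a positive int (the dense mode selected by 0 is not covered), and clamping sample points to the feature map border is an assumption of this sketch.

```python
import math


def bilinear(feat, y, x):
    """Bilinearly interpolate the 2-D list feat at the fractional point (y, x)."""
    h, w = len(feat), len(feat[0])
    # Clamp the sample point to the valid range (an assumption of this sketch).
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = feat[y0][x0] * (1 - lx) + feat[y0][x1] * lx
    bottom = feat[y1][x0] * (1 - lx) + feat[y1][x1] * lx
    return top * (1 - ly) + bottom * ly


def roi_align_average(feat, box, output_shape, spatial_scale=1.0,
                      sample_points=2, aligned=True):
    """Average-mode RoI Align for one channel and one (x1, y1, x2, y2) box."""
    out_h, out_w = output_shape
    offset = 0.5 if aligned else 0.0
    x1, y1, x2, y2 = (c * spatial_scale - offset for c in box)
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Regular sample_points x sample_points grid inside bin (i, j).
            for sy in range(sample_points):
                for sx in range(sample_points):
                    y = y1 + (i + (sy + 0.5) / sample_points) * bin_h
                    x = x1 + (j + (sx + 0.5) / sample_points) * bin_w
                    total += bilinear(feat, y, x)
            row.append(total / (sample_points * sample_points))
        out.append(row)
    return out


# Feature map whose value equals its column index, so results are easy to check.
ramp = [[float(x) for x in range(8)] for _ in range(8)]
print(roi_align_average(ramp, (1, 1, 5, 5), (2, 2), aligned=False))
# [[2.0, 4.0], [2.0, 4.0]]
```

On this ramp each output value is the mean x coordinate of the samples in its bin. With aligned=True the same call yields 1.5 and 3.5 per row, which shows the -0.5 shift directly.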