Conv3d
- class Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, conv_mode='cross_correlation')
Applies a 3D convolution over an input tensor.
For instance, given an input of size \((N, C_{\text{in}}, T_{\text{in}}, H_{\text{in}}, W_{\text{in}})\), this layer produces an output of size \((N, C_{\text{out}}, T_{\text{out}}, H_{\text{out}}, W_{\text{out}})\), computed as follows:
\[\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)\]
where \(\star\) is the valid 3D cross-correlation operator, \(N\) is the batch size, and \(C\) denotes the number of channels.
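The formula above can be traced by hand with a minimal pure-Python sketch. It is not the library's implementation: the helper names `cross_correlate_3d`, `conv3d_one_output` and `ones` are hypothetical, and the sketch assumes stride 1, no padding, no dilation and a single group, working on nested lists for one sample and one output channel.

```python
from itertools import product


def cross_correlate_3d(volume, kernel):
    """Valid 3D cross-correlation of volume[T][H][W] with kernel[kT][kH][kW]."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out_t, out_h, out_w = T - kT + 1, H - kH + 1, W - kW + 1
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(out_t)]
    for t, h, w in product(range(out_t), range(out_h), range(out_w)):
        # The kernel slides over the volume without being flipped.
        out[t][h][w] = float(sum(
            kernel[a][b][c] * volume[t + a][h + b][w + c]
            for a, b, c in product(range(kT), range(kH), range(kW))
        ))
    return out


def conv3d_one_output(inputs, weights, bias):
    """bias + sum over input channels k of weights[k] (star) inputs[k]."""
    maps = [cross_correlate_3d(x, w) for x, w in zip(inputs, weights)]
    first = maps[0]
    return [
        [
            [bias + sum(m[t][h][w] for m in maps)
             for w in range(len(first[0][0]))]
            for h in range(len(first[0]))
        ]
        for t in range(len(first))
    ]


def ones(t, h, w):
    """Nested-list volume of shape (t, h, w) filled with 1.0."""
    return [[[1.0] * w for _ in range(h)] for _ in range(t)]


# Two input channels of ones, 2x2x2 kernels of ones, bias 1.0:
# every output element is 1.0 + 2 * 8 = 17.0, on a 2x2x2 output grid.
result = conv3d_one_output([ones(3, 3, 3)] * 2, [ones(2, 2, 2)] * 2, 1.0)
```

Because the kernel is not flipped, this is cross-correlation rather than convolution in the strict mathematical sense, which matches the default `conv_mode='cross_correlation'`.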
When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also known as depthwise convolution.
In other words, for an input of size \((N, C_{\text{in}}, T_{\text{in}}, H_{\text{in}}, W_{\text{in}})\), a depthwise convolution with depthwise multiplier K can be constructed with the arguments \((in\_channels=C_{\text{in}}, out\_channels=C_{\text{in}} \times K, ..., groups=C_{\text{in}})\).
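As a small illustration of the depthwise rule, the sketch below builds the constructor arguments for a given channel count and multiplier. The helper `depthwise_conv3d_kwargs` is hypothetical and not part of the library; it only encodes the relationship `groups == in_channels` and `out_channels == K * in_channels` stated above.

```python
def depthwise_conv3d_kwargs(in_channels, multiplier, kernel_size):
    """Keyword arguments for a depthwise Conv3d with depthwise multiplier K."""
    if in_channels < 1 or multiplier < 1:
        raise ValueError("in_channels and multiplier must be positive integers")
    return {
        "in_channels": in_channels,
        "out_channels": in_channels * multiplier,  # K * in_channels
        "kernel_size": kernel_size,
        "groups": in_channels,  # one group per input channel
    }


kwargs = depthwise_conv3d_kwargs(in_channels=4, multiplier=2, kernel_size=3)
```

Assuming the module is imported as `M`, as in the example at the end of this page, the dictionary could then be passed along as `M.Conv3d(**kwargs)`.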
- Parameters
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[int, Tuple[int, int, int]]) – size of weight on spatial dimensions. If kernel_size is an int, the actual kernel size would be (kernel_size, kernel_size, kernel_size).
stride (Union[int, Tuple[int, int, int]]) – stride of the 3D convolution operation. Default: 1.
padding (Union[int, Tuple[int, int, int]]) – size of the paddings added to the input on both sides of its spatial dimensions. Only zero-padding is supported. Default: 0.
dilation (Union[int, Tuple[int, int, int]]) – dilation of the 3D convolution operation. Default: 1.
groups (int) – number of groups into which the input and output channels are divided, so as to perform a grouped convolution. When groups is not 1, in_channels and out_channels must be divisible by groups, and the shape of weight should be (groups, out_channels // groups, in_channels // groups, depth, height, width). Default: 1.
bias (bool) – whether to add a bias onto the result of convolution. Default: True.
conv_mode (str) – supports cross_correlation. Default: cross_correlation.
- Shape:
input: \((N, C_{\text{in}}, T_{\text{in}}, H_{\text{in}}, W_{\text{in}})\).
output: \((N, C_{\text{out}}, T_{\text{out}}, H_{\text{out}}, W_{\text{out}})\).
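This page does not spell out how the output sizes are derived from the input sizes. The sketch below assumes the standard convolution arithmetic, `floor((in + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1` per spatial axis; that formula is an assumption rather than something stated here, and `conv3d_output_shape` and `_triple` are hypothetical helper names.

```python
def _triple(value):
    """Expand an int to a 3-tuple; pass a 3-sequence through as a tuple."""
    return (value,) * 3 if isinstance(value, int) else tuple(value)


def conv3d_output_shape(input_shape, out_channels, kernel_size,
                        stride=1, padding=0, dilation=1):
    """Expected (N, C_out, T_out, H_out, W_out) under the assumed formula."""
    n, _, *spatial = input_shape
    k, s, p, d = map(_triple, (kernel_size, stride, padding, dilation))
    out_spatial = tuple(
        (size + 2 * p[i] - d[i] * (k[i] - 1) - 1) // s[i] + 1
        for i, size in enumerate(spatial)
    )
    return (n, out_channels) + out_spatial


# Same configuration as the example at the end of this page.
shape = conv3d_output_shape((2, 3, 4, 4, 4), out_channels=1, kernel_size=3)
```

For that configuration the helper gives `(2, 1, 2, 2, 2)`, which agrees with the output shape shown in the Examples section.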
Note
weight usually has shape (out_channels, in_channels, depth, height, width); if groups is not 1, the shape will be (groups, out_channels // groups, in_channels // groups, depth, height, width).
bias usually has shape (1, out_channels, *1).
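The shape rules in this note can be sketched as a small helper. `conv3d_param_shapes` is a hypothetical name, not a library function, and the bias shape assumes that `*1` stands for one trailing 1 per spatial axis, i.e. `(1, out_channels, 1, 1, 1)`.

```python
def conv3d_param_shapes(in_channels, out_channels, kernel_size, groups=1):
    """Return (weight_shape, bias_shape) following the note above."""
    if in_channels % groups or out_channels % groups:
        raise ValueError(
            "in_channels and out_channels must be divisible by groups")
    depth, height, width = (
        (kernel_size,) * 3 if isinstance(kernel_size, int)
        else tuple(kernel_size)
    )
    if groups == 1:
        weight = (out_channels, in_channels, depth, height, width)
    else:
        weight = (groups, out_channels // groups, in_channels // groups,
                  depth, height, width)
    # Assumption: "*1" expands to three trailing ones for a 3D convolution.
    bias = (1, out_channels, 1, 1, 1)
    return weight, bias


dense = conv3d_param_shapes(3, 1, 3)
grouped = conv3d_param_shapes(4, 8, (1, 3, 3), groups=4)
```

With `groups=4`, 4 input channels and 8 output channels, each group maps 1 input channel to 2 output channels, so the weight gains a leading group axis.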
- Returns
module. The instance of the Conv3d module.
- Return type
Examples
>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> m = M.Conv3d(in_channels=3, out_channels=1, kernel_size=3)
>>> inp = mge.tensor(np.arange(0, 384).astype("float32").reshape(2, 3, 4, 4, 4))
>>> oup = m(inp)
>>> oup.numpy().shape
(2, 1, 2, 2, 2)