Conv2d

class Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, conv_mode='cross_correlation', compute_mode='default', **kwargs)

Applies a 2D convolution over an input tensor.

For instance, given an input of size \((N, C_{\text{in}}, H, W)\), this layer generates an output of size \((N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})\) through the process described below:

\[\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)\]

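where \(\star\) is the valid 2D cross-correlation operator, \(N\) is the batch size, \(C\) denotes the number of channels, \(H\) is the height of the input planes in pixels, and \(W\) is their width in pixels.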
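The formula above can be sketched in plain NumPy. This is only an illustrative reference, assuming stride 1, no padding, no dilation, and groups=1; the helper name `conv2d_reference` is hypothetical and is not part of MegEngine.

```python
import numpy as np

def conv2d_reference(inp, weight, bias):
    """Naive valid 2D cross-correlation (hypothetical helper, not a MegEngine API).

    Assumes stride 1, no padding, no dilation, groups=1.
    inp:    (N, C_in, H, W)
    weight: (C_out, C_in, KH, KW)
    bias:   (C_out,)
    """
    n, c_in, h, w = inp.shape
    c_out, _, kh, kw = weight.shape
    h_out, w_out = h - kh + 1, w - kw + 1
    out = np.zeros((n, c_out, h_out, w_out), dtype=inp.dtype)
    for i in range(h_out):
        for j in range(w_out):
            # Window of shape (N, C_in, KH, KW) under the kernel at (i, j).
            window = inp[:, :, i:i + kh, j:j + kw]
            # Sum over C_in, KH, KW for every (batch, output channel) pair.
            out[:, :, i, j] = np.tensordot(window, weight, axes=([1, 2, 3], [1, 2, 3]))
    # Add the per-output-channel bias.
    return out + bias.reshape(1, c_out, 1, 1)

inp = np.arange(0, 96, dtype="float32").reshape(2, 3, 4, 4)
weight = np.ones((1, 3, 3, 3), dtype="float32")
bias = np.zeros(1, dtype="float32")
out = conv2d_reference(inp, weight, bias)
print(out.shape)  # (2, 1, 2, 2)
```

The kernel is not flipped before it is applied, which is what distinguishes cross-correlation from a mathematical convolution.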

In general, the shape of the output feature map can be derived as follows:

input: \((N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})\)

output: \((N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})\) where

\[\text{H}_{out} = \lfloor \frac{\text{H}_{in} + 2 * \text{padding[0]} - \text{dilation[0]} * (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1 \rfloor\]
\[\text{W}_{out} = \lfloor \frac{\text{W}_{in} + 2 * \text{padding[1]} - \text{dilation[1]} * (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1 \rfloor\]
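As a quick check of the two formulas, the computation for a single spatial dimension can be written as a small pure-Python function. The name `conv2d_output_size` is a hypothetical helper introduced here for illustration; it takes the per-dimension values of stride, padding, and dilation.

```python
def conv2d_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    # Integer floor division matches the floor in the formula.
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# A 4x4 input with a 3x3 kernel and default stride/padding/dilation.
h_out = conv2d_output_size(4, kernel_size=3)
w_out = conv2d_output_size(4, kernel_size=3)
print((h_out, w_out))  # (2, 2)

# A 224-pixel dimension with a 7-wide kernel, stride 2, padding 3.
print(conv2d_output_size(224, kernel_size=7, stride=2, padding=3))  # 112
```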

When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also known as depthwise convolution.

In other words, for an input of size \((N, C_{in}, H_{in}, W_{in})\), a depthwise convolution with a depthwise multiplier K can be constructed with the arguments \((in\_channels=C_{in}, out\_channels=C_{in} \times K, ..., groups=C_{in})\).
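A minimal sketch of how those arguments relate to each other, written as a pure-Python helper. The function `depthwise_conv2d_kwargs` is hypothetical and only assembles keyword arguments matching the constructor signature shown at the top of this page.

```python
def depthwise_conv2d_kwargs(c_in, multiplier, kernel_size):
    """Build Conv2d keyword arguments for a depthwise convolution (hypothetical helper)."""
    if multiplier < 1:
        raise ValueError("the depthwise multiplier K must be a positive integer")
    return dict(
        in_channels=c_in,
        out_channels=c_in * multiplier,  # out_channels == K * in_channels
        kernel_size=kernel_size,
        groups=c_in,                     # groups == in_channels
    )

kwargs = depthwise_conv2d_kwargs(c_in=8, multiplier=2, kernel_size=3)
print(kwargs)
```

The resulting dictionary could then be unpacked into the constructor, for example `Conv2d(**kwargs)`.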

Parameters
  • in_channels (int) – Number of channels in the input data.

  • out_channels (int) – Number of channels in the output data.

  • kernel_size (Union[int, Tuple[int, int]]) – Size of the weight on the spatial dimensions. If kernel_size is an int, the actual kernel size is (kernel_size, kernel_size).

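  • stride (Union[int, Tuple[int, int]]) – Stride of the 2D convolution operation. Default: 1

  • padding (Union[int, Tuple[int, int]]) – Size of the padding added to both sides of the input's spatial dimensions. Only zero-padding is supported. Default: 0

  • dilation (Union[int, Tuple[int, int]]) – Dilation of the 2D convolution operation. Default: 1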

  • groups (int) – Number of groups into which the input and output channels are divided in order to perform a grouped convolution. When groups is not 1, in_channels and out_channels must be divisible by groups, and the shape of weight should be (groups, out_channels // groups, in_channels // groups, height, width). Default: 1

  • bias (bool) – Whether to add a bias to the result of the convolution. Default: True

  • conv_mode (str) – Supports cross_correlation. Default: cross_correlation

  • compute_mode (str) – When set to "default", no special requirements are placed on the precision of intermediate results. When set to "float32", float32 is used for the accumulator and intermediate results; this only takes effect when the input and output are of float16 dtype. Default: default

Note

  • weight usually has shape (out_channels, in_channels, height, width);

    if groups is not 1, the shape is (groups, out_channels // groups, in_channels // groups, height, width)

  • bias usually has shape (1, out_channels, *1)
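The shape rules in this note can be summarised in a small pure-Python sketch. The helper `conv2d_param_shapes` is hypothetical, and it assumes that `*1` in the bias shape stands for one trailing 1 per spatial dimension, giving (1, out_channels, 1, 1) for a 2D convolution.

```python
def conv2d_param_shapes(in_channels, out_channels, kernel_size, groups=1):
    """Expected weight and bias shapes for the given arguments (hypothetical helper)."""
    # An int kernel_size is expanded to (kernel_size, kernel_size).
    kh, kw = (kernel_size, kernel_size) if isinstance(kernel_size, int) else kernel_size
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    if groups == 1:
        weight = (out_channels, in_channels, kh, kw)
    else:
        weight = (groups, out_channels // groups, in_channels // groups, kh, kw)
    # Assumption: "*1" expands to one trailing 1 per spatial dimension.
    bias = (1, out_channels, 1, 1)
    return weight, bias

print(conv2d_param_shapes(3, 1, 3))            # ((1, 3, 3, 3), (1, 1, 1, 1))
print(conv2d_param_shapes(8, 16, 3, groups=8))  # ((8, 2, 1, 3, 3), (1, 16, 1, 1))
```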

Examples

import numpy as np
import megengine as mge
import megengine.module as M

m = M.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
inp = mge.tensor(np.arange(0, 96).astype("float32").reshape(2, 3, 4, 4))
oup = m(inp)
print(oup.numpy().shape)

Outputs:

(2, 1, 2, 2)