# Conv2d

class Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, conv_mode='cross_correlation', compute_mode='default', **kwargs)

Applies a 2D convolution over an input tensor. For instance, given an input of size $$(N, C_{\text{in}}, H, W)$$, this layer produces an output of size $$(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$$ as described below:

$\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)$
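The summation above can be sketched as a naive reference loop. This is only an illustration, not the library's implementation: it assumes that $$\star$$ denotes 2D cross-correlation (the default conv_mode), that stride=1, padding=0, dilation=1 and groups=1, and that tensors are plain nested lists. The function name conv2d_naive is hypothetical.

```python
def conv2d_naive(inp, weight, bias):
    """Naive 2D cross-correlation on nested lists (illustrative sketch).

    inp:    shape (N, C_in, H, W)
    weight: shape (C_out, C_in, KH, KW)
    bias:   one value per output channel
    Assumes stride=1, padding=0, dilation=1, groups=1.
    """
    n, c_in = len(inp), len(inp[0])
    h, w = len(inp[0][0]), len(inp[0][0][0])
    c_out = len(weight)
    kh, kw = len(weight[0][0]), len(weight[0][0][0])
    h_out, w_out = h - kh + 1, w - kw + 1

    # Every output element starts from the bias of its output channel.
    out = [[[[bias[j] for _ in range(w_out)] for _ in range(h_out)]
            for j in range(c_out)] for _ in range(n)]

    for i in range(n):                 # batch index N_i
        for j in range(c_out):         # output channel C_out_j
            for k in range(c_in):      # sum over input channels k
                for y in range(h_out):
                    for x in range(w_out):
                        acc = 0.0
                        for dy in range(kh):
                            for dx in range(kw):
                                acc += weight[j][k][dy][dx] * inp[i][k][y + dy][x + dx]
                        out[i][j][y][x] += acc
    return out


# Two input channels of ones, one 2x2 kernel of ones per channel, bias 0.5.
inp = [[[[1.0] * 3 for _ in range(3)] for _ in range(2)]]
weight = [[[[1.0] * 2 for _ in range(2)] for _ in range(2)]]
out = conv2d_naive(inp, weight, [0.5])
print(out[0][0])
```

With these inputs every output element is 2 channels times 4 kernel taps plus the bias, i.e. 8.5, and the output has shape (1, 1, 2, 2).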

input: $$(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$$

output: $$(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$$ where

$H_{\text{out}} = \left\lfloor \frac{H_{\text{in}} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$

$W_{\text{out}} = \left\lfloor \frac{W_{\text{in}} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$
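As a quick check of the two formulas, the output size can be computed in plain Python. The helper name conv2d_output_size is hypothetical, and it assumes that kernel_size, stride, padding and dilation are given as (height, width) pairs.

```python
def conv2d_output_size(h_in, w_in, kernel_size, stride=(1, 1),
                       padding=(0, 0), dilation=(1, 1)):
    """Return (H_out, W_out) following the formulas above (illustrative sketch)."""

    def one_axis(size, k, s, p, d):
        # Integer floor division implements the floor in the formula.
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    h_out = one_axis(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = one_axis(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return h_out, w_out


# A 4x4 input with a 3x3 kernel and default stride, padding and dilation.
print(conv2d_output_size(4, 4, (3, 3)))  # (2, 2)
# A 224x224 input with a 7x7 kernel, stride 2 and padding 3.
print(conv2d_output_size(224, 224, (7, 7), stride=(2, 2), padding=(3, 3)))  # (112, 112)
```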

When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also known as depthwise convolution.

In other words, for an input of size $$(N, C_{in}, H_{in}, W_{in})$$, a depthwise convolution with depthwise multiplier K can be constructed with the arguments $$(in\_channels=C_{in}, out\_channels=C_{in} \times K, ..., groups=C_{in})$$.
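The argument pattern for a depthwise convolution can be sketched as a small helper that builds the keyword arguments. The helper depthwise_conv2d_kwargs is hypothetical and does not exist in MegEngine; it only encodes the relationship between in_channels, out_channels and groups stated above.

```python
def depthwise_conv2d_kwargs(c_in, multiplier, kernel_size, **extra):
    """Build Conv2d keyword arguments for a depthwise convolution (hypothetical helper)."""
    if c_in <= 0 or multiplier <= 0:
        raise ValueError("c_in and multiplier must be positive integers")
    kwargs = dict(
        in_channels=c_in,
        out_channels=c_in * multiplier,  # out_channels == K * in_channels
        kernel_size=kernel_size,
        groups=c_in,                     # groups == in_channels
    )
    kwargs.update(extra)                 # e.g. stride, padding, dilation
    return kwargs


kwargs = depthwise_conv2d_kwargs(c_in=8, multiplier=2, kernel_size=3, padding=1)
print(kwargs)
```

Here out_channels is 16 and groups is 8. The resulting dictionary could then be passed on as M.Conv2d(**kwargs), assuming megengine.module is imported as M.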

• weight usually has shape (out_channels, in_channels, height, width); if groups is not 1, its shape is (groups, out_channels // groups, in_channels // groups, height, width)

• bias usually has shape (1, out_channels, *1)
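The shape rules above can be written out as a small sketch. The function conv2d_param_shapes is hypothetical. It assumes that both channel counts must be divisible by groups, and that the trailing `*1` in the bias shape stands for singleton spatial dimensions, giving (1, out_channels, 1, 1).

```python
def conv2d_param_shapes(in_channels, out_channels, kernel_size, groups=1):
    """Return (weight_shape, bias_shape) per the rules above (illustrative sketch)."""
    kh, kw = kernel_size
    # Assumption: channel counts must split evenly across groups.
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    if groups == 1:
        weight_shape = (out_channels, in_channels, kh, kw)
    else:
        weight_shape = (groups, out_channels // groups, in_channels // groups, kh, kw)
    # Assumption: "*1" is read as one singleton dimension per spatial axis.
    bias_shape = (1, out_channels, 1, 1)
    return weight_shape, bias_shape


print(conv2d_param_shapes(3, 1, (3, 3)))             # ((1, 3, 3, 3), (1, 1, 1, 1))
print(conv2d_param_shapes(8, 16, (3, 3), groups=8))  # ((8, 2, 1, 3, 3), (1, 16, 1, 1))
```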

```python
import numpy as np
import megengine as mge
import megengine.module as M

# Three input channels, one output channel, 3x3 kernel.
m = M.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
# Input of shape (N=2, C_in=3, H=4, W=4).
inp = mge.tensor(np.arange(0, 96).astype("float32").reshape(2, 3, 4, 4))
oup = m(inp)
print(oup.numpy().shape)  # (2, 1, 2, 2)
```