Use Transform to define data transformation#
Note
Transformation of input data is a very common operation, especially in the field of computer vision.
The various data transformations provided in ``megengine.data.transform`` are implemented based on the abstract class ``Transform``, in which:

- the ``apply`` abstract method processes a single data sample and **must** be implemented in subclasses (see the :ref:`examples below <custom-transform-guide>`);
- various transformation operations can be combined with ``Compose``, which is more convenient to use.
We can then conveniently apply the corresponding transformations when loading data with ``DataLoader``. For example:
>>> dataloader = DataLoader(dataset, transform=Compose([Resize(32), ToMode('CHW')]))
For more APIs, please refer to the ``megengine.data.transform`` module.
Note
With the help of data transformations, we can achieve various goals, including but not limited to data normalization and data augmentation.
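For example, a typical computer-vision pipeline chains augmentation and normalization steps. The following is a minimal sketch built from vision transforms in ``megengine.data.transform``; the parameter values are illustrative only and should be adapted to the actual dataset:

from megengine.data.transform import (
    Compose,
    Normalize,
    RandomHorizontalFlip,
    Resize,
    ToMode,
)

# Illustrative pipeline: augmentation, resizing, normalization, layout change.
train_transform = Compose([
    RandomHorizontalFlip(),           # data augmentation: random horizontal flip
    Resize(32),                       # resize samples to a unified spatial size
    Normalize(mean=128.0, std=58.0),  # data normalization (example statistics)
    ToMode("CHW"),                    # convert HWC images to CHW layout
])

# Used exactly like the earlier example:
# dataloader = DataLoader(dataset, transform=train_transform)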
See also
A large number of ``VisionTransform`` implementations are provided in MegEngine, and users can also refer to the API documentation to extend them. Some of the data transformation implementations are adapted from torchvision and OpenMMLab.
MegEngine also provides a ``TorchTransformCompose`` implementation, which makes it convenient to use the transformations implemented in torchvision.
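A minimal sketch of how this might be used is shown below; it assumes torchvision is installed and that ``TorchTransformCompose`` accepts a list of torchvision transform objects, so the exact signature and the handling of image-format conversion should be verified in the API documentation:

from torchvision import transforms as T  # requires torchvision to be installed
from megengine.data.transform import TorchTransformCompose

# Wrap a list of torchvision transforms so they can be passed to DataLoader.
torch_style_transform = TorchTransformCompose([
    T.Resize(32),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

# dataloader = DataLoader(dataset, transform=torch_style_transform)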
Example: pseudo transformation and custom transformation#
MegEngine provides ``PseudoTransform`` as the default implementation; it does not perform any processing on the input and simply returns it:
class PseudoTransform(Transform):
    def apply(self, input: Tuple):
        return input
We construct some ``data`` for testing:
>>> import numpy as np
>>> data = np.arange(9).reshape(3, 3)
>>> data
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> from megengine.data.transform import PseudoTransform
>>> PseudoTransform().apply(data)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
If we want to implement a custom transformation operation, we only need to implement the corresponding ``apply`` logic ourselves.
For example, we can implement an ``AddOneTransform``:
>>> from megengine.data.transform import Transform
>>> class AddOneTransform(Transform):
...     def apply(self, input):
...         return input + 1
>>> AddOneTransform().apply(data)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
``Compose`` can be used to combine data transformations:
>>> from megengine.data.transform import Compose
>>> composed_transform = Compose([AddOneTransform(), AddOneTransform()])
>>> composed_transform.apply(data)
array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])
Finally, our various ``Transform`` implementations should be applied through ``DataLoader``:
>>> dataloader = DataLoader(dataset, transform=composed_transform)
Warning
The examples given here are relatively simple and assume that each sample is a single element. In fact, the ``apply`` method supports ``Tuple``-type input, and its logic can handle more complex sample structures; you can refer to the implementation of ``VisionTransform``.
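As an illustration (this is a simplified sketch, not the actual dispatch logic of ``VisionTransform``), a custom ``Transform`` can handle a ``(image, label)`` tuple sample by transforming only the image and passing the label through unchanged:

import numpy as np
from megengine.data.transform import Transform

class ImageOnlyAddOne(Transform):
    """Toy transform for (image, label) tuple samples: modify the image only."""

    def apply(self, input):
        image, label = input       # assume each sample is an (image, label) tuple
        return image + 1, label    # keep the label untouched

sample = (np.arange(9).reshape(3, 3), 7)
new_image, new_label = ImageOnlyAddOne().apply(sample)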
See also
An example of ``DataLoader`` preprocessing data by composing data transformations can be found in the official ResNet training code official/vision/classification/resnet/train.py.
Note the difference from Functional#
Users should not confuse the ``megengine.data.transform`` and ``megengine.functional`` interfaces:
``megengine.data.transform`` can be regarded as an independent sub-library that performs various kinds of processing on NumPy ndarray data, whereas the implementations in ``megengine.functional`` all revolve around MegEngine's Tensor data structure.
From a workflow perspective, the user converts the raw data into an ndarray as input and processes it with ``megengine.data.transform``. If the results need to participate in model training, they must be explicitly converted into Tensors before they can be used with the interfaces in ``megengine.functional``.
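A minimal sketch of this hand-off (the array contents and the ``relu`` call are arbitrary illustrations) could look like:

import numpy as np
import megengine as mge
import megengine.functional as F

processed = np.arange(9).reshape(3, 3) + 1   # e.g. the output of a data transform (NumPy ndarray)
x = mge.Tensor(processed, dtype="float32")   # explicit conversion from ndarray to Tensor
y = F.relu(x - 4)                            # only now can megengine.functional interfaces be used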
When should data preprocessing happen#
When we fetch batched data from ``DataLoader``, if a ``Transform`` is defined, each sample is transformed immediately after it is loaded.
Data transformation also incurs computational overhead: the process usually runs on the CPU, and some operations call libraries such as OpenCV. If each sample is loaded multiple times (for example, when training for multiple epochs), the transformation is performed multiple times as well, which may bring additional overhead. Therefore, in some cases we choose to perform preprocessing at an earlier stage, that is, preprocess the raw data once in advance so that the input obtained in ``DataLoader`` is already preprocessed and the ``Transform`` operations can be reduced as much as possible.
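A minimal sketch of this idea, with hypothetical file names and a hypothetical ``preprocess`` helper, might look like:

import numpy as np

def preprocess(raw):
    # Hypothetical one-time preprocessing of the raw data.
    return (raw / 255.0).astype("float32")

# Run once, ahead of training: store the already preprocessed data.
raw_images = np.load("raw_images.npy")               # hypothetical raw data file
np.save("preprocessed_images.npy", preprocess(raw_images))

# During training, the Dataset loads "preprocessed_images.npy" directly,
# so the DataLoader can be built without (or with a much lighter) Transform.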
Users should also be aware that I/O and processing of the raw data may become a bottleneck in the overall model training process.