Use Transform to define data transformation

Note

Transformation of input data is a very common operation, especially in the field of computer vision.

The various data transformations provided in :py:mod:`megengine.data.transform` are implemented based on the abstract class :py:class:`~.Transform`, where:

  • The ``apply`` method processes a single data sample; it is abstract and **must** be implemented in subclasses (see the :ref:`example below <custom-transform-guide>`);

  • Various transformation operations can be combined through ``Compose``, which is more convenient to use.

We can conveniently perform the corresponding transformation operations when loading data in ``DataLoader``. For example:

>>> dataloader = DataLoader(dataset, transform=Compose([Resize(32), ToMode('CHW')]))

For more APIs, please refer to the :py:mod:`megengine.data.transform` module.

Note

With the help of data transformation, we can achieve various goals, including but not limited to:

  • Through the ``Resize`` operation, make the shape of the input data meet the requirements of the model;

  • Implement :ref:`data augmentation <data-augmentation>`; more data can often improve the performance of the model (a simple pipeline is sketched below).
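
As a minimal sketch of such an augmentation pipeline (assuming the ``RandomHorizontalFlip`` vision transform is available in ``megengine.data.transform``, alongside the ``Resize`` and ``ToMode`` used above):

>>> from megengine.data.transform import Compose, RandomHorizontalFlip, Resize, ToMode
>>> augmentation = Compose([
...     RandomHorizontalFlip(),  # randomly flip images to augment the data
...     Resize(32),              # make the shape meet the model's requirements
...     ToMode('CHW'),           # convert HWC images to CHW layout
... ])
>>> dataloader = DataLoader(dataset, transform=augmentation)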

See also

  • A large number of ``VisionTransform`` implementations are provided in MegEngine; users can also refer to the API documentation to extend them;

  • Some data transformation implementations are referenced from torchvision and OpenMMLab.

  • MegEngine also provides a :py:class:`~.TorchTransformCompose` implementation, which makes it convenient to use the transformations implemented in torchvision; a usage sketch follows this list.
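
For reference, a hedged sketch of how ``TorchTransformCompose`` might be used (assuming torchvision is installed; please check the API documentation for the exact parameters):

>>> from torchvision import transforms  # torchvision transforms, not MegEngine ones
>>> from megengine.data.transform import TorchTransformCompose
>>> torch_compose = TorchTransformCompose([
...     transforms.RandomHorizontalFlip(),
...     transforms.Resize(32),
... ])
>>> dataloader = DataLoader(dataset, transform=torch_compose)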

Example: pseudo transformation and custom transformation

MegEngine provides ``PseudoTransform`` as the default implementation; it does not perform any processing on the input and returns it directly:

class PseudoTransform(Transform):
    def apply(self, input: Tuple):
        return input

We construct some ``data`` for testing:

>>> import numpy as np
>>> data = np.arange(9).reshape(3, 3)
>>> data
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> from megengine.data.transform import PseudoTransform
>>> PseudoTransform().apply(data)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

If we want to implement a custom transformation operation, we only need to implement the corresponding ``apply`` logic ourselves.

For example, we implement an ``AddOneTransform’’:

>>> from megengine.data.transform import Transform
>>> class AddOneTransform(Transform):
...     def apply(self, input):
...         return input + 1
>>> AddOneTransform().apply(data)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

``Compose`` can be used to combine multiple transformations:

>>> from megengine.data.transform import Compose
>>> composed_transform = Compose([AddOneTransform(), AddOneTransform()])
>>> composed_transform.apply(data)
array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

Finally, our various Transform implementations should be applied to DataLoader:

>>> dataloader = DataLoader(dataset, transform=composed_transform)

Warning

The example given here is relatively simple and assumes that each sample is a single element. In fact, the ``apply`` method supports ``Tuple`` input, so its logic can handle more complex sample structures; you can refer to the implementation of :py:class:`~.VisionTransform`.
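
As an illustrative sketch (not how :py:class:`~.VisionTransform` itself is implemented), a custom transform whose ``apply`` handles an ``(image, label)`` tuple could transform only the image and keep the label unchanged:

>>> class ImageOnlyAddOne(Transform):
...     def apply(self, input):
...         image, label = input       # sample arrives as an (image, label) tuple
...         return (image + 1, label)  # transform the image, keep the label as-is
>>> ImageOnlyAddOne().apply((data, 0))
(array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]), 0)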

See also

An example of a DataLoader preprocessing data through composed data transformations can be found in the official ResNet training code official/vision/classification/resnet/train.py.

Note the difference with Functional

Users should not confuse the interfaces in ``megengine.data.transform`` and ``megengine.functional``:

  • ``megengine.data.transform`` can be regarded as an independent sub-library that performs various processing on NumPy ndarray data;

  • The implementations in ``megengine.functional`` all revolve around MegEngine's Tensor data structure.

From the perspective of the workflow, the user converts the original data into an ndarray as input and processes it through ``megengine.data.transform``. If the result needs to participate in model training, it must be explicitly converted into a Tensor before it can be used with the interfaces in ``megengine.functional``.
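
A minimal sketch of this workflow, reusing the ``AddOneTransform`` defined above (the conversion to Tensor is done explicitly by the user):

>>> import megengine.functional as F
>>> from megengine import Tensor
>>> processed = AddOneTransform().apply(data)  # megengine.data.transform: ndarray in, ndarray out
>>> x = Tensor(processed)                      # explicit conversion to a MegEngine Tensor
>>> y = F.sum(x)                               # megengine.functional operates on Tensor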

When should data preprocessing happen

When we get batch data from the DataLoader, if a ``Transform`` is defined, each sample is transformed immediately after it is loaded.

Data transformation operations also have computational overhead. This process is usually performed on the CPU, and some operations call libraries such as OpenCV. If each sample is loaded multiple times (for example, when training for multiple epochs), the transformation will also be performed multiple times, which may bring additional overhead. Therefore, in some cases we choose to perform preprocessing at an earlier stage, that is, to preprocess the original data directly, so that the input obtained in the DataLoader is already preprocessed data and the Transform operations at loading time are reduced as much as possible.
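
A rough sketch of this idea follows, reusing the ``composed_transform`` from the example above; ``raw_images`` and ``labels`` are hypothetical in-memory arrays, and the point is only that the expensive transform runs once, before training, instead of once per epoch:

>>> from megengine.data.dataset import ArrayDataset
>>> # run the (possibly expensive) transform chain once over all raw samples ahead of time
>>> preprocessed = np.stack([composed_transform.apply(img) for img in raw_images])
>>> dataset = ArrayDataset(preprocessed, labels)
>>> dataloader = DataLoader(dataset)  # no transform is passed, so nothing is recomputed per epoch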

Users should consider that I/O and processing related to raw data may also become a bottleneck in the overall process of model training.