megengine.data package

megengine.data.collator

class megengine.data.collator.Collator[source]

Bases: object

Used for merging a list of samples to form a mini-batch of Tensor(s), when loading batches from a dataset. Modified from https://github.com/pytorch/pytorch/blob/master/torch/utils/data/_utils/collate.py

apply(inputs)[source]
Parameters

inputs – a sequence of N samples, each of the form tuple(CHW, C, CK).

Returns

tuple(NCHW, NC, NCK).
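
For illustration, a minimal usage sketch (the sample shapes and dtypes are assumptions chosen for the example):

import numpy as np

from megengine.data import Collator

# two samples, each a tuple of (CHW image, scalar label)
samples = [
    (np.zeros((3, 4, 4), dtype="float32"), np.array(0)),
    (np.ones((3, 4, 4), dtype="float32"), np.array(1)),
]
batch_image, batch_label = Collator().apply(samples)
# batch_image.shape == (2, 3, 4, 4), batch_label.shape == (2,)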

megengine.data.dataloader

class megengine.data.dataloader.DataLoader(dataset, sampler=None, transform=None, collator=None, num_workers=0, timeout=0, divide=False)[source]

Bases: object

__init__(dataset, sampler=None, transform=None, collator=None, num_workers=0, timeout=0, divide=False)[source]

Provides a convenient way to iterate on a given dataset.

DataLoader combines a dataset with a sampler, transform and collator, making it flexible to get minibatches continually from a dataset.

Parameters
  • dataset (Dataset) – dataset from which to load the minibatch.

  • sampler (Optional[Sampler]) – defines the strategy to sample data from the dataset.

  • transform (Optional[Transform]) – defines the transforming strategy for a sampled batch. Default: None

  • collator (Optional[Collator]) – defines the merging strategy for a transformed batch. Default: None

  • num_workers (int) – the number of sub-processes used to load, transform and collate the batch. 0 means using a single process. Default: 0

  • timeout (int) – if positive, the timeout value (in seconds) for collecting a batch from workers. Default: 0

  • divide (bool) – defines the parallelization strategy in multi-processing mode. True means one batch is divided into num_workers pieces, and the workers process these pieces in parallel. False means different sub-processes process different batches. Default: False
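
A minimal end-to-end sketch of the pieces described above (the random arrays are placeholders):

import numpy as np

from megengine.data import DataLoader, RandomSampler
from megengine.data.dataset import ArrayDataset

# placeholder data: 100 CHW images with integer labels
images = np.random.rand(100, 3, 32, 32).astype("float32")
labels = np.random.randint(0, 10, size=(100,))
dataset = ArrayDataset(images, labels)
sampler = RandomSampler(dataset, batch_size=8, drop_last=True)
dataloader = DataLoader(dataset, sampler=sampler)
for batch_images, batch_labels in dataloader:
    # batch_images: (8, 3, 32, 32), batch_labels: (8,)
    pass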

megengine.data.sampler

class megengine.data.sampler.Infinite(sampler)[source]

Bases: megengine.data.sampler.MapSampler

Infinite sampler wrapper for a basic sampler.

sample()[source]

Return a list containing all sample indices.
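
For example, wrapping a basic sampler so that batch indices are produced endlessly (a sketch; the dataset is a placeholder):

import numpy as np

from megengine.data import Infinite, SequentialSampler
from megengine.data.dataset import ArrayDataset

dataset = ArrayDataset(np.arange(10, dtype="float32").reshape(10, 1))
# the wrapped sampler restarts whenever it is exhausted,
# so iteration must be stopped manually
inf_sampler = Infinite(SequentialSampler(dataset, batch_size=4))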

class megengine.data.sampler.MapSampler(dataset, batch_size=1, drop_last=False, num_samples=None, world_size=None, rank=None, seed=None)[source]

Bases: megengine.data.sampler.Sampler

__init__(dataset, batch_size=1, drop_last=False, num_samples=None, world_size=None, rank=None, seed=None)[source]

An abstract class for all samplers.

Parameters
  • dataset (dataset) – dataset to sample from.

  • batch_size (positive integer) – batch size for batch method.

  • drop_last (bool) – set True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch_size, then the last batch will be smaller. Default: False

  • num_samples (positive integer) – number of samples assigned to one rank.

  • world_size (positive integer) – number of ranks.

  • rank (non-negative integer within 0 and world_size) – rank id of the current process.

  • seed (non-negative integer) – seed for random operators.

batch()[source]

Batch method provides a batch indices generator.

Return type

Iterator[List[Any]]

sample()[source]

Return a list containing all sample indices.

scatter(indices)[source]

Scatter method is used for splitting indices into subsets; each subset will be assigned to a rank. Indices are evenly split by default. If a customized indices assignment method is needed, please override this method.

Return type

List
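
As noted above, a customized assignment can be expressed by overriding scatter in a subclass. A hypothetical sketch (the contiguous-chunk policy and the self.rank / self.world_size attribute names are assumptions for illustration, not the library default):

from megengine.data import SequentialSampler

class ChunkScatterSampler(SequentialSampler):
    def scatter(self, indices):
        # give each rank one contiguous chunk instead of the default even split
        chunk = (len(indices) + self.world_size - 1) // self.world_size
        return indices[self.rank * chunk : (self.rank + 1) * chunk]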

class megengine.data.sampler.RandomSampler(dataset, batch_size=1, drop_last=False, indices=None, world_size=None, rank=None, seed=None)[source]

Bases: megengine.data.sampler.MapSampler

__init__(dataset, batch_size=1, drop_last=False, indices=None, world_size=None, rank=None, seed=None)[source]

Sample elements randomly without replacement.

sample()[source]

Return a list containing all sample indices.

Return type

List

class megengine.data.sampler.ReplacementSampler(dataset, batch_size=1, drop_last=False, num_samples=None, weights=None, world_size=None, rank=None, seed=None)[source]

Bases: megengine.data.sampler.MapSampler

__init__(dataset, batch_size=1, drop_last=False, num_samples=None, weights=None, world_size=None, rank=None, seed=None)[source]

Sample elements randomly with replacement.

Parameters

weights (List) – weights for sampling indices; they may be unnormalized.

sample()[source]

Return a list containing all sample indices.

Return type

List
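
A sketch of weighted sampling with replacement (the dataset and the per-index weights are assumptions; the length of weights is expected to match the dataset size):

import numpy as np

from megengine.data import ReplacementSampler
from megengine.data.dataset import ArrayDataset

dataset = ArrayDataset(np.arange(10, dtype="float32").reshape(10, 1))
# hypothetical weights: draw index 0 three times as often as the rest
weights = [3.0] + [1.0] * (len(dataset) - 1)
sampler = ReplacementSampler(dataset, batch_size=8, weights=weights)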

class megengine.data.sampler.Sampler[source]

Bases: abc.ABC

An abstract base class for all samplers.

class megengine.data.sampler.SequentialSampler(dataset, batch_size=1, drop_last=False, indices=None, world_size=None, rank=None)[source]

Bases: megengine.data.sampler.MapSampler

__init__(dataset, batch_size=1, drop_last=False, indices=None, world_size=None, rank=None)[source]

Sample elements sequentially.

sample()[source]

Return a generator.

Return type

Iterator[Any]

class megengine.data.sampler.StreamSampler(batch_size=1)[source]

Bases: megengine.data.sampler.Sampler

Sampler for stream datasets.

Warning

In the case of multiple machines, the sampler should ensure that each worker gets different data. This class cannot do that yet; please build your own dataset and sampler to achieve this goal.

Usually, :meth:`~.StreamDataset.__iter__` can return a different iterator depending on rank = dist.get_rank(), so that each worker gets different data.
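
A sketch of that pattern, where each rank builds a different iterator (the strided slicing policy is an assumption for illustration):

import megengine.distributed as dist
from megengine.data.dataset import StreamDataset

class MyStream(StreamDataset):
    def __init__(self, samples):
        self.samples = samples

    def __iter__(self):
        # stride the stream by rank so each worker sees different data
        rank, size = dist.get_rank(), dist.get_world_size()
        return iter(self.samples[rank::size])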

megengine.data.transform.meta_transform

class megengine.data.transform.meta_transform.PseudoTransform[source]

Bases: megengine.data.transform.meta_transform.Transform

apply(input)[source]
class megengine.data.transform.meta_transform.Transform[source]

Bases: abc.ABC

Override the apply method in subclasses.

abstract apply(input)[source]
apply_batch(inputs)[source]
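
A minimal subclass sketch following this contract (the scaling operation is a made-up example):

from megengine.data.transform import Transform

class Scale(Transform):
    def __init__(self, factor):
        self.factor = factor

    def apply(self, input):
        # made-up example: multiply a numeric sample by a constant
        return input * self.factor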

megengine.data.transform.vision.functional

megengine.data.transform.vision.functional.flip(image, flipCode)[source]

Flip the input image according to flipCode (the type of flip).

Parameters
  • image – input image, with (H, W, C) shape.

  • flipCode

    code that indicates the type of flip.

    • 1 : Flip horizontally

    • 0 : Flip vertically

    • -1: Flip horizontally and vertically

Returns

flipped image, with (H, W, C) shape.

megengine.data.transform.vision.functional.pad(input, size, value)[source]

Pad the input data with the given size and value.

Parameters
  • input – input data, with (H, W, C) shape.

  • size – padding size of input data; it could be an integer or a sequence. If it is an integer, the input data will be padded in all four directions. If it is a sequence containing two integers, the bottom and right sides of the input data will be padded. If it is a sequence containing four integers, the top, bottom, left and right sides of the input data will be padded with the given sizes.

  • value – padding value of data; it could be a sequence of int or float. If it is a float value, the dtype of the image will also be cast to float32.

Returns

padded image.
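
For instance (only the integer form of size is shown; the shapes are assumptions):

import numpy as np

from megengine.data.transform.vision.functional import pad

img = np.zeros((4, 4, 3), dtype="uint8")
padded = pad(img, 2, 0)  # pad 2 pixels on all four sides -> (8, 8, 3)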

megengine.data.transform.vision.functional.resize(input, size, interpolation=1)[source]

Resize the input data to the given size.

Parameters
  • input – input data, which could be an image or a mask, with (H, W, C) shape.

  • size – target size of input data, with (height, width) shape.

  • interpolation – interpolation method.

Returns

resized data, with (H, W, C) shape.

megengine.data.transform.vision.functional.to_bgr(image)[source]

Change gray format image’s color space to BGR.

Parameters

image – input gray format image, with (H, W, C) shape.

Returns

BGR format image, with (H, W, C) shape.

megengine.data.transform.vision.functional.to_gray(image)[source]

Change BGR format image’s color space to gray.

Parameters

image – input BGR format image, with (H, W, C) shape.

Returns

gray format image, with (H, W, C) shape.

megengine.data.transform.vision.functional.wrap_keepdims(func)[source]

Wrapper to keep the dimensions of input images unchanged.

megengine.data.transform.vision.transform

class megengine.data.transform.vision.transform.BrightnessTransform(value, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Adjust brightness of the input data.

Parameters
  • value – how much to adjust the brightness. Can be any non-negative number. 0 gives the original image.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.CenterCrop(output_size, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Crops the given input data at the center.

Parameters
  • output_size – target size of output image, with (height, width) shape.

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Randomly change the brightness, contrast, saturation and hue of an image.

Parameters
  • brightness – how much to jitter brightness. Chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers.

  • contrast – how much to jitter contrast. Chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers.

  • saturation – how much to jitter saturation. Chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers.

  • hue – how much to jitter hue. Chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.Compose(transforms=[], batch_compose=False, shuffle_indices=None, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Composes several transforms together.

Parameters
  • transforms – list of VisionTransform to compose.

  • batch_compose – whether to use shuffle_indices for batch data or not. If True, the original input sequence is used. Otherwise, shuffle_indices will be used for the transforms.

  • shuffle_indices – indices used for random shuffle, starting at 1. For example, if shuffle_indices is [(1, 3), (2, 4)], then the 1st and 3rd transforms will be randomly shuffled, and the 2nd and 4th transforms will also be shuffled.

  • order – the same with VisionTransform

Examples:

from megengine.data.transform import RandomHorizontalFlip, RandomVerticalFlip, CenterCrop, ToMode, Compose

transform_func = Compose(
    [
        RandomHorizontalFlip(),
        RandomVerticalFlip(),
        CenterCrop(100),
        ToMode("CHW"),
    ],
    shuffle_indices=[(1, 2, 3)],
)
apply(input)[source]

Apply transform on single input data.

apply_batch(inputs)[source]

Apply transform on batch input data.

class megengine.data.transform.vision.transform.ContrastTransform(value, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Adjust contrast of the input data.

Parameters
  • value – how much to adjust the contrast. Can be any non-negative number. 0 gives the original image.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.GaussianNoise(mean=0.0, std=1.0, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Add random Gaussian noise to the input data. The noise is generated with the given mean and std.

Parameters
  • mean – Gaussian mean used to generate noise.

  • std – Gaussian standard deviation used to generate noise.

  • order – the same with VisionTransform

class megengine.data.transform.vision.transform.HueTransform(value, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Adjust hue of the input data.

Parameters
  • value – how much to adjust the hue. Can be any number between 0 and 0.5; 0 gives the original image.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.Lighting(scale, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

class megengine.data.transform.vision.transform.Normalize(mean=0.0, std=1.0, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Normalize the input data with mean and standard deviation. Given mean (M1, ..., Mn) and std (S1, ..., Sn) for n channels, this transform normalizes each channel of the input data: output[channel] = (input[channel] - mean[channel]) / std[channel]

Parameters
  • mean – sequence of means for each channel.

  • std – sequence of standard deviations for each channel.

  • order – the same with VisionTransform.
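
A worked sketch of the formula above (the mean/std values are arbitrary assumptions):

import numpy as np

from megengine.data.transform import Normalize

img = np.full((2, 2, 3), 128.0, dtype="float32")  # HWC image
out = Normalize(mean=[128.0, 64.0, 0.0], std=[2.0, 2.0, 2.0]).apply(img)
# per the formula: out[..., 0] == (128 - 128) / 2 == 0
#                  out[..., 1] == (128 - 64) / 2 == 32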

class megengine.data.transform.vision.transform.Pad(size=0, value=0, mask_value=0, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Pad the input data.

Parameters
  • size – padding size of the input image; it could be an integer or a sequence. If it is an integer, the input image will be padded in all four directions. If it is a sequence containing two integers, the bottom and right sides of the image will be padded. If it is a sequence containing four integers, the top, bottom, left and right sides of the image will be padded with the given sizes.

  • value – padding value of the image; it could be a sequence of int or float. If it is a float value, the dtype of the image will also be cast to float32.

  • mask_value – padding value of segmentation map.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.RandomCrop(output_size, padding_size=0, padding_value=[0, 0, 0], padding_maskvalue=0, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Crop the input data randomly. Before applying the crop transform, pad the image first. If the target size is still bigger than the size of the padded image, pad the image to the target size.

Parameters
  • output_size – target size of output image, with (height, width) shape.

  • padding_size – the same with size in Pad.

  • padding_value – the same with value in Pad.

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.RandomHorizontalFlip(prob=0.5, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Horizontally flip the input data randomly with a given probability.

Parameters
  • prob – probability of the input data being flipped. Default: 0.5

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.RandomResize(scale_range, interpolation=1, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Resize the input data randomly.

Parameters
  • scale_range – range of scaling.

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.RandomResizedCrop(output_size, scale_range=(0.08, 1.0), ratio_range=(0.75, 1.3333333333333333), interpolation=1, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Crop the input data to a random size and aspect ratio. A crop of random size (default: 0.08 to 1.0 of the original size) and random aspect ratio (default: 3/4 to 4/3 of the original aspect ratio) is made. After applying the crop transform, the input data will be resized to the given size.

Parameters
  • output_size – target size of output image, with (height, width) shape.

  • scale_range – range of the crop size, relative to the original size. Default: (0.08, 1.0)

  • ratio_range – range of the crop aspect ratio, relative to the original aspect ratio. Default: (0.75, 1.33)

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.RandomVerticalFlip(prob=0.5, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Vertically flip the input data randomly with a given probability.

Parameters
  • prob – probability of the input data being flipped. Default: 0.5

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.Resize(output_size, interpolation=1, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Resize the input data.

Parameters
  • output_size – target size of image, with (height, width) shape.

  • interpolation

    interpolation method. All methods are listed below:

    • cv2.INTER_NEAREST – a nearest-neighbor interpolation.

    • cv2.INTER_LINEAR – a bilinear interpolation (used by default).

    • cv2.INTER_AREA – resampling using pixel area relation.

    • cv2.INTER_CUBIC – a bicubic interpolation over 4×4 pixel neighborhood.

    • cv2.INTER_LANCZOS4 – a Lanczos interpolation over 8×8 pixel neighborhood.

  • order – the same with VisionTransform.

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.SaturationTransform(value, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Adjust saturation of the input data.

Parameters
  • value – how much to adjust the saturation. Can be any non-negative number. 0 gives the original image.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.ShortestEdgeResize(min_size, max_size, sample_style='range', interpolation=1, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

apply(input)[source]

Apply transform on single input data.

class megengine.data.transform.vision.transform.ToMode(mode='CHW', *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Change input data to a target mode. For example, most transforms use HWC mode image, while the neural network might use CHW mode input tensor.

Parameters
  • mode – output mode of input. Default: “CHW”

  • order – the same with VisionTransform
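
For example (a sketch; the zero array stands in for a decoded image):

import numpy as np

from megengine.data.transform import ToMode

hwc = np.zeros((224, 224, 3), dtype="float32")
chw = ToMode("CHW").apply(hwc)
# chw.shape == (3, 224, 224)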

class megengine.data.transform.vision.transform.TorchTransformCompose(transforms, *, order=None)[source]

Bases: megengine.data.transform.vision.transform.VisionTransform

Compose class used for transforms in torchvision; it only supports PIL images. Some tensor-based transforms in torchvision, such as Normalize and ToTensor, are not supported.

Parameters
  • transforms – the same with Compose.

  • order – the same with VisionTransform.

class megengine.data.transform.vision.transform.VisionTransform(order=None)[source]

Bases: megengine.data.transform.meta_transform.Transform

Base class of all transforms used in computer vision. Calling logic: apply_batch() -> apply() -> _apply_image() and other _apply_*() methods. If you want to implement a self-defined transform method for images, override the _apply_image method in a subclass.

Parameters

order

input type order. Input is a tuple containing different structures; order is used to specify the order of those structures. For example, if your input is (image, boxes), then the order should be (“image”, “boxes”). Currently available strings and data types are described below:

  • ”image”: input image, with shape of (H, W, C).

  • ”coords”: coordinates, with shape of (N, 2).

  • ”boxes”: bounding boxes, with shape of (N, 4), “xyxy” format; the 1st “xy” represents the top-left point of a box, the 2nd “xy” represents the bottom-right point.

  • ”mask”: map used for segmentation, with shape of (H, W, 1).

  • ”keypoints”: keypoints, with shape of (N, K, 3), where N is the number of instances and K is the number of keypoints in one instance. The first two elements of the last axis are the coordinates of a keypoint and the 3rd is its label.

  • ”polygons”: a sequence containing numpy arrays; its length is the number of instances. Each numpy array represents the polygon coordinates of one instance.

  • ”category”: categories for some data type. For example, “image_category” means category of the input image and “boxes_category” means categories of bounding boxes.

  • ”info”: information for images such as image shapes and image path.

You can also customize your own data types, as long as you implement the corresponding _apply_*() methods; otherwise NotImplementedError will be raised.
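
A sketch of such a self-defined transform, overriding _apply_image with the default (“image”,) order (the inversion operation is a made-up example):

import numpy as np

from megengine.data.transform import VisionTransform

class InvertImage(VisionTransform):
    def _apply_image(self, image):
        # made-up example: invert an 8-bit image
        return 255 - image

out = InvertImage().apply(np.full((4, 4, 3), 200, dtype="uint8"))
# pixel values become 55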

apply(input)[source]

Apply transform on single input data.

apply_batch(inputs)[source]

Apply transform on batch input data.

megengine.data.dataset.meta_dataset

class megengine.data.dataset.meta_dataset.ArrayDataset(*arrays)[source]

Bases: megengine.data.dataset.meta_dataset.Dataset

__init__(*arrays)[source]
ArrayDataset is a dataset for numpy array data. One or more numpy arrays are needed to initialize the dataset, and the dimensions representing the sample number are expected to be the same across all arrays.
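
For instance (the arrays are placeholders; their first dimensions must match):

import numpy as np

from megengine.data.dataset import ArrayDataset

data = np.arange(30, dtype="float32").reshape(10, 3)
labels = np.arange(10)
dataset = ArrayDataset(data, labels)
sample, label = dataset[0]  # one element from each array
assert len(dataset) == 10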

class megengine.data.dataset.meta_dataset.Dataset[source]

Bases: abc.ABC

An abstract class for all datasets. The __getitem__ and __len__ methods are additionally needed.

class megengine.data.dataset.meta_dataset.StreamDataset[source]

Bases: megengine.data.dataset.meta_dataset.Dataset

An abstract class for stream data. The __iter__ method is additionally needed.

megengine.data.dataset.vision.cifar

class megengine.data.dataset.vision.cifar.CIFAR10(root=None, train=True, download=True, timeout=500)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

Dataset for CIFAR10 meta data.

bytes2array(filenames)[source]
download()[source]
property meta
meta_info = {'name': 'batches.meta'}
process()[source]
raw_file_dir = 'cifar-10-batches-py'
raw_file_md5 = 'c58f30108f718f92721af3b95e74349a'
raw_file_name = 'cifar-10-python.tar.gz'
test_batch = ['test_batch']
train_batch = ['data_batch_1', 'data_batch_2', 'data_batch_3', 'data_batch_4', 'data_batch_5']
untar(file_path, dirs)[source]
url_path = 'http://www.cs.utoronto.ca/~kriz/'
class megengine.data.dataset.vision.cifar.CIFAR100(root=None, train=True, download=True, timeout=500)[source]

Bases: megengine.data.dataset.vision.cifar.CIFAR10

bytes2array(filenames)[source]
property meta
meta_info = {'name': 'meta'}
raw_file_dir = 'cifar-100-python'
raw_file_md5 = 'eb9058c3a382ffc7106e4002c42a8d85'
raw_file_name = 'cifar-100-python.tar.gz'
test_batch = ['test']
train_batch = ['train']
url_path = 'http://www.cs.utoronto.ca/~kriz/'

megengine.data.dataset.vision.cityscapes

class megengine.data.dataset.vision.cityscapes.Cityscapes(root, image_set, mode, *, order=None)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

Cityscapes Dataset.

class_names = ('road', 'sidewalk', 'building', 'wall', 'fence', 'pole', 'traffic light', 'traffic sign', 'vegetation', 'terrain', 'sky', 'person', 'rider', 'car', 'truck', 'bus', 'train', 'motorcycle', 'bicycle')
supported_order = ('image', 'mask', 'info')

megengine.data.dataset.vision.coco

class megengine.data.dataset.vision.coco.COCO(root, ann_file, remove_images_without_annotations=False, *, order=None)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

MS COCO Dataset.

class_names = ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')
classes_originID = {'airplane': 5, 'apple': 53, 'backpack': 27, 'banana': 52, 'baseball bat': 39, 'baseball glove': 40, 'bear': 23, 'bed': 65, 'bench': 15, 'bicycle': 2, 'bird': 16, 'boat': 9, 'book': 84, 'bottle': 44, 'bowl': 51, 'broccoli': 56, 'bus': 6, 'cake': 61, 'car': 3, 'carrot': 57, 'cat': 17, 'cell phone': 77, 'chair': 62, 'clock': 85, 'couch': 63, 'cow': 21, 'cup': 47, 'dining table': 67, 'dog': 18, 'donut': 60, 'elephant': 22, 'fire hydrant': 11, 'fork': 48, 'frisbee': 34, 'giraffe': 25, 'hair drier': 89, 'handbag': 31, 'horse': 19, 'hot dog': 58, 'keyboard': 76, 'kite': 38, 'knife': 49, 'laptop': 73, 'microwave': 78, 'motorcycle': 4, 'mouse': 74, 'orange': 55, 'oven': 79, 'parking meter': 14, 'person': 1, 'pizza': 59, 'potted plant': 64, 'refrigerator': 82, 'remote': 75, 'sandwich': 54, 'scissors': 87, 'sheep': 20, 'sink': 81, 'skateboard': 41, 'skis': 35, 'snowboard': 36, 'spoon': 50, 'sports ball': 37, 'stop sign': 13, 'suitcase': 33, 'surfboard': 42, 'teddy bear': 88, 'tennis racket': 43, 'tie': 32, 'toaster': 80, 'toilet': 70, 'toothbrush': 90, 'traffic light': 10, 'train': 7, 'truck': 8, 'tv': 72, 'umbrella': 28, 'vase': 86, 'wine glass': 46, 'zebra': 24}
get_img_info(index)[source]
keypoint_names = ('nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle')
supported_order = ('image', 'boxes', 'boxes_category', 'keypoints', 'info')
megengine.data.dataset.vision.coco.has_valid_annotation(anno, order)[source]

megengine.data.dataset.vision.folder

class megengine.data.dataset.vision.folder.ImageFolder(root, check_valid_func=None, class_name=False)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

__init__(root, check_valid_func=None, class_name=False)[source]

ImageFolder is a class for loading image data and labels from an organized folder.

The folder is expected to be organized as follows: root/cls/xxx.img_ext

Labels are indices of sorted classes in the root directory.

Parameters
  • root (str) – root directory of an image folder.

  • loader – a function used to load an image from its path; if None, a default function that loads images with PIL will be used.

  • check_valid_func – a function used to check whether files in the folder are expected image files; if None, a default function that checks file extensions will be used.

  • class_name (bool) – if True, return class name instead of class index.
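
A sketch assuming a folder laid out as described above (the paths and class names are placeholders):

from megengine.data.dataset import ImageFolder

# root/
#   cat/xxx.jpg
#   dog/yyy.jpg
dataset = ImageFolder("root")  # labels: cat -> 0, dog -> 1 (sorted order)
named = ImageFolder("root", class_name=True)  # class names instead of indices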

collect_class()[source]
Return type

Dict

collect_samples()[source]
Return type

List

megengine.data.dataset.vision.imagenet

class megengine.data.dataset.vision.imagenet.ImageNet(root=None, train=True, **kwargs)[source]

Bases: megengine.data.dataset.vision.folder.ImageFolder

Load ImageNet from raw files or folder. Expected folder looks like:

${root}/
|       [REQUIRED TAR FILES]
|-  ILSVRC2012_img_train.tar
|-  ILSVRC2012_img_val.tar
|-  ILSVRC2012_devkit_t12.tar.gz
|       [OPTIONAL IMAGE FOLDERS]
|-  train/cls/xxx.${img_ext}
|-  val/cls/xxx.${img_ext}
|-  ILSVRC2012_devkit_t12/data/meta.mat
|-  ILSVRC2012_devkit_t12/data/ILSVRC2012_validation_ground_truth.txt

If the image folders don’t exist, the raw tar files are required; they will be extracted and processed.

__init__(root=None, train=True, **kwargs)[source]

Initialization:

  • if root contains self.target_folder depending on train:

    • initialize ImageFolder with target_folder.

  • else:

    • if all raw files are in root:

      • parse self.target_folder from raw files.

      • initialize ImageFolder with self.target_folder.

    • else:

      • raise error.

Parameters
  • root (Optional[str]) – root directory of imagenet data, if root is None, use default_dataset_root.

  • train (bool) – if True, load the train split, otherwise load the validation split.

check_raw_file()[source]
Return type

bool

default_devkit_dir = 'ILSVRC2012_devkit_t12'
default_train_dir = 'train'
default_val_dir = 'val'
property meta
raw_file_meta = {'devkit': ('ILSVRC2012_devkit_t12.tar.gz', 'fa75699e90414af021442c21a62c3abf'), 'train': ('ILSVRC2012_img_train.tar', '1d675b47d978889d74fa0da5fadfb00e'), 'val': ('ILSVRC2012_img_val.tar', '29b22e2961454d5413ddabcf34fc5622')}
property valid_ground_truth

megengine.data.dataset.vision.meta_vision

class megengine.data.dataset.vision.meta_vision.VisionDataset(root, *, order=None, supported_order=None)[source]

Bases: megengine.data.dataset.meta_dataset.Dataset

megengine.data.dataset.vision.mnist

class megengine.data.dataset.vision.mnist.MNIST(root=None, train=True, download=True, timeout=500)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

Dataset for MNIST meta data.

__init__(root=None, train=True, download=True, timeout=500)[source]
Parameters
  • root (Optional[str]) – path for mnist dataset downloading or loading; if None, root is set to _default_root.

  • train (bool) – if True, load the training set, otherwise load the test set.

  • download (bool) – if the raw files do not exist and download is set to True, download the raw files and process them; otherwise raise ValueError. Default: True
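
Typical usage (a sketch; the download happens on first use when the raw files are absent, and the sample layout in the comment is an assumption based on the raw MNIST format):

from megengine.data.dataset import MNIST

train_set = MNIST(root=None, train=True, download=True)
image, label = train_set[0]  # image: (28, 28, 1) uint8 array, label: int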

download()[source]
property meta
process(train)[source]
raw_file_md5 = ['f68b3c2dcbeaaa9fbdd348bbdeb94873', 'd53e105ee54ea40749a09fcbcd1e9432', '9fb629c4189551a2d022fa330f9573f3', 'ec29112dd5afa0611ce80d1b7f02629c']

Md5 for checking raw files.

raw_file_name = ['train-images-idx3-ubyte.gz', 'train-labels-idx1-ubyte.gz', 't10k-images-idx3-ubyte.gz', 't10k-labels-idx1-ubyte.gz']

Raw file names of both training set and test set (10k).

url_path = 'http://yann.lecun.com/exdb/mnist/'

Url prefix for downloading raw file.

megengine.data.dataset.vision.mnist.parse_idx1(idx1_file)[source]
megengine.data.dataset.vision.mnist.parse_idx3(idx3_file)[source]

megengine.data.dataset.vision.objects365

class megengine.data.dataset.vision.objects365.Objects365(root, ann_file, remove_images_without_annotations=False, *, order=None)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

Objects365 Dataset.

class_names = ('person', 'sneakers', 'chair', 'hat', 'lamp', 'bottle', 'cabinet/shelf', 'cup', 'car', 'glasses', 'picture/frame', 'desk', 'handbag', 'street lights', 'book', 'plate', 'helmet', 'leather shoes', 'pillow', 'glove', 'potted plant', 'bracelet', 'flower', 'tv', 'storage box', 'vase', 'bench', 'wine glass', 'boots', 'bowl', 'dining table', 'umbrella', 'boat', 'flag', 'speaker', 'trash bin/can', 'stool', 'backpack', 'couch', 'belt', 'carpet', 'basket', 'towel/napkin', 'slippers', 'barrel/bucket', 'coffee table', 'suv', 'toy', 'tie', 'bed', 'traffic light', 'pen/pencil', 'microphone', 'sandals', 'canned', 'necklace', 'mirror', 'faucet', 'bicycle', 'bread', 'high heels', 'ring', 'van', 'watch', 'sink', 'horse', 'fish', 'apple', 'camera', 'candle', 'teddy bear', 'cake', 'motorcycle', 'wild bird', 'laptop', 'knife', 'traffic sign', 'cell phone', 'paddle', 'truck', 'cow', 'power outlet', 'clock', 'drum', 'fork', 'bus', 'hanger', 'nightstand', 'pot/pan', 'sheep', 'guitar', 'traffic cone', 'tea pot', 'keyboard', 'tripod', 'hockey', 'fan', 'dog', 'spoon', 'blackboard/whiteboard', 'balloon', 'air conditioner', 'cymbal', 'mouse', 'telephone', 'pickup truck', 'orange', 'banana', 'airplane', 'luggage', 'skis', 'soccer', 'trolley', 'oven', 'remote', 'baseball glove', 'paper towel', 'refrigerator', 'train', 'tomato', 'machinery vehicle', 'tent', 'shampoo/shower gel', 'head phone', 'lantern', 'donut', 'cleaning products', 'sailboat', 'tangerine', 'pizza', 'kite', 'computer box', 'elephant', 'toiletries', 'gas stove', 'broccoli', 'toilet', 'stroller', 'shovel', 'baseball bat', 'microwave', 'skateboard', 'surfboard', 'surveillance camera', 'gun', 'life saver', 'cat', 'lemon', 'liquid soap', 'zebra', 'duck', 'sports car', 'giraffe', 'pumpkin', 'piano', 'stop sign', 'radiator', 'converter', 'tissue ', 'carrot', 'washing machine', 'vent', 'cookies', 'cutting/chopping board', 'tennis racket', 'candy', 'skating and skiing shoes', 'scissors', 'folder', 'baseball', 'strawberry', 'bow tie', 'pigeon', 'pepper', 'coffee machine', 'bathtub', 'snowboard', 'suitcase', 'grapes', 'ladder', 'pear', 'american football', 'basketball', 'potato', 'paint brush', 'printer', 'billiards', 'fire hydrant', 'goose', 'projector', 'sausage', 'fire extinguisher', 'extension cord', 'facial mask', 'tennis ball', 'chopsticks', 'electronic stove and gas stove', 'pie', 'frisbee', 'kettle', 'hamburger', 'golf club', 'cucumber', 'clutch', 'blender', 'tong', 'slide', 'hot dog', 'toothbrush', 'facial cleanser', 'mango', 'deer', 'egg', 'violin', 'marker', 'ship', 'chicken', 'onion', 'ice cream', 'tape', 'wheelchair', 'plum', 'bar soap', 'scale', 'watermelon', 'cabbage', 'router/modem', 'golf ball', 'pine apple', 'crane', 'fire truck', 'peach', 'cello', 'notepaper', 'tricycle', 'toaster', 'helicopter', 'green beans', 'brush', 'carriage', 'cigar', 'earphone', 'penguin', 'hurdle', 'swing', 'radio', 'CD', 'parking meter', 'swan', 'garlic', 'french fries', 'horn', 'avocado', 'saxophone', 'trumpet', 'sandwich', 'cue', 'kiwi fruit', 'bear', 'fishing rod', 'cherry', 'tablet', 'green vegetables', 'nuts', 'corn', 'key', 'screwdriver', 'globe', 'broom', 'pliers', 'volleyball', 'hammer', 'eggplant', 'trophy', 'dates', 'board eraser', 'rice', 'tape measure/ruler', 'dumbbell', 'hamimelon', 'stapler', 'camel', 'lettuce', 'goldfish', 'meat balls', 'medal', 'toothpaste', 'antelope', 'shrimp', 'rickshaw', 'trombone', 'pomegranate', 'coconut', 'jellyfish', 'mushroom', 'calculator', 'treadmill', 'butterfly', 'egg tart', 'cheese', 'pig', 'pomelo', 'race car', 'rice cooker', 'tuba', 'crosswalk sign', 'papaya', 'hair drier', 'green onion', 'chips', 'dolphin', 'sushi', 'urinal', 'donkey', 'electric drill', 'spring rolls', 'tortoise/turtle', 'parrot', 'flute', 'measuring cup', 'shark', 'steak', 'poker card', 'binoculars', 'llama', 'radish', 'noodles', 'yak', 'mop', 'crab', 'microscope', 'barbell', 'bread/bun', 'baozi', 'lion', 'red cabbage', 'polar bear', 'lighter', 'seal', 'mangosteen', 'comb', 'eraser', 'pitaya', 'scallop', 'pencil case', 'saw', 'table tennis paddle', 'okra', 'starfish', 'eagle', 'monkey', 'durian', 'game board', 'rabbit', 'french horn', 'ambulance', 'asparagus', 'hoverboard', 'pasta', 'target', 'hotair balloon', 'chainsaw', 'lobster', 'iron', 'flashlight')
get_img_info(index)[source]
supported_order = ('image', 'boxes', 'boxes_category', 'info')

megengine.data.dataset.vision.utils

megengine.data.dataset.vision.utils.calculate_md5(filename)[source]
megengine.data.dataset.vision.utils.is_img(filename)[source]
megengine.data.dataset.vision.utils.load_raw_data_from_url(url, filename, target_md5, raw_data_dir, timeout)[source]
megengine.data.dataset.vision.utils.untar(path, to=None, remove=False)[source]
megengine.data.dataset.vision.utils.untargz(path, to=None, remove=False)[source]

megengine.data.dataset.vision.voc

class megengine.data.dataset.vision.voc.PascalVOC(root, image_set, *, order=None)[source]

Bases: megengine.data.dataset.vision.meta_vision.VisionDataset

Pascal VOC Dataset.

class_names = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor')
get_img_info(index, image=None)[source]
parse_voc_xml(node)[source]
supported_order = ('image', 'boxes', 'boxes_category', 'mask', 'info')