megengine.data.DataLoader

class DataLoader(dataset, sampler=None, transform=None, collator=None, num_workers=0, timeout=0, timeout_event=raise_timeout_error, divide=False, preload=False)

Provides a way to iterate over a given dataset and fetch its data.

DataLoader combines a dataset with a Sampler, a Transform and a Collator, making it flexible to fetch minibatches continually from the dataset.
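The way these components fit together can be sketched in plain Python. This is a conceptual illustration only, not MegEngine's implementation: `sequential_batches`, `scale`, `collate` and `iterate_minibatches` are hypothetical names introduced here, and applying the transform sample by sample is an assumption made to keep the sketch small.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label); a toy sample type for this sketch


def sequential_batches(n: int, batch_size: int) -> Iterator[List[int]]:
    """Hypothetical sampler: yield batches of indices in order."""
    for start in range(0, n, batch_size):
        yield list(range(start, min(start + batch_size, n)))


def scale(sample: Sample) -> Sample:
    """Hypothetical transform: double the feature, keep the label."""
    x, y = sample
    return (x * 2.0, y)


def collate(batch: Sequence[Sample]) -> Tuple[List[float], List[int]]:
    """Hypothetical collator: merge samples into (features, labels)."""
    xs = [x for x, _ in batch]
    ys = [y for _, y in batch]
    return xs, ys


def iterate_minibatches(
    dataset: Sequence[Sample],
    batch_size: int,
    transform: Callable[[Sample], Sample],
    collator: Callable[[Sequence[Sample]], Tuple[List[float], List[int]]],
) -> Iterator[Tuple[List[float], List[int]]]:
    """Sample indices, fetch the samples, transform them, then merge them."""
    for indices in sequential_batches(len(dataset), batch_size):
        samples = [dataset[i] for i in indices]
        transformed = [transform(s) for s in samples]
        yield collator(transformed)


dataset = [(float(i), i % 2) for i in range(5)]
batches = list(iterate_minibatches(dataset, 2, scale, collate))
```

Five samples with a batch size of 2 give three minibatches here, the last one holding a single sample.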

Parameters
  • dataset (Dataset) – the dataset from which minibatches are loaded.

  • sampler (Optional[Sampler]) – defines the strategy for sampling data from the dataset. Default: None

  • transform (Optional[Transform]) – defines the transforming strategy for a sampled batch. Default: None

  • collator (Optional[Collator]) – defines the merging strategy for a transformed batch. Default: None

  • num_workers (int) – the number of sub-processes used to load, transform and collate batches. 0 means loading in a single process. Default: 0

  • timeout (int) – if positive, the timeout in seconds for collecting a batch from the workers. Default: 0

  • timeout_event (Callable) – callback function triggered on timeout. By default it raises a runtime error.

  • divide (bool) – defines the parallelization strategy in multi-processing mode. True means each batch is divided into num_workers pieces, which the workers process in parallel. False means different sub-processes process different batches. Default: False

  • preload (bool) – defines whether to apply the dataloader's preloading strategy, which runs the host-to-device copy in parallel with kernel execution to improve loading speed. Default: False. When enabled, the output changes from np.ndarray to Tensor. The types supported for preload are int, float, list[int, float] and tuple[int, float]; other types are not supported.
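The two strategies selected by divide can be contrasted with a small plain-Python sketch. The source does not state how MegEngine cuts a batch or assigns batches to workers, so the near-equal split and the round-robin assignment below are assumptions, and `split_batch` and `assign_batches` are hypothetical helpers rather than MegEngine functions.

```python
from typing import List


def split_batch(indices: List[int], num_workers: int) -> List[List[int]]:
    """divide=True (sketch): cut one batch into num_workers near-equal pieces."""
    size, extra = divmod(len(indices), num_workers)
    pieces, start = [], 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional index each.
        end = start + size + (1 if worker < extra else 0)
        pieces.append(indices[start:end])
        start = end
    return pieces


def assign_batches(batches: List[List[int]], num_workers: int) -> List[List[List[int]]]:
    """divide=False (sketch): give whole batches to workers in round-robin order."""
    assignment: List[List[List[int]]] = [[] for _ in range(num_workers)]
    for i, batch in enumerate(batches):
        assignment[i % num_workers].append(batch)
    return assignment


pieces = split_batch(list(range(8)), 3)
assignment = assign_batches([[0, 1], [2, 3], [4, 5]], 2)
```

In the first case every worker contributes to each batch. In the second, each worker produces complete batches on its own.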

Methods