launcher¶
- class launcher(*args, **kwargs)[source]¶
Decorator for launching multiple processes in single-machine/multi-machine multi-gpu training.
- Parameters
  - func (Callable) – the function you want to launch in distributed mode.
  - n_gpus (Number) – how many devices each node has. If n_gpus is None, n_gpus will be the device count of the current node. Default: None.
  - world_size (Number) – how many devices in total. If world_size is None, world_size will be n_gpus. Default: None.
  - rank_start (Number) – the start rank number on the current node. For single-machine multi-GPU training, rank_start should be 0. For multi-machine training, rank_start of machine i should be i * n_gpus. Default: 0.
  - master_ip (str) – IP address of the master node (where rank 0 is placed). Default: “localhost”.
  - port (Number) – server port for the distributed server. Default: 0.
  - backend (str) – set the default collective communication backend; backend should be “nccl” or “rccl”. Default: “nccl”.
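A minimal usage sketch, assuming a single machine with 2 GPUs; the worker body and the n_gpus=2 value are illustrative, not taken from the library's own examples:

    import megengine.distributed as dist

    # Launch one process per GPU; each process runs worker() in
    # distributed mode. (n_gpus=2 is an assumption for this sketch.)
    @dist.launcher(n_gpus=2)
    def worker():
        # Each launched process has its own rank within the world.
        print(f"rank {dist.get_rank()} of {dist.get_world_size()}")

    worker()

For multi-machine training, each machine would run the same program with world_size set to the total device count, rank_start set to i * n_gpus on machine i, and master_ip pointing at the node hosting rank 0.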
See also
Examples of distributed training using the launcher decorator can be found in _distributed-guide.