all_reduce_sum(inp, group=WORLD, device=None)

Create an all_reduce_sum operator for collective communication.

This operator sums the tensor data element-wise across all processes in the specified group and returns a tensor with the same shape as the input tensor.

  • inp (Tensor) – The tensor data to apply this operator on.

  • group (Optional[Group]) – The communication group, an instance of Group, across which to apply this operator. The default group is WORLD, which means all available processes.

  • device (Optional[str]) – The device type, given as a str, on which to execute this operator. Specify "cpu" or "gpu" to execute this operator on a specific device type. The default is None, which means the device of inp will be used.


Returns the element-wise sum of the input tensor data across the specified group.
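The semantics described above can be sketched in a single process, without MegEngine or any GPUs. This is only an illustration under stated assumptions: `simulate_all_reduce_sum` is a hypothetical helper, not part of MegEngine, and flat Python lists stand in for the tensors held by each rank. It shows the two properties the description names: the sum is taken position by position, and every rank receives a result with the same shape as its input.

```python
def simulate_all_reduce_sum(per_rank_values):
    """Hypothetical helper (not a MegEngine API): emulate all_reduce_sum.

    per_rank_values[i] is a flat list standing in for the tensor on rank i.
    Returns one list per rank; each is the position-wise sum over all ranks.
    """
    if not per_rank_values:
        raise ValueError("need at least one rank")
    if len({len(values) for values in per_rank_values}) != 1:
        raise ValueError("all ranks must contribute tensors of the same shape")
    # zip(*...) groups the values at the same position across ranks.
    total = [sum(column) for column in zip(*per_rank_values)]
    # Every rank gets its own copy of the same reduced result.
    return [list(total) for _ in per_rank_values]


# Four simulated ranks; rank r holds the values [r, 10 * r].
inputs = [[rank, 10 * rank] for rank in range(4)]
outputs = simulate_all_reduce_sum(inputs)
```

With four simulated ranks, the first position sums the rank ids 0 + 1 + 2 + 3, and each rank ends up holding the same two-element result.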




import megengine as mge
import megengine.distributed as dist
import numpy as np
from warnings import warn

def func(sum_value):
    # get the rank of this process; the ranks should be 0, 1, 2, 3 for a 4-GPU task
    rank = dist.get_rank()
    data = mge.tensor(rank)
    # with n processes, the sum of ranks 0..n-1 is n * (n - 1) / 2 on every process
    result = mge.functional.distributed.all_reduce_sum(data).item()
    assert result == sum_value

def main():
    p_num = mge.device.get_device_count("gpu")
    if p_num < 2:
        warn('This operator only works on a group with more than one GPU')
    method = dist.launcher(func)
    method(p_num * (p_num - 1) // 2)

if __name__ == '__main__':