Struct ComputingGraph::Options

Struct Documentation

struct mgb::cg::ComputingGraph::Options

Public Functions

const OprAttribute &get_opr_attribute(OperatorNodeBase *opr) const

get attribute for an operator

Public Members

struct mgb::cg::ComputingGraph::Options::OprAttribute opr_attribute
struct mgb::cg::ComputingGraph::Options::SeqOpt seq_opt
mgb::cg::ComputingGraph::Options::GraphOpt graph_opt
int16_t graph_opt_level = 2

graph optimization level:

  0: disable

  1: level-1: inplace arith transformations during graph construction

  2: level-2: level-1, plus global optimization before graph compiling

  3: also enable JIT

  <0: corresponding level, with result check for debug
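The sign convention for graph_opt_level can be illustrated with a small standalone sketch. The type DecodedOptLevel and the helper functions below are hypothetical and are not part of MegEngine. They only restate the level table above, under the assumption that level 3 includes everything level 2 does.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper type (not a MegEngine type): the meaning of a
// graph_opt_level value as described in the documentation above.
struct DecodedOptLevel {
    int level;          // magnitude of the option value: 0, 1, 2 or 3
    bool check_result;  // true when the option value was negative
};

// A negative value selects the level given by its magnitude and also
// requests result checking for debugging.
inline DecodedOptLevel decode_graph_opt_level(int16_t graph_opt_level) {
    DecodedOptLevel d;
    d.level = std::abs(static_cast<int>(graph_opt_level));
    d.check_result = graph_opt_level < 0;
    return d;
}

// Level 1 and above: inplace arith transformations during construction.
inline bool has_inplace_arith(const DecodedOptLevel& d) { return d.level >= 1; }

// Level 2 and above: global optimization before graph compiling.
inline bool has_global_opt(const DecodedOptLevel& d) { return d.level >= 2; }

// Level 3: JIT is enabled as well (assumed to be cumulative).
inline bool has_jit(const DecodedOptLevel& d) { return d.level >= 3; }
```

For example, decode_graph_opt_level(-2) describes level 2 with result checking turned on.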

int16_t allreduce_pack_max_size = 0

max size of allreduce packs, in MB; set this option to zero to disable PackAllReducePass

int16_t allreduce_pack_ignore_first = 2

do not pack the first n allreduces; PackAllReducePass is disabled if allreduce_pack_max_size is zero

uint16_t log_level = 1

set logging level, larger number means more verbose

  0: no log info

  1: static memory allocation status, WorkspaceLimitGetter summary, optimizer summary

  2: optimizer var replace details during graph compiling, duplicated operator

uint16_t async_exec_level = 1

async exec: dispatch on separate threads for different comp_node

  0: do not perform async dispatch

  1: dispatch async if there is more than one comp node with limited queue

  mask 0b10: async if there are multiple comp nodes with

  mask 0b100: always async
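Because async_exec_level mixes a plain value (1) with bit masks (0b10, 0b100), it may help to view it as a set of flags. The sketch below is a standalone illustration, not MegEngine code. AsyncExecFlags and both functions are hypothetical names, and treating the value 1 as bit 0 of the same mask is an assumption that the text above does not state explicitly.

```cpp
#include <cstdint>

// Hypothetical view of async_exec_level as independent flags.
struct AsyncExecFlags {
    bool multi_comp_node_limited_queue;  // value 1, assumed to be bit 0
    bool multi_comp_node;                // mask 0b10
    bool always;                         // mask 0b100
};

// Split the option value into its documented bits.
inline AsyncExecFlags decode_async_exec_level(uint16_t level) {
    AsyncExecFlags f;
    f.multi_comp_node_limited_queue = (level & 0x1) != 0;
    f.multi_comp_node = (level & 0x2) != 0;
    f.always = (level & 0x4) != 0;
    return f;
}

// True when any documented async mode is requested; a value of 0 means
// async dispatch is not performed.
inline bool any_async(const AsyncExecFlags& f) {
    return f.multi_comp_node_limited_queue || f.multi_comp_node || f.always;
}
```

Under this reading, the default value 1 requests async dispatch only in the multi comp node, limited queue case.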

bool force_dynamic_alloc = false

force dynamic memory alloc for all vars

bool var_sanity_check_first_run = true

whether to perform var sanity check on first run

bool allocate_static_mem_after_graph_compile = false

whether to allocate static memory just after compiling graph

bool fake_next_exec = false

whether only to perform non-computing tasks (like memory allocation and queue initialization) for next exec. This would be reset to false when the graph is executed.

bool enable_sublinear_memory_opt = false

whether to enable sublinear memory optimization

struct mgb::cg::ComputingGraph::Options::SublinearMemConfig sublinear_mem_config
bool no_profiling_on_shape_change = false

do not re-profile to select best impl algo when input shape changes (use previous algo)

bool enable_var_mem_defragment = true

whether to perform defragmenting when memory allocation for a dynamic var fails

bool enable_grad_var_static_reshape = false

whether to reshape grad var whose wrt shape is statically inferrable but its own shape is dynamic

bool enable_memory_swap = false

whether to enable memory swap. Because swap performs much worse than sublinear memory optimization, it is recommended to try sublinear first

uint8_t comp_node_seq_record_level = 0

whether to use CompNodeSeqRecorder to record the execution sequence and directly replay it for later executions.

Level 1 is mainly used to speed up execution (especially for opencl); level 2 is used for reducing memory usage.

Level 1 constraints:

  1. All vars must be statically allocated

  2. Host input/output buffer pointers can not be changed if shape is not changed (this is not checked in execution for efficiency considerations; this is potentially dangerous)

  3. Synchronization can only occur at the end of execution

  4. Not all comp node implementations support recording computing sequence

  5. Only one comp node can be used in the graph

Level 2: besides recording the computing sequence, the dependencies are also moved into the compiled func (see GraphExecutable::ExecDependency). Additional constraints:

  1. Shapes can not change

  2. both fake_next_exec and var_sanity_check_first_run must be disabled

  3. Var shapes must be correctly setup before calling compile()
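The level 2 constraint on fake_next_exec and var_sanity_check_first_run can be checked mechanically before compiling. The sketch below is a standalone illustration, not MegEngine code. RecordOptionsSketch is a hypothetical stand-in that copies only the three relevant fields and their documented defaults, and record_level_conflicts is a hypothetical helper.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant fields of ComputingGraph::Options,
// using the default values listed in this document.
struct RecordOptionsSketch {
    uint8_t comp_node_seq_record_level = 0;
    bool fake_next_exec = false;
    bool var_sanity_check_first_run = true;
};

// Return a message for each option that conflicts with record level 2.
// Shape-related constraints are not covered; they depend on the graph.
inline std::vector<std::string> record_level_conflicts(
        const RecordOptionsSketch& opt) {
    std::vector<std::string> problems;
    if (opt.comp_node_seq_record_level >= 2) {
        if (opt.fake_next_exec) {
            problems.push_back("fake_next_exec must be disabled");
        }
        if (opt.var_sanity_check_first_run) {
            problems.push_back("var_sanity_check_first_run must be disabled");
        }
    }
    return problems;
}
```

Note that var_sanity_check_first_run defaults to true, so raising comp_node_seq_record_level to 2 without also clearing that flag would report one conflict in this sketch.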

bool eager_evaluation = false

whether to evaluate var node values as they are inserted

bool imperative_proxy_graph = false
bool no_force_inplace = false

Request that operators should not force update their inputs.

THIS FLAG IS RESERVED FOR INTERNAL USE

When this flag is set, operators like AddUpdate and BatchNorm will still attempt to inplace update their inputs, but failing to do so will not be considered as an error.

ThinHashMap<VarNode*, VarNodeArray> extra_vardeps

add extra deps for the comp seq if a specific var is dependent

UserDataContainer user_data

contains any user data associated with this graph

struct GraphOpt : public mgb::cg::GraphCommonOptimizeOptions

graph optimization options

Public Members

uint8_t jit = 0

whether to enable JIT; JIT would also be enabled at O3. This value indicates JIT level: 1 for basic elemwise opr; 2 for including reduce oprs

bool tensorrt = false

whether to enable fine-grained TensorRT opr replace

struct OprAttribute

attribute for a specific operator

struct SeqOpt

sequence compile optimization options

Public Members

bool enable_mem_plan_opt = true

whether to enable memory forwarding to optimize mem plans

bool enable_mem_reuse_alloc = true

whether to enable static memory reuse (i.e. using optimized static memory allocation algorithm)

bool enable_seq_comp_node_opt = true

whether to enable comp node optimization (e.g. using copy stream for I/O operators)

struct SublinearMemConfig

Control parameter for sublinear memory optimization.

Public Members

int thresh_nr_try = 10
int genetic_nr_iter = 0
int genetic_pool_size = 20
int lb_memory = 0
int num_worker = sys::get_cpu_count() / 2