Class VarNode

Inheritance Relationships

Base Type

  • public GraphNodeBase

Class Documentation

class mgb::cg::VarNode : public GraphNodeBase

Node for a variable.

It must be the output of exactly one OperatorNode and may be an input to other OperatorNodes.

Each variable has an owner: the operator that generates this variable as one of its outputs.

The VarNode class exposes the most commonly used memory-management interfaces.
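
As a quick orientation, the sketch below queries the most common properties of a var. It is a fragment meant to live inside MegBrain code; the opr->output(0) accessor on the owner opr is an assumption, not part of this class.

    VarNode* var = opr->output(0);        // output accessor on the opr (assumed)
    mgb_assert(var->owner_opr() == opr);  // every var knows its owner
    mgb_log("%s: dtype=%s shape=%s comp_node=%s",
            var->cname(), var->dtype().name(),
            var->shape().to_string().c_str(),
            var->comp_node().to_string().c_str());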

Public Types

enum Flag : uint32_t

Values:

enumerator NO_SYS_MEM_ALLOC = 1 << 0

do not allocate memory by the system allocator even if shape could be inferred

enumerator NO_ALLOC_IF_UNUSED = 1 << 1

do not allocate memory if value of this var is not used (i.e. VarReceiverInfo::value_used() returns false)

enumerator NO_SYS_STATIC_MEM_ALLOC = 1 << 2

do not allocate memory statically (would be allocated dynamically if possible); useful if a var in subgraph would be directly forwarded to a var in owner graph (e.g. in case for LAST output mode in Loop)

enumerator NO_MEM_RECLAIM = 1 << 3

do not reclaim memory: if NO_SYS_MEM_ALLOC is set or this var has dynamic storage, memory would not be reclaimed after all readers are processed; if this var has static storage, its memory would not be reused by others

enumerator VOLATILE_CONTENT = 1 << 4

var node used as temporary storage, whose content should not be read by others

enumerator ALLOW_EMPTY_SHAPE = 1 << 5

allow this var to have empty shape, which means it would not consume any memory and it has nullptr as the underlying pointer; vars without this flag set would trigger an error during memory allocation to avoid uninitialized output var shape. This flag should be set by the owner opr.

enumerator PERSISTENT_DEVICE_VALUE = 1 << 6

value is always available on device even before opr is executed (e.g. SharedDeviceTensor), so various optimizations can be performed

enumerator DISALLOW_RT_FORCE_DYNAMIC_MEM_ALLOC = 1 << 7

disallow RT_FORCE_DYNAMIC_MEM_ALLOC to be added to this node during memory optimization; this is only applicable when the operator manages the memory of this var manually and the memory is never reclaimed. Must be used with NO_MEM_RECLAIM.

enumerator DISALLOW_VAR_SANITY_CHECK = 1 << 8

disable sanity check for this VarNode. This flag is added for swap_memory: a SwapInMS opr works as a trigger that makes its output VarNode start copying from host in parallel; when SwapInMS finishes execute(), its output tensor is likely not to have the ‘exact’ content yet, so var_sanity_check must be disabled in this case

enumerator RT_FORCE_DYNAMIC_MEM_ALLOC = 1 << 9

force dynamic memory allocation even if shape could be statically inferred; conflicts with NO_SYS_MEM_ALLOC

note that this is a runtime flag, which would be cleared and re-evaluated on graph compiling; it is set up by VarNodeMemManager and propagated through the imply chains (see add_rt_force_dynamic_mem_alloc_imply_chain() below)

enumerator FLAG_FREEZED = 1 << 10

this flag indicates that the opr has been inserted into the graph, so certain flags cannot be modified anymore. Only the NO_MEM_RECLAIM, NO_SYS_STATIC_MEM_ALLOC and RT_FORCE_DYNAMIC_MEM_ALLOC flags can be added after FLAG_FREEZED is present.

enumerator MEMORY_NO_NEED = 1 << 11

this flag indicates that the data of this var has been processed and is not needed anymore, so it can be freed; this is used in weight preprocessing to save memory
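
Since each value occupies a single bit, flags can be combined by chained add_flag() calls and tested with contain_flag() (both documented under Public Functions below). A minimal sketch, assuming out_var is an output var being configured by its owner opr:

    out_var->add_flag(VarNode::Flag::NO_SYS_MEM_ALLOC)
            .add_flag(VarNode::Flag::ALLOW_EMPTY_SHAPE);  // add_flag returns VarNode&
    if (!out_var->contain_flag(VarNode::Flag::NO_MEM_RECLAIM)) {
        // storage may be reused once all readers are processed
    }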

using LayoutConstraintCallback = thin_function<bool(const TensorLayout&)>

Public Functions

VarNode(Maybe<std::string> name, OperatorNodeBase *owner)

this constructor should only be called by OperatorNodeBase::add_output

implemented in core/impl/graph/operator_node.cpp

VarNode &add_layout_constraint(LayoutConstraintCallback callback)

add a callback function to check the validity of a particular tensor layout

If the callback returns true, this VarNode’s dev_tensor with the given layout may be forwarded to the opr directly; otherwise it would be implicitly rearranged to a contiguous one.
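
A minimal sketch of a custom constraint: accept only layouts whose innermost dimension is densely packed (stride 1), so the consuming kernel can read the data directly; any other layout would be rearranged to a contiguous one by the framework.

    var->add_layout_constraint([](const TensorLayout& layout) {
        // forward only if the last dimension is dense
        return layout.ndim >= 1 && layout.stride[layout.ndim - 1] == 1;
    });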

VarNode &add_layout_constraint_contiguous()

requires the layout to be contiguous

Note: since many oprs require inputs to be contiguous, this is implemented by marking a flag on the var rather than adding a LayoutConstraintCallback to check whether it is contiguous. All the existing callbacks would be cleared and new callbacks would be ignored after add_layout_constraint_contiguous() is invoked.

VarNode &add_layout_constraint_monotone()

requires the layout to be monotone while allowing broadcast

Note: similar to add_layout_constraint_contiguous(), this is implemented by marking a flag; however, user-defined callbacks are still invoked since they might impose stronger constraints.
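
Both convenience constraints are typically requested when an opr declares its input requirements. A minimal sketch, assuming the usual OperatorNodeBase hook add_input_layout_constraint() as the call site:

    void MyOpr::add_input_layout_constraint() {
        input(0)->add_layout_constraint_contiguous();  // kernel needs dense data
        input(1)->add_layout_constraint_monotone();    // broadcast input is fine
    }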

MGB_WARN_UNUSED_RESULT bool set_fwd_in2out_readonly(VarNode *input, const SubTensorSpec &sub)

request that memory should be readonly forwarded from another var

Note that this function must be called from OperatorNodeBase::mem_plan_fwd_in2out_readonly.

Return

whether this request could be satisfied
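
A minimal sketch of the corresponding override; MyOpr and the whole-tensor SubTensorSpec built via SubTensorSpec::make_from_layout() are illustrative assumptions. Since the result is marked MGB_WARN_UNUSED_RESULT, it must be checked:

    void MyOpr::mem_plan_fwd_in2out_readonly() {
        auto spec = SubTensorSpec::make_from_layout(input(0)->layout());
        if (!output(0)->set_fwd_in2out_readonly(input(0), spec)) {
            // request rejected: the output keeps its own memory plan
        }
    }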

VarNode &set_fwd_in2out_writable(VarNode *input)

request that this var share memory with another var, whose content would also be modified

Note that this function must be called from OperatorNodeBase::mem_plan_fwd_in2out_writable.

VarNode &set_fwd_in2out_writable_force(VarNode *input)

require this var to share memory with another var; only used for operators that have explicit updating semantics

Note that this function must be called during operator node initialization.
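
A minimal sketch of writable forwarding, assuming a hypothetical in-place opr (the hook name follows the note above):

    void MyInplaceOpr::mem_plan_fwd_in2out_writable() {
        // output(0) is computed in place on top of input(0)'s storage
        output(0)->set_fwd_in2out_writable(input(0));
    }

For an opr with explicit updating semantics, set_fwd_in2out_writable_force(input(0)) would instead be called once during operator node initialization.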

OperatorNodeBase *owner_opr() const
const std::string &name() const

get name; if no name has been explicitly set, the name of the owner opr is returned

const char *cname() const

get name as C-string

bool has_name_set() const

whether the name has been explicitly set

VarNode &name(std::string name)

set name explicitly

DType dtype() const

get data type of data in this var

TensorFormat format() const

get tensor format in this var

VarNode &dtype(DType dtype)

set dtype; this function can only be called once

VarNode &format(TensorFormat format)

set format; this function can only be called once
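
The typical call site for these setters is the owner opr's constructor, right after the output var is created. A minimal sketch, where add_output() follows the constructor note above and cn is an assumed target CompNode:

    VarNode* out = add_output("result");         // creates the VarNode
    out->dtype(dtype::Float32()).comp_node(cn);  // dtype may be set only once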

MemAllocPlan &mem_plan()
bool dev_tensor_valid() const
const DeviceTensorND &dev_tensor() const

get the underlying device tensor to fill data

DeviceTensorND &mutable_dev_tensor()

get the underlying device tensor that can be modified (e.g. resized)

This should only be called from the owner opr of this var, and this var must have flag NO_SYS_MEM_ALLOC.

const void *prev_dev_ptr() const

previous dev ptr before deallocating dev_tensor; used for testing and debugging

CompNode comp_node() const

get the comp node on which this var is computed

VarNode &comp_node(const CompNode &cn)

set comp node; if called multiple times, only the memory node can be changed

const TensorShape &shape() const
size_t refcnt() const

get current reference count; not thread safe, and only used for testing purposes

VarNode &shape(const TensorShape &shape)

reset VarNode shape

Return

whether shape differs from old shape

bool allow_shape_change() const
const TensorLayout &layout() const
void add_shape_update_callback(void *tag, thin_function<void(VarNode*)> cb)

add a callback to be executed when shape of this var is updated

Parameters
  • tag: callback tag; each tag can have at most one callback
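
A minimal sketch of registering such a callback from a hypothetical listener object (used as the tag, so re-registering replaces the previous callback):

    var->add_shape_update_callback(this, [](VarNode* v) {
        // runs whenever v's shape is updated
        mgb_log("shape of %s now has %zu dims", v->cname(), v->shape().ndim);
    });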

VarNode &add_flag(Flag flag)
bool contain_flag(Flag flag) const
VarNode &shape_alloc(const TensorShape &shape, size_t size_req = 0)

set shape and alloc memory storage

This function should only be called by this var’s owner operator, and this var must have the NO_SYS_MEM_ALLOC flag; if the shape does not increase and the original tensor storage is valid, it is guaranteed that old data would be retained.

Warning

size_req bytes of memory would be allocated if size_req != 0.
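
Together with NO_SYS_MEM_ALLOC and mutable_dev_tensor() above, this supports dynamically shaped execution. A minimal sketch, assuming a single-comp-node opr whose execution hook is scn_do_execute() and a hypothetical compute_output_shape() helper:

    void MyDynOpr::scn_do_execute() {
        TensorShape oshp = compute_output_shape();  // hypothetical helper
        output(0)->shape_alloc(oshp);               // set shape + alloc storage
        DeviceTensorND& dst = output(0)->mutable_dev_tensor();
        // ... launch the kernel writing into dst ...
    }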

MGB_WARN_UNUSED_RESULT bool reset_dev_tensor_from_other_var(VarNode *src_var)

directly reset device tensor from another var

This function should only be called by this var’s owner operator, and this var must have the NO_SYS_MEM_ALLOC flag. It can be used to forward var values within the same graph or between graphs. If both src_var and this var belong to the same graph, memory forwarding may fail (e.g. when src_var is force-updated by another opr).

Return

whether memory forwarding succeeds; if false is returned, a new tensor would be allocated and its value copied from src_var

Parameters
  • src_var: the var node to provide dev tensor, which must have been initialized, and does not have to be in the same computing graph. Its value must be contiguous. It can also be placed on a different comp node.
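
Because the result is marked MGB_WARN_UNUSED_RESULT, callers should check whether zero-copy forwarding actually happened. A minimal sketch:

    if (!output(0)->reset_dev_tensor_from_other_var(src_var)) {
        // forwarding failed: a fresh tensor was allocated and the value
        // was copied from src_var, so correctness is unaffected
    }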

VarNode &reset_dev_tensor_from_tensor(const DeviceTensorND &value)

directly reset device tensor from a given tensor

This function should only be called by this var’s owner operator, and this var must have the NO_SYS_MEM_ALLOC flag.

Parameters
  • value: the tensor to be used; it must be contiguous and placed on the same comp node as this var.

VarNode &add_rt_force_dynamic_mem_alloc_imply_chain(VarNode *dest)

add a var that would also get the RT_FORCE_DYNAMIC_MEM_ALLOC flag if that flag is added to this var

The chains form a directed graph; when RT_FORCE_DYNAMIC_MEM_ALLOC is added to a var by VarNodeMemManager, all nodes in its connected component would be given the flag as well.

This method should be called from OperatorNodeBase::init_rt_force_dynamic_mem_alloc_imply_chain impls.
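
A minimal sketch of such an implementation for a hypothetical opr that forwards its input storage to its output, so forcing dynamic allocation on either var implies it on the other:

    void MyOpr::init_rt_force_dynamic_mem_alloc_imply_chain() {
        input(0)->add_rt_force_dynamic_mem_alloc_imply_chain(output(0));
        output(0)->add_rt_force_dynamic_mem_alloc_imply_chain(input(0));
    }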

MemAllocPlan &init_mem_plan(const DeviceTensorND *fixed_alloc = nullptr)

initialize mem plan as a uniquely owned contiguous chunk

this function should only be called from OperatorNodeBase::init_output_mem_plan, and shape and comp_node must have been set up.

Parameters
  • fixed_alloc: if not null, it should be a tensor providing memory allocation for this var.

Friends

friend class static_infer::StaticInferManagerImpl
friend class imperative::ProxyGraph
friend class imperative::proxy_graph::ProxyGraph