Defined in File var_node.h
Node for a variable.
It must be the output of exactly one OperatorNode and may be the input to multiple other OperatorNodes.
Each variable has an owner: the operator that generates this variable as one of its outputs.
The VarNode class exposes the most commonly used memory-management interfaces.
do not allocate memory via the system allocator, even if the shape could be inferred
do not allocate memory if value of this var is not used (i.e. VarReceiverInfo::value_used() returns false)
do not allocate memory statically (it would be allocated dynamically if possible); useful if a var in a subgraph would be directly forwarded to a var in the owner graph (e.g. in the case of LAST output mode in Loop)
do not reclaim memory: if NO_SYS_MEM_ALLOC is set or this var has dynamic storage, memory would not be reclaimed after all readers are processed; if this var has static storage, its memory would not be reused by others
var node used as temporary storage, whose content should not be read by others
allow this var to have empty shape, which means it would not consume any memory and it has nullptr as the underlying pointer; vars without this flag set would trigger an error during memory allocation to avoid uninitialized output var shape. This flag should be set by the owner opr.
value is always available on device even before opr is executed (e.g. SharedDeviceTensor), so various optimizations can be performed
disallow RT_FORCE_DYNAMIC_MEM_ALLOC added to this node during memory optimization; this is only applicable when the operator manages memory of this var manually, and the memory is never reclaimed. Must be used with NO_MEM_RECLAIM.
disable sanity check for this VarNode. This flag was added for swap_memory: the SwapInMS opr works as a trigger that makes its output VarNode start copying from host in parallel; when SwapInMS finishes execution, its output tensor is likely not to have the 'exact' content yet, so var_sanity_check must be disabled in this case
force dynamic memory allocation even if shape could be statically inferred; conflicts with NO_SYS_MEM_ALLOC
note that this is a runtime flag, which would be cleared and re-evaluated on graph compiling; it is set up by VarNodeMemManager and propagated along the imply chains
this flag indicates that the opr has been inserted into the graph and certain flags cannot be modified. Only the NO_MEM_RECLAIM, NO_SYS_STATIC_MEM_ALLOC and RT_FORCE_DYNAMIC_MEM_ALLOC flags can be added after FLAG_FREEZED is present.
this flag indicates that the data of this var has been processed and is no longer needed, so it can be freed; this is used in weight preprocessing to save memory
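The FLAG_FREEZED restriction above can be illustrated with a minimal, self-contained sketch. Note that VarFlags, its members and the flag values here are hypothetical stand-ins for illustration, not the actual MegBrain types:

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical sketch of the flag bitmask and the FLAG_FREEZED rule;
// only the flag names are taken from the documentation above.
enum Flag : uint32_t {
    NO_SYS_MEM_ALLOC           = 1u << 0,
    NO_MEM_RECLAIM             = 1u << 1,
    NO_SYS_STATIC_MEM_ALLOC    = 1u << 2,
    RT_FORCE_DYNAMIC_MEM_ALLOC = 1u << 3,
    FLAG_FREEZED               = 1u << 4,
};

struct VarFlags {
    uint32_t bits = 0;

    // Only these flags may still be added once FLAG_FREEZED is set.
    static constexpr uint32_t POST_FREEZE_MASK =
            NO_MEM_RECLAIM | NO_SYS_STATIC_MEM_ALLOC |
            RT_FORCE_DYNAMIC_MEM_ALLOC;

    VarFlags& add(Flag f) {
        if ((bits & FLAG_FREEZED) && !(f & POST_FREEZE_MASK) &&
            f != FLAG_FREEZED)
            throw std::runtime_error("flag cannot be modified after freeze");
        bits |= f;
        return *this;
    }

    bool contains(Flag f) const { return (bits & f) == f; }
};
```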
this constructor should only be called by OperatorNodeBase::add_output
implemented in core/impl/graph/operator_node.cpp
add a callback function to check the validity of a particular tensor layout
If callback returns true, it means that this VarNode’s dev_tensor with given layout may be forwarded to opr directly, otherwise it will be implicitly rearranged to a contiguous one.
requires the layout to be contiguous
Note: since many oprs require inputs to be contiguous, this is implemented by marking a flag on the var rather than adding a LayoutConstraintCallback to check whether it is contiguous. All the existing callbacks would be cleared and new callbacks would be ignored after add_layout_constraint_contiguous() is invoked.
requires the layout to be monotone while allowing broadcast
Note: similar to add_layout_constraint_contiguous() this is implemented by marking a flag; however user-defined callbacks are still invoked since they might impose stronger constraints.
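The two constraint modes above can be sketched as follows. LayoutConstraints and TensorLayout are hypothetical stand-ins, used only to illustrate how add_layout_constraint_contiguous() clears existing callbacks and makes later-added ones irrelevant:

```cpp
#include <functional>
#include <vector>

// Stand-in for the real tensor layout type; only the property the
// sketch needs is modeled.
struct TensorLayout {
    bool contiguous = false;
};

using LayoutConstraintCallback = std::function<bool(const TensorLayout&)>;

struct LayoutConstraints {
    bool require_contiguous = false;
    std::vector<LayoutConstraintCallback> callbacks;

    // Mirrors add_layout_constraint_contiguous(): marks a flag,
    // clears existing callbacks.
    void add_contiguous() {
        require_contiguous = true;
        callbacks.clear();
    }

    void add_callback(LayoutConstraintCallback cb) {
        if (!require_contiguous)  // ignored once the flag is set
            callbacks.push_back(std::move(cb));
    }

    // True if the layout may be forwarded directly; otherwise the
    // tensor would be rearranged to a contiguous one.
    bool check(const TensorLayout& layout) const {
        if (require_contiguous)
            return layout.contiguous;
        for (auto&& cb : callbacks)
            if (!cb(layout))
                return false;
        return true;
    }
};
```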
MGB_WARN_UNUSED_RESULT bool set_fwd_in2out_readonly (VarNode *input, const SubTensorSpec &sub)
request that memory should be readonly forwarded from other var
Note that this function must be called from OperatorNodeBase::mem_plan_fwd_in2out_readonly.
whether this request could be satisfied
request that this var share memory with another var, whose content would also be modified
Note that this function must be called from OperatorNodeBase::mem_plan_fwd_in2out_writable.
require this var to share memory from another var; only used for operators that have an explicit updating semantics
Note that this function must be called during operator node initialization
get name; if name is not valid, get name of owner opr
get name as C-string
whether the name is explicitly set
set name explicitly
get data type of data in this var
get tensor format in this var
set dtype; this function can only be called once
set format; this function can only be called once
get the underlying device tensor to fill data
get the underlying device tensor that can be modified (e.g. resized)
This should only be called from the owner opr of this var, and this var must have flag NO_SYS_MEM_ALLOC.
previous dev ptr before deallocating dev_tensor; used for testing and debugging
get the comp node on which this var is computed
set comp node; only the memory node could be changed if called multiple times
get current reference count; not thread safe, and only used for testing purposes
reset VarNode shape
whether shape differs from old shape
add a callback to be executed when shape of this var is updated
tag: callback tag; each tag can have at most one callback
enum Flag : uint32_t
VarNode& add_flag(Flag flag)
set shape and alloc memory storage
This function should only be called by this var’s owner operator and this var must have NO_SYS_MEM_ALLOC flag; if shape does not increase and original tensor storage is valid, it is guaranteed that old data would be retained.
Allocate memory of size size_req if size_req != 0.
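The retain-on-shrink guarantee described above can be sketched as follows; SimpleVar is a hypothetical stand-in, and real storage management is far more involved:

```cpp
#include <cstddef>
#include <vector>

// Sketch of shape_alloc(): if the requested size does not exceed the
// current allocation, storage is untouched and old data are retained.
struct SimpleVar {
    std::vector<float> buf;   // stand-in for device storage
    std::size_t shape = 0;

    void shape_alloc(std::size_t new_shape, std::size_t size_req = 0) {
        std::size_t need = size_req ? size_req : new_shape;
        if (need > buf.size())
            buf.resize(need);  // grow: a fresh, larger allocation
        shape = new_shape;     // shrink: buffer and contents survive
    }
};
```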
MGB_WARN_UNUSED_RESULT bool reset_dev_tensor_from_other_var (VarNode *src_var)
directly reset device tensor from another var
This function should only be called by this var's owner operator, and this var must have the NO_SYS_MEM_ALLOC flag. It can be used to forward var values within the same graph or between graphs. If both src_var and this var belong to the same graph, memory forwarding may fail (e.g. when src_var is force updated by another opr)
whether memory forwarding succeeds; if false is returned, a new tensor would be allocated and its value is copied from src
src_var: the var node to provide dev tensor, which must have been initialized, and does not have to be in the same computing graph. Its value must be contiguous. It can also be placed on a different comp node.
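The forward-or-copy semantics above can be sketched as follows. Tensor and reset_from_other are hypothetical stand-ins, and the failure condition is reduced to a single flag for illustration:

```cpp
#include <memory>
#include <vector>

// Stand-in tensor: shared storage plus the one condition the sketch
// models (src being force updated by another opr).
struct Tensor {
    std::shared_ptr<std::vector<float>> storage;
    bool force_updated = false;
};

// Returns whether memory forwarding succeeded; on failure a new
// tensor is allocated and the value is copied, as documented above.
bool reset_from_other(Tensor& dst, const Tensor& src) {
    if (!src.force_updated) {
        dst.storage = src.storage;  // zero-copy memory forwarding
        return true;
    }
    // Fallback: fresh allocation plus value copy.
    dst.storage = std::make_shared<std::vector<float>>(*src.storage);
    return false;
}
```

Callers are expected to inspect the returned bool (hence MGB_WARN_UNUSED_RESULT on the real method) rather than assume zero-copy forwarding took place.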
directly reset device tensor from a given tensor
This function should only be called by this var’s owner operator and this var must have NO_SYS_MEM_ALLOC flag
value: the tensor to be used; it must be contiguous and be placed on the same comp node of this var.
add a var whose RT_FORCE_DYNAMIC_MEM_ALLOC flag should also be set when that flag is added to this var
The chains form a directed graph, and when RT_FORCE_DYNAMIC_MEM_ALLOC is added to a var by VarNodeMemManager, the flag would be added to all nodes in its connected component.
This method should be called from OperatorNodeBase::init_rt_force_dynamic_mem_alloc_imply_chain impls.
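The connected-component propagation described above can be sketched as follows; ImplyGraph is a hypothetical stand-in that keeps an undirected view of the chains so the flag spreads across the whole component:

```cpp
#include <cstddef>
#include <vector>

struct ImplyGraph {
    // Undirected adjacency: each chain edge is stored in both
    // directions, since propagation ignores edge orientation.
    std::vector<std::vector<std::size_t>> adj;

    explicit ImplyGraph(std::size_t n) : adj(n) {}

    void add_chain(std::size_t from, std::size_t to) {
        adj[from].push_back(to);
        adj[to].push_back(from);
    }

    // Mark every var reachable from `start` (iterative traversal),
    // i.e. the whole connected component.
    std::vector<bool> propagate(std::size_t start) const {
        std::vector<bool> flagged(adj.size(), false);
        std::vector<std::size_t> stack{start};
        flagged[start] = true;
        while (!stack.empty()) {
            std::size_t cur = stack.back();
            stack.pop_back();
            for (std::size_t next : adj[cur])
                if (!flagged[next]) {
                    flagged[next] = true;
                    stack.push_back(next);
                }
        }
        return flagged;
    }
};
```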
initialize mem plan as a uniquely owned contiguous chunk
this function should only be called from OperatorNodeBase::init_output_mem_plan, and shape and comp_node must have been set up.
fixed_alloc: if not null, it should be a tensor providing memory allocation for this var.