Tensor element index¶
See also
Before reading this part, you need to know how to Access an element in Tensor and Use slices to get some elements.
Note
This section uses the following points as shorthand:
Slicing in MegEngine returns a new object (rather than sharing the same memory), and a slicing operation does not reduce the number of Tensor dimensions;
A multi-dimensional Tensor is indexed with syntax like ``a[i, j]``, and slice syntax like ``a[i:j, p:q]`` is also supported;
You can use the ellipsis ``...`` to automatically fill the remaining dimensions with complete slices. For example, for a 3-dimensional ``a``, ``a[i, ...]`` is equivalent to ``a[i, :, :]``.
Compared with NumPy index¶
Attention NumPy users!
Some concepts and designs existing in NumPy cannot be directly applied to MegEngine.
See also
In MegEngine, if you want to Access an element in Tensor, you can use the standard ``x[obj]`` syntax. This looks very similar to NumPy, whose official documentation also explains `the various indexing methods <https://numpy.org/doc/stable/reference/arrays.indexing.html>`_. However, MegEngine’s Tensor implementation still differs slightly from NumPy, and without knowing the details you may not be able to explain some of the behavior you see.
Indexed objects are different¶
MegEngine
>>> from megengine import Tensor
>>> x = Tensor([[1., 2.], [3., 4.]])
>>> y = x[0]
>>> y[1] = 6
>>> x
Tensor([[1. 2.]
[3. 4.]], device=xpux:0)
NumPy
>>> from numpy import array
>>> x = array([[1., 2.], [3., 4.]])
>>> y = x[0]
>>> y[1] = 6
>>> x
array([[1., 6.],
[3., 4.]])
The reason is that basic indexing in NumPy (with an integer or a slice) returns a view of the original array. Changing an element in the view also changes the corresponding element in the original array, which confuses many NumPy users when they first learn it. In MegEngine, a Tensor has no view attribute: the element or sub-Tensor obtained by indexing or slicing occupies a different memory area from the original Tensor.
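The view-versus-copy distinction can be checked on the NumPy side with ``np.shares_memory``. This is a minimal sketch that uses NumPy only; the explicit ``.copy()`` stands in for the copy behavior the text attributes to MegEngine and is not MegEngine code.

```python
import numpy as np

x = np.array([[1., 2.], [3., 4.]])

# Basic indexing with an integer returns a view that shares x's buffer.
row_view = x[0]
# An explicit copy owns separate memory, like the MegEngine result described above.
row_copy = x[0].copy()

print(np.shares_memory(x, row_view))  # True
print(np.shares_memory(x, row_copy))  # False

row_copy[1] = 9.0   # does not touch x
row_view[1] = 6.0   # writes through to x[0, 1]
print(x)
```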
In other respects the two designs are the same, as introduced next.
Slice index does not reduce dimensionality¶
Neither MegEngine nor NumPy changes the number of dimensions of an object when slicing:
>>> M = Tensor([[1, 2, 3],
... [4, 5, 6],
... [7, 8, 9]])
>>> M[1:2][0:1]
Tensor([[4 5 6]], dtype=int32, device=cpux:0)
>>> M[1:2][0:1].ndim
2
Throughout the whole process, every slice obtained is a Tensor with ``ndim=2``.
The result of executing ``M[1:2]`` is ``[[4, 5, 6]]`` instead of ``[4, 5, 6]``.
Slicing ``[[4, 5, 6]]`` with ``[0:1]`` still gives ``[[4, 5, 6]]``.
The wrong understanding may be like this:
The result of executing ``M[1:2]`` is ``[4, 5, 6]``. Wrong! Slicing does not reduce dimensionality!
Slicing ``[4, 5, 6]`` with ``[0:1]`` gives ``4``. This would reduce dimensionality, so it is wrong too!
Note
A slice takes a part out of the whole, so it never reduces the number of dimensions.
If you want to remove redundant dimensions after slicing, you can use ``squeeze``.
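As a rough illustration of the note above, the sketch below uses NumPy's ``np.squeeze`` as a stand-in, since the text states that slicing behaves the same way in both libraries. The call signature of MegEngine's own ``squeeze`` is not shown in this section, so it is not used here.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

row_2d = M[1:2]                      # slicing keeps both axes
row_1d = np.squeeze(row_2d, axis=0)  # drop the length-1 axis explicitly

print(row_2d.shape, row_2d.ndim)  # (1, 3) 2
print(row_1d.shape, row_1d.ndim)  # (3,) 1
```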
Can use array index¶
In addition to slice indexing, we can also use an integer array as the index to get the elements at particular positions. For example, in one dimension:
MegEngine
>>> x = Tensor([1., 2., 3.])
>>> y = x[[0, 2]]
>>> y
Tensor([1. 3.], device=xpux:0)
NumPy
>>> x = array([1., 2., 3.])
>>> y = x[[0, 2]]
>>> y
array([1., 3.])
The length of the index array corresponds to the number of elements to be indexed. In some cases this mechanism is very helpful.
In this case NumPy does not generate a view of the original array either, which is consistent with the behavior of MegEngine.
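The two statements above can be verified on the NumPy side. This sketch is NumPy only; the index list ``[0, 2, 2]`` is an illustrative choice that shows one output element per index entry, including repeats.

```python
import numpy as np

x = np.array([1., 2., 3.])

# One output element per entry of the index list; repeats are allowed.
picked = x[[0, 2, 2]]
print(picked)                       # [1. 3. 3.]

# Integer array indexing returns a copy, not a view.
print(np.shares_memory(x, picked))  # False

picked[0] = 100.0                   # x is left unchanged
print(x)                            # [1. 2. 3.]
```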
Warning
Pay attention to the syntax details. Some users mistakenly write the integer array index in the following form:
>>> x = Tensor([1., 2., 3.])
>>> y = x[0, 1, 2]
IndexError: too many indices for tensor: tensor is 1-dimensional, but 3 were indexed
In fact, this is the syntax for indexing n dimensions of a Tensor separately, which leads to the next section.
Index in multiple dimensions¶
Take the following Tensor, represented by the matrix (2-dimensional array) \(M\), as an example:
\[M = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\]
Although we can use ``M[1][2]`` to get the value 6, this is inefficient (refer to Access an element in Tensor).
Note
Python’s built-in sequence types are one-dimensional and therefore support indexing along a single dimension only. A Tensor is multidimensional, so it can be indexed directly in multiple dimensions (or sliced in multiple dimensions, as shown later), using ``,`` to separate the dimensions. In the above example, you can use ``M[1, 2]`` to access the element instead of the chained square brackets ``M[1][2]``.
Interested users can look at the details behind this: to handle this form of the ``[]`` operator in Python, an object’s special methods ``__getitem__`` and ``__setitem__`` must accept the incoming index as a tuple. In other words, to get the value of ``M[i, j]``, Python actually calls ``M.__getitem__((i, j))``.
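The tuple-passing behavior described in the note can be observed with plain Python. The class ``IndexProbe`` below is a hypothetical helper written only for this illustration; it is not part of MegEngine or NumPy and simply returns whatever index object Python hands to ``__getitem__``.

```python
class IndexProbe:
    """Hypothetical helper: return the index object exactly as received."""

    def __getitem__(self, index):
        return index


probe = IndexProbe()

print(probe[1])           # 1
print(probe[1, 2])        # (1, 2)
print(probe[1:3, 0:2])    # (slice(1, 3, None), slice(0, 2, None))

# The bracket form and the explicit method call are the same thing.
print(probe[1, 2] == probe.__getitem__((1, 2)))  # True
```

A single index arrives as-is, while comma-separated indices arrive as one tuple, which is why a Tensor can tell ``M[1, 2]`` apart from ``M[1][2]``.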
>>> M = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> M[1,2]
Tensor(6, dtype=int32, device=xpux:0)
It can be understood that the element is directly accessed at the position where the index value of the 0th axis is 1, and the index value of the 1st axis is 2.
Generalizing, when accessing a specific element of an n-dimensional Tensor (call it \(T\)), the following syntax can be used:
\[T[i_1, i_2, \ldots, i_n]\]
That is, we provide the n indices \(i_1, i_2, \ldots, i_n\), one for each dimension. There is then no need to index step by step while reducing the dimensionality; the corresponding element is obtained directly.
If the number of indices provided is less than n, you need to know the Default condition of multidimensional index.
Slice in multiple dimensions¶
Note
When indexing along a certain dimension, besides picking a specific element you can also slice to obtain a specific range of elements.
Since we can index in multiple dimensions, we can naturally slice in multiple dimensions as well.
The problem is that users tend to forget that Slice index does not reduce dimensionality, especially when chaining multiple ``[]``.
Suppose we need to cut out the blue part of \(M\), that is, the elements 4, 5, 7 and 8.
Some people will write it as ``M[1:3][0:2]`` and get an unexpected result:
>>> M[1:3][0:2]
Tensor([[4 5 6]
[7 8 9]], dtype=int32, device=xpux:0)
This is because the ``[]`` operations are evaluated in order. The logic behind it is:
>>> T = M[1:3]
>>> T
Tensor([[4 5 6]
[7 8 9]], dtype=int32, device=xpux:0)
>>> T[0:2]
Tensor([[4 5 6]
[7 8 9]], dtype=int32, device=xpux:0)
Warning
Since slicing does not reduce the dimensionality, the code above slices along axis=0 every time.
See also
If you are not clear about the concept of axis, you can refer to Tensor axis.
The correct approach, as in Index in multiple dimensions, is to use ``,`` to separate the dimensions:
>>> M[1:3, 0:2]
Tensor([[4 5]
[7 8]], dtype=int32, device=xpux:0)
It can be understood as applying the ``1:3`` slice on the 0th axis and the ``0:2`` slice on the 1st axis, and taking their intersection.
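The difference between chained brackets and a single comma-separated index can be reproduced with NumPy, which the text describes as behaving the same way for slices. This is a sketch on a NumPy array rather than a MegEngine Tensor.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Chained brackets: both slices act on axis 0, one after the other.
chained = M[1:3][0:2]
print(chained.tolist())   # [[4, 5, 6], [7, 8, 9]]

# One bracket with a comma: 1:3 acts on axis 0 and 0:2 on axis 1.
multi = M[1:3, 0:2]
print(multi.tolist())     # [[4, 5], [7, 8]]

# To reach axis 1 in a second bracket, axis 0 must be passed over explicitly.
print(M[1:3][:, 0:2].tolist())  # [[4, 5], [7, 8]]
```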
Generalizing, when slicing an n-dimensional Tensor (call it \(T\)), the following syntax is required:
\[T[s_1, s_2, \ldots, s_n]\]
That is, we provide a total of n slices \(s_1, s_2, \ldots, s_n\), each for a specific dimension.
If the number of slices provided is less than n, you need to know the Default condition of multidimensional index.
Note
In multi-dimensional slicing, the ``obj`` in ``x[obj]`` is composed of the slices for the different dimensions.
See also
For a Tensor whose ``ndim`` is especially large (say, more than 1000 dimensions), we sometimes only want to index along a certain axis or perform a specific operation on it. In that case we can use ``functional.gather`` or ``functional.scatter``.
These two functions correspond to ``numpy.take_along_axis`` and ``numpy.put_along_axis``.
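To give a feel for the NumPy counterparts named above, here is a small sketch of ``np.take_along_axis`` and ``np.put_along_axis``. The array ``scores`` and the use of ``argmax`` are illustrative assumptions. The MegEngine ``gather`` and ``scatter`` signatures are not given in this section, so they are not shown.

```python
import numpy as np

scores = np.array([[10, 30, 20],
                   [60, 40, 50]])

# Position of the largest value in each row, kept 2-D so it lines up with axis 1.
idx = np.argmax(scores, axis=1)[:, None]         # [[1], [0]]

# Pick one element per row along axis 1.
best = np.take_along_axis(scores, idx, axis=1)   # [[30], [60]]

# Write those values back to the same positions of a zero array (in place).
out = np.zeros_like(scores)
np.put_along_axis(out, idx, best, axis=1)

print(best.tolist())  # [[30], [60]]
print(out.tolist())   # [[0, 30, 0], [60, 0, 0]]
```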
Use ellipsis when multidimensional slicing¶
When slicing a Tensor in multiple dimensions, an ellipsis (Ellipsis) may be used to stand for the dimensions that are not sliced. It is written as three ASCII periods ``...`` rather than the single-character horizontal ellipsis ``…`` (U+2026). The Python parser treats ``...`` as a single token: just as the ``start:end:step`` notation represents a slice object, the ellipsis is actually the `Ellipsis <https://docs.python.org/dev/library/constants.html#Ellipsis>`_ object, which inserts as many complete slices ``:`` as needed at its position to extend the slicing syntax to all dimensions.
For example, if ``T`` is a 4-dimensional Tensor, then:
``T[i, ...]`` is an abbreviation of ``T[i, :, :, :]``;
``T[..., i]`` is an abbreviation of ``T[:, :, :, i]``;
``T[i, ..., j]`` is an abbreviation of ``T[i, :, :, j]``.
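The expansion rule in the list above can be sketched in plain Python. The function ``expand_ellipsis`` is a hypothetical helper written for this illustration; it only mimics the rule as described here and is not how MegEngine or NumPy implement it.

```python
def expand_ellipsis(index, ndim):
    """Hypothetical helper: replace one ``...`` with enough full slices."""
    if not isinstance(index, tuple):
        index = (index,)
    positions = [k for k, item in enumerate(index) if item is Ellipsis]
    if not positions:
        # No ellipsis: the missing trailing dimensions default to ``:``.
        return index + (slice(None),) * (ndim - len(index))
    pos = positions[0]
    missing = ndim - (len(index) - 1)   # dimensions the ellipsis stands for
    return index[:pos] + (slice(None),) * missing + index[pos + 1:]


full = slice(None)  # the object that a bare ``:`` produces

print(... is Ellipsis)                                   # True
print(expand_ellipsis((1, ...), 4) == (1, full, full, full))     # True
print(expand_ellipsis((..., 1), 4) == (full, full, full, 1))     # True
print(expand_ellipsis((1, ..., 2), 4) == (1, full, full, 2))     # True
```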
Default condition of multidimensional index¶
If fewer indices are given than the actual number of dimensions ``ndim`` when indexing a multi-dimensional Tensor, you will get a sub-Tensor:
>>> M[2]
Tensor([7 8 9], dtype=int32, device=xpux:0)
>>> M[2,:]
Tensor([7 8 9], dtype=int32, device=xpux:0)
>>> M[:,2]
Tensor([3 6 9], dtype=int32, device=xpux:0)
In this case the elements of the other dimensions are retained completely, which is equivalent to using ``:`` as the default index for each omitted dimension.
The number of dimensions of the resulting sub-Tensor is reduced according to the number of explicit indices given (slices such as ``:`` do not count).
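The default rule can be checked with NumPy, used here as a stand-in on the assumption that it matches the MegEngine outputs shown above for these cases. Each integer index removes one dimension, while an omitted dimension behaves like ``:``.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# An omitted trailing dimension behaves like an explicit ``:``.
print(np.array_equal(M[2], M[2, :]))  # True

print(M[2].ndim)            # 1  (one integer index: one dimension fewer)
print(M[:, 2].tolist())     # [3, 6, 9]
print(np.ndim(M[2, 2]))     # 0  (two integer indices: a single element)
print(M[0:3].ndim)          # 2  (a slice removes nothing)
```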
Advanced indexing method¶
See also
Refer to NumPy Advanced Indexing.