NUMPY tutorial

Several numpy tutorials can be found here

Numpy¶

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

MATLAB users may want to consult the "NumPy for MATLAB users" page of the NumPy documentation.

Basics¶

Numpy’s main object is the multidimensional numpy array:
=> it is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers
=> array dimensions are called axes
=> the number of dimensions is called the array rank
=> numpy's array class is called ndarray

Import the numpy package as follows:

In [2]:
import numpy as np

Arrays¶

Creation¶

from list/tuple¶

Create arrays from a sequence of values (Python list or tuple).

In [11]:
# create from a sequence of values (Python list or tuple):
a = np.array([0, 1, 2]) #>> create from list
a = np.array((0, 1, 2)) #>> create from tuple

print(type(a))
a
<class 'numpy.ndarray'>
Out[11]:
array([0, 1, 2])
In [26]:
# create multi-dimensional arrays from nested lists
nested_list = [[1,2,3],[4,5,6]]
a = np.array(nested_list)

a
Out[26]:
array([[1, 2, 3],
       [4, 5, 6]])
In [19]:
# specify data type upon creation
a = np.array([1, 2, 3], dtype='uint8')
print(a.dtype)

a = np.array([1, 2, 3], dtype='float32')
print(a.dtype)

a = np.array(['1', '2', '3'])
print(a.dtype)
uint8
float32
<U1

arange()¶

Create a 1D array of values, specifying the start, stop, step. The arange function is analogous to the Python built-in range, but returns an array.

In [24]:
a = np.arange(0, 10)
a
Out[24]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [33]:
a = np.arange(6)                    # 1d array
print(a)
[0 1 2 3 4 5]
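
The examples above rely on the default start (0) and step (1). As a small sketch, the step can also be passed explicitly; the variable names below are illustrative only.

```python
import numpy as np

# start=2, stop=10 (excluded), step=2
evens = np.arange(2, 10, 2)
print(evens)        # [2 4 6 8]

# a negative step counts down; the stop value is still excluded
countdown = np.arange(5, 0, -1)
print(countdown)    # [5 4 3 2 1]
```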

Arange can be combined with the reshape() function to create multidimensional arrays:

In [34]:
b = np.arange(12).reshape(4, 3)     # 2d array (12 elements, arranged as 4 rows x 3 columns)
print(b)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
In [66]:
# WARNING: in the following example, the 3D array is filled sequentially, cycling through the channels (last axis) first
# => it does NOT fill one channel completely before going on to the next
# => in this case the first channel has values 0, 4, 8, 12, 16, 20 (since there are 4 channels in the array)

c = np.arange(24).reshape(2, 3, 4)  # 3d array (24 elements, arranged as 4 matrices of 2 rows x 3 columns)
print(c)

print(c[:,:,0])
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
[[ 0  4  8]
 [12 16 20]]
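
The fill order described in the comments above is NumPy's default row-major (C) order, in which the last axis varies fastest. A minimal sketch to check this, rebuilding the same array c:

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

# Consecutive values run along the LAST axis (the "channels" here)
print(c[0, 0, :])    # [0 1 2 3]

# Along the first axis, neighbours are 3 * 4 = 12 values apart
print(c[:, 0, 0])    # [ 0 12]

# A single "channel" therefore picks every 4th value
first_channel = c[:, :, 0]
print(first_channel.ravel())   # [ 0  4  8 12 16 20]
```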

zeros()¶

Create an array of zeros, specifying the array shape as a tuple.

In [21]:
a = np.zeros((3, 4))
a
Out[21]:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

ones()¶

Create an array of ones, specifying the array shape as a tuple.

In [22]:
a = np.ones((3, 4))
a
Out[22]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

full()¶

Create an array filled with a single constant value, specifying the array shape as a tuple and the fill value.

In [67]:
a = np.full((2,2), 7)
print(a)
[[7 7]
 [7 7]]

eye()¶

Create a 2D identity matrix (ones on the diagonal, zeros elsewhere), specifying its size as an integer.

In [69]:
a = np.eye(2)
print(a)
[[1. 0.]
 [0. 1.]]

random()¶

Create an array of random floats in the half-open interval [0, 1), specifying the array shape as a tuple.

In [73]:
a = np.random.random((2,2)) # Create an array filled with random values
print(a)
[[0.43048459 0.06487667]
 [0.87162144 0.11032227]]
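
The values above change on every run. If reproducible values are needed, one option (not used in the original notebook) is NumPy's Generator interface with a fixed seed. The seed value 42 below is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded generator (seed chosen arbitrarily)
a = rng.random((2, 2))            # floats drawn from the half-open interval [0, 1)
print(a.shape)                    # (2, 2)

# Re-creating the generator with the same seed reproduces the same values
b = np.random.default_rng(42).random((2, 2))
print(np.array_equal(a, b))       # True
```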

empty()¶

Create an array without initializing its entries: the initial content is arbitrary and depends on the state of the memory.

In [108]:
a = np.empty((2,2))
a
Out[108]:
array([[0.51020597, 0.58265003],
       [0.49145982, 0.48806746]])

empty_like()¶

Create an uninitialized array with the same shape and dtype as a.

In [107]:
a = np.ones((2,2))

b = np.empty_like(a)   # Create an empty matrix with the same shape as a
b
Out[107]:
array([[0.51020597, 0.58265003],
       [0.49145982, 0.48806746]])

Attributes¶

The most important attributes are shape and dtype. (See here for a more complete list).

In [44]:
# create 2D array
a = np.array([[1, 2, 3], [4, 5, 6]])
a
Out[44]:
array([[1, 2, 3],
       [4, 5, 6]])

shape¶

Returns the dimensions of the array as a tuple of integers:
(nb_rows, nb_columns) for 2D arrays, (nb_rows, nb_columns, nb_channels) for 3D arrays

In [31]:
a.shape
Out[31]:
(2, 3)

ndim¶

Returns the number of dimensions of the array.

In [46]:
a.ndim
Out[46]:
2
In [47]:
len(a.shape) # >> equivalent to asking for the length of the array shape
Out[47]:
2

dtype¶

Returns the type of the elements in the array.

In [32]:
a.dtype
Out[32]:
dtype('int64')

Indexing / slicing¶

Numpy offers indexing and slicing, similar to Python lists:

  • access elements using square brackets
  • Python is zero-based, meaning the first element is accessed with the index 0

However because arrays may be multidimensional, you must specify a slice for each dimension of the array:

  • array[rows, cols, channels]
In [ ]:
# --- REMINDER: indexing/slicing Python lists
l = [1, 2, 3, 4, 5] # create list

# - access single element
l[0]                # access first element
l[-1]               # access last element
l[-2]               # access second to last element

# - slice (access multiple elements)
l[1:3]              # access 2nd & 3rd elements (the stop index is excluded)
l[1:-2]             # access from 2nd element up to (not including) the 2nd to last element
l[:3]               # access the first 3 elements (from start up to, not including, the 4th element)
l[3:]               # access all elements from 4th element until end
l[::2]              # access every 2nd element (step of 2)

# - assign element
l[0] = 0            # replace element
l[1:2] = [-1, -2]   # assign a sublist to a slice
In [42]:
# --- 1D numpy array
# => index/slice array just like a list

a = np.arange(6)    # create 1D array
print(a)
print(a[0])         # access first element
print(a[1:3])       # access 2nd & 3rd elements
[0 1 2 3 4 5]
0
[1 2]
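
One difference from lists is worth a sketch: a NumPy array has a fixed size, so assigning to a slice overwrites elements in place rather than growing or shrinking the array, as `l[1:2] = [-1, -2]` does for a list. The values used below are arbitrary.

```python
import numpy as np

a = np.arange(6)

a[0] = 10            # replace a single element
a[1:3] = [-1, -2]    # replace the 2nd and 3rd elements (lengths must match)
a[3:] = 0            # a scalar is repeated over the whole slice
print(a)             # [10 -1 -2  0  0  0]

# Unlike a list, the array cannot change length through slice assignment
try:
    a[1:2] = [7, 8, 9]
except ValueError as err:
    print("ValueError:", err)
```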
In [54]:
# --- 2D numpy array
# => specify a slice for each dimension of the array

b = np.arange(12).reshape(4, 3)     # create 2D array (12 elements, arranged as 4 rows x 3 columns)
print(b)

b[:, 0]       # access all row elements from the first column
b[-1, -2:]    # access the last 2 elements from the last row
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
Out[54]:
array([10, 11])
In [76]:
# --- 3D numpy array

# WARNING: in the following example, the 3D array is filled sequentially, cycling through the channels (last axis) first
# => it does NOT fill one channel completely before going on to the next
# => in this case the first channel has values 0, 4, 8, 12, 16, 20 (since there are 4 channels in the array)

c = np.arange(24).reshape(2, 3, 4)  # create 3d array (24 elements, arranged as 4 matrices of 2 rows x 3 columns)
print(c)

c[:, :, 0]       # access all rows and columns from the first channel
c[..., 0]        # (equivalent to above command)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
Out[76]:
array([[ 0,  4,  8],
       [12, 16, 20]])

Boolean indexing

Boolean array indexing lets you pick out arbitrary elements of an array. This type of indexing is frequently used to select the elements of an array that satisfy some condition. Here is an example:

In [ ]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)
[[False False]
 [ True  True]
 [ True  True]]
In [ ]:
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])

# We can do all of the above in a single concise statement:
print(a[a > 2])
[3 4 5 6]
[3 4 5 6]
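
A Boolean mask can also appear on the left-hand side of an assignment. The following sketch (the threshold and replacement value are arbitrary) modifies only the selected elements:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

clipped = a.copy()          # work on a copy to keep a unchanged
clipped[clipped > 2] = 0    # set every element greater than 2 to 0
print(clipped)
# [[1 2]
#  [0 0]
#  [0 0]]

# Masks can be combined with & (and), | (or), ~ (not)
print(a[(a > 2) & (a % 2 == 0)])   # [4 6]
```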

Shape manipulation¶

ravel()¶

Returns the array, flattened.

In [80]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

a.ravel()
[[1 2 3]
 [4 5 6]]
Out[80]:
array([1, 2, 3, 4, 5, 6])

transpose¶

In [81]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

a.T
[[1 2 3]
 [4 5 6]]
Out[81]:
array([[1, 4],
       [2, 5],
       [3, 6]])

stack¶

hstack()¶

Stack arrays horizontally.

In [88]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.hstack((a, b))
c
Out[88]:
array([1, 2, 3, 4, 5, 6])

vstack()¶

Stack arrays vertically.

In [89]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.vstack((a, b))
c
Out[89]:
array([[1, 2, 3],
       [4, 5, 6]])

dstack()¶

Stack arrays depth-wise.
(Useful to create RGB arrays from distinct R, G, B channels.)

In [93]:
R = np.zeros((3, 3))     # create 1st channel with 0
G = np.ones((3, 3))      # create 2nd channel with 1
B = np.full((3, 3), 10)  # create 3rd channel with 10
In [97]:
RGB = np.dstack((R, G, B))
RGB
Out[97]:
array([[[ 0.,  1., 10.],
        [ 0.,  1., 10.],
        [ 0.,  1., 10.]],

       [[ 0.,  1., 10.],
        [ 0.,  1., 10.],
        [ 0.,  1., 10.]],

       [[ 0.,  1., 10.],
        [ 0.,  1., 10.],
        [ 0.,  1., 10.]]])
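
As a quick check of the depth-wise layout, the sketch below rebuilds the same R, G, B arrays and confirms that each channel sits on the last axis of the stacked array:

```python
import numpy as np

R = np.zeros((3, 3))
G = np.ones((3, 3))
B = np.full((3, 3), 10)

RGB = np.dstack((R, G, B))
print(RGB.shape)                        # (3, 3, 3) => rows, columns, channels

# Each channel can be recovered by indexing the last axis
print(np.array_equal(RGB[..., 0], R))   # True
print(np.array_equal(RGB[..., 2], B))   # True

# A single pixel holds one value per channel
print(RGB[0, 0])                        # [ 0.  1. 10.]
```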

reshape()¶

In [82]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

a.reshape(3,2)
[[1 2 3]
 [4 5 6]]
Out[82]:
array([[1, 2],
       [3, 4],
       [5, 6]])
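
Two details of reshape() may be worth a sketch: one dimension can be given as -1 and is then inferred by numpy, and the call returns a new array object rather than changing the shape of a.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

b = a.reshape(3, -1)    # -1 lets numpy infer the missing dimension (here 2)
print(b.shape)          # (3, 2)
print(a.shape)          # (2, 3) => a itself keeps its shape

# The total number of elements must stay the same
try:
    a.reshape(4, 2)
except ValueError as err:
    print("ValueError:", err)
```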

resize()¶

In [87]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

a.resize((3, 2))
a
[[1 2 3]
 [4 5 6]]
Out[87]:
array([[1, 2],
       [3, 4],
       [5, 6]])

Datatypes¶

Every numpy array is a grid of elements of the same type.
You can explicitly specify the datatype when creating an array.

See numpy documentation about all numpy datatypes here.

In [99]:
x = np.array([1, 2])                  # dtype not specified => guessed by numpy => int
y = np.array([1.0, 2.0])              # dtype not specified => guessed by numpy => float
z = np.array([1, 2], dtype=np.uint8)  # dtype specified

print(x.dtype, y.dtype, z.dtype)
int64 float64 uint8
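
An existing array can also be converted to another datatype. A minimal sketch using the astype method, which returns a converted copy (the values are arbitrary):

```python
import numpy as np

x = np.array([1.7, 2.2, 3.9])    # float64
xi = x.astype(np.int64)          # conversion to integer truncates toward zero
print(xi, xi.dtype)              # [1 2 3] int64

# Mixing dtypes in an operation promotes the result to the wider type
y = np.array([1, 2], dtype=np.uint8) + np.array([0.5, 0.5])
print(y.dtype)                   # float64
```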

Array math¶

Arithmetic operators on arrays apply `elementwise`. A new array is created and filled with the result.
Basic mathematical functions are also available as functions in the numpy module (e.g. np.add()). See the full list of mathematical functions provided by numpy in the documentation.

Note: unlike MATLAB, * is elementwise multiplication, not matrix multiplication. Numpy instead uses the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices.
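
To make the distinction concrete, here is a side-by-side sketch of the two products on the same pair of small matrices:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

elementwise = x * y        # multiplies matching entries
print(elementwise)
# [[ 5 12]
#  [21 32]]

matrix = np.dot(x, y)      # rows of x times columns of y
print(matrix)
# [[19 22]
#  [43 50]]
```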

element-wise operations¶

In [102]:
# --- create arrays
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
In [101]:
# --- elementwise sum
print(x + y)
print(np.add(x, y)) #>> equivalent result
[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
In [ ]:
# --- elementwise difference
print(x - y)
print(np.subtract(x, y)) #>> equivalent result
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
In [ ]:
# --- elementwise product
print(x * y)
print(np.multiply(x, y)) #>> equivalent result
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
In [103]:
# --- elementwise division
print(x / y)
print(np.divide(x, y)) #>> equivalent result
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
In [104]:
# --- elementwise square root
print(np.sqrt(x))
[[1.         1.41421356]
 [1.73205081 2.        ]]

dot product¶

In [100]:
# --- dot product 
# NB: unlike MATLAB, `*` is elementwise multiplication, not matrix multiplication. Numpy instead uses the `dot` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices.

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))
219
219

You can also use the @ operator, which gives the same result as numpy's dot function for 1D and 2D arrays.

In [ ]:
print(v @ w)
219
In [ ]:
# Matrix / vector product; all three produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)
[29 67]
[29 67]
[29 67]
In [ ]:
# Matrix / matrix product; all three produce the same rank 2 array
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]

broadcasting¶

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

Example: add a constant vector to each row of a matrix

In [ ]:
# => add vector v to each row of the matrix x, storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

print(y)
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]

This works; however when the matrix x is very large, computing an explicit loop in Python could be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically, then performing elementwise summation of x and vv. We could implement this approach like this:

In [ ]:
vv = np.tile(v, (4, 1))  # Stack 4 copies of v on top of each other
print(vv)                # Prints "[[1 0 1]
                         #          [1 0 1]
                         #          [1 0 1]
                         #          [1 0 1]]"
[[1 0 1]
 [1 0 1]
 [1 0 1]
 [1 0 1]]
In [ ]:
y = x + vv  # Add x and vv elementwise
print(y)
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]
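
With broadcasting, neither the loop nor the tiled copy vv is needed: v can be added to x directly. A sketch using the same x and v as above:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])

# x has shape (4, 3) and v has shape (3,): v is treated as if it were
# repeated along the first axis, without actually being copied
y = x + v
print(y)
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]

# Same result as the explicit np.tile version
print(np.array_equal(y, x + np.tile(v, (4, 1))))   # True
```

The shapes are compatible here because the trailing dimension of x (3) matches the length of v.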

Copies and Views¶

When operating on and manipulating arrays, their data is sometimes copied into a new array and sometimes not.

There are three cases.

no copy¶

Simple assignments make no copy of objects or their data.

In [132]:
a = np.ones((5, 5), dtype='uint8')

b = a            # no new object is created
b is a           # a and b are two names for the same ndarray object
Out[132]:
True

shallow copy: view()¶

Different array objects can share the same data. The view method creates a new array object that looks at the same data.

In [136]:
a = np.ones((5, 5), dtype='uint8')
a_shallow = a.view()          # make shallow copy: ``a_shallow`` is a 'view' of the data owned by a

print(a_shallow is a)
print(a_shallow.base is a )
False
True
In [137]:
a_shallow[0, 0] = 100         # a's data changes
a
Out[137]:
array([[100,   1,   1,   1,   1],
       [  1,   1,   1,   1,   1],
       [  1,   1,   1,   1,   1],
       [  1,   1,   1,   1,   1],
       [  1,   1,   1,   1,   1]], dtype=uint8)
In [133]:
a_shallow = a_shallow.reshape((1, 25))  # a's shape doesn't change
a.shape
Out[133]:
(5, 5)
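
Views are not only created by view(): basic slicing also returns a view of the original data. A short sketch (np.shares_memory is used here only to check whether two arrays overlap in memory; the variable names are illustrative):

```python
import numpy as np

a = np.ones((5, 5), dtype='uint8')

row = a[0, :]                        # basic slicing returns a view, not a copy
print(np.shares_memory(row, a))      # True

row[:] = 0                           # writing through the view changes a
print(a[0])                          # [0 0 0 0 0]

picked = a[[0, 1]]                   # indexing with a list of indices returns a copy
print(np.shares_memory(picked, a))   # False
```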

deep copy: copy()¶

The copy method makes a complete copy of the array and its data.

In [138]:
a = np.ones((5, 5), dtype='uint8')
a_deep = a.copy()        # a new array ``a_deep`` with new data is created

print(a_deep is a)
print(a_deep.base is a)  # a_deep doesn't share anything with a

a_deep[0, 0] = 99        # changing the deep copy does not change the original array
a
False
False
Out[138]:
array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]], dtype=uint8)
In [ ]:
del a  # the memory of ``a`` can be released

Printing¶

When you print an array, NumPy displays it in a similar way to nested lists.
(In the case of large arrays, it prints the first and last elements only.)

In [33]:
a = np.arange(6)                    # 1d array
print(a)
[0 1 2 3 4 5]
In [34]:
b = np.arange(12).reshape(4, 3)     # 2d array (12 elements, arranged as 4 rows, 3 columns)
print(b)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
In [36]:
c = np.arange(24).reshape(2, 3, 4)  # 3d array (24 elements, arranged as 4 matrices of 2 rows and 3 columns)
print(c)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
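
The summarized form mentioned above for large arrays can be seen with a sketch like the following; the array size of 10000 is an arbitrary choice that is large enough to trigger it with the default print options.

```python
import numpy as np

big = np.arange(10000)
text = str(big)       # the same text that print(big) would display
print(text)           # only the first and last elements; the middle is replaced by "..."

# The cut-off is controlled by the 'threshold' print option
print(np.get_printoptions()['threshold'])
```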