Random

Generate numbers and arrays from different random distributions.

Chapter Goals:

  • Learn about random operations in NumPy
  • Write code using the np.random submodule

A. Random integers

Similar to the Python random module, NumPy has its own submodule for pseudo-random number generation called np.random. It provides all the necessary randomized operations and extends it to multi-dimensional arrays. To generate pseudo-random integers, we use the np.random.randint function.

The code below shows example usages of np.random.randint.

Press + to interact
print(np.random.randint(5))
print(np.random.randint(5))
print(np.random.randint(5, high=6))
random_arr = np.random.randint(-3, high=14,
size=(2, 2))
print(repr(random_arr))

The np.random.randint function takes in a single required argument, which actually depends on the high keyword argument. If high=None (which is the default value), then the required argument represents the upper (exclusive) end of the range, with the lower end being 0. Specifically, if the required argument is n, then the random integer is chosen uniformly from the range [0, n).

If high is not None, then the required argument will represent the lower (inclusive) end of the range, while high represents the upper (exclusive) end.

The size keyword argument specifies the size of the output array, where each integer in the array is randomly drawn from the specified range. As a default, np.random.randint returns a single integer.

B. Utility functions

Some fundamental utility functions from the np.random module are np.random.seed and np.random.shuffle. We use the np.random.seed function to set the random seed, which allows us to control the outputs of the pseudo-random functions. The function takes in a single integer as an argument, representing the random seed.

The code below uses np.random.seed with the same random seed. Note how the outputs of the random functions in each subsequent run are identical when we set the same random seed.

Press + to interact
np.random.seed(1)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
size=(2, 2))
print(repr(random_arr))
# New seed
np.random.seed(2)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
size=(2, 2))
print(repr(random_arr))
# Original seed
np.random.seed(1)
print(np.random.randint(10))
random_arr = np.random.randint(3, high=100,
size=(2, 2))
print(repr(random_arr))

The np.random.shuffle function allows us to randomly shuffle an array. Note that the shuffling happens in place (i.e. no return value), and shuffling multi-dimensional arrays only shuffles the first dimension.

The code below shows example usages of np.random.shuffle. Note that only the rows of matrix are shuffled (i.e. shuffling along first dimension only).

Press + to interact
vec = np.array([1, 2, 3, 4, 5])
np.random.shuffle(vec)
print(repr(vec))
np.random.shuffle(vec)
print(repr(vec))
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
np.random.shuffle(matrix)
print(repr(matrix))

C. Distributions

Using np.random we can also draw samples from probability distributions. For example, we can use np.random.uniform to draw pseudo-random real numbers from a uniform distribution.

The code below shows usages of np.random.uniform.

Press + to interact
print(np.random.uniform())
print(np.random.uniform(low=-1.5, high=2.2))
print(repr(np.random.uniform(size=3)))
print(repr(np.random.uniform(low=-3.4, high=5.9,
size=(2, 2))))

The function np.random.uniform actually has no required arguments. The keyword arguments, low and high, represent the inclusive lower end and exclusive upper end from which to draw random samples. Since they have default values of 0.0 and 1.0, respectively, the default outputs of np.random.uniform come from the range [0.0, 1.0).

The size keyword argument is the same as the one for np.random.randint, i.e. it represents the output size of the array.

Another popular distribution we can sample from is the normal (Gaussian) distribution. The function we use is np.random.normal.

The code below shows usages of np.random.normal.

Press + to interact
print(np.random.normal())
print(np.random.normal(loc=1.5, scale=3.5))
print(repr(np.random.normal(loc=-2.4, scale=4.0,
size=(2, 2))))

Like np.random.uniform, np.random.normal has no required arguments. The loc and scale keyword arguments represent the mean and standard deviation, respectively, of the normal distribution we sample from.

NumPy provides quite a few more built-in distributions, which are listed here.

D. Custom sampling

While NumPy provides built-in distributions to sample from, we can also sample from a custom distribution with the np.random.choice function.

The code below shows example usages of np.random.choice.

Press + to interact
colors = ['red', 'blue', 'green']
print(np.random.choice(colors))
print(repr(np.random.choice(colors, size=2)))
print(repr(np.random.choice(colors, size=(2, 2),
p=[0.8, 0.19, 0.01])))

The required argument for np.random.choice is the custom distribution we sample from. The p keyword argument denotes the probabilities given to each element in the input distribution. Note that the list of probabilities for p must sum to 1.

In the example, we set p such that 'red' has a probability of 0.8 of being chosen, 'blue' has a probability of 0.19, and 'green' has a probability of 0.01. When p is not set, the probabilities are equal for each element in the distribution (and sum to 1).

Time to Code!

Note: it is important you do all the instructions in the order listed. We test your code by setting a fixed np.random.seed, so in order for your code to match the reference output, all the functions must be run in the correct order.

We'll start off by obtaining some random integers. The first integer we get will be randomly chosen from the range [0, 5). The remaining integers will be part of a 3x5 NumPy array, each randomly chosen from the range [3, 10).

Set random1 equal to np.random.randint with 5 as the only argument.

Then set random_arr equal to np.random.randint with 3 as the first argument, 10 as the high keyword argument, and (3, 5) as the size keyword argument.

Press + to interact
# CODE HERE

The next two arrays will be drawn randomly from distributions. The first will contain 5 numbers drawn uniformly from the range [-2.5, 1.5].

Set random_uniform equal to np.random.uniform with the low and high keyword arguments set to -2.5 and 1.5, respectively. The size keyword argument should be set to 5.

Press + to interact
# CODE HERE

The second array will contain 50 numbers drawn from a normal distribution with mean 2.0 and standard deviation 3.5.

Set random_norm equal to np.random.normal with the loc and scale keyword arguments set to 2.0 and 3.5, respectively. The size keyword argument should be set to (10, 5).

Press + to interact
# CODE HERE

We'll now create our own distribution of strings and randomly select from it. The values for our distribution will be 'a', 'b', 'c', 'd'.

To choose a value, we'll use a probability distribution of [0.5, 0.1, 0.2, 0.2], i.e. 'a' will have probability 0.5 of being chosen, 'b' will have a probability of 0.1, etc.

Set choices equal to a list of the specified values, in the order given.

Set choice equal to np.random.choice with choices as the first argument and the specified probability distribution as the p keyword argument.

Press + to interact
# CODE HERE

The last random operation we perform will be an in-place shuffle of a NumPy array.

Set arr equal to a NumPy array containing the integers from 1 to 5, inclusive.

Then apply np.random.shuffle to arr.

Press + to interact
# CODE HERE

Get hands-on with 1300+ tech skills courses.