5.3 Iteration with For loops

Iteration

It is often the case in programming – especially when dealing with randomness – that we want to repeat a process multiple times.

Consider the NumPy function np.random.choice. It claims to choose randomly between the elements of an array that we pass to it. Here we make an array of strings containing two choices:

coin = np.array(['Heads', 'Tails'])

We use np.random.choice to choose randomly between these two elements:

np.random.choice(coin)
'Tails'

We might want to check whether np.random.choice does in fact pick either option with about the same probability. To do that, we could start by running the following cell many times to see if we get roughly equal numbers of "Heads" and "Tails".

np.random.choice(coin)
'Tails'

We might want to re-run code with slightly different input or other slightly different behavior. We could copy-paste the code multiple times, but that’s tedious and prone to typos, and if we wanted to do it a thousand times or a million times, forget it.

A more automated solution is to use a for statement to loop over the contents of a sequence. This is called iteration. A for statement begins with the word for, followed by a name we want to give each item in the sequence, followed by the word in, and ending with an expression that evaluates to a sequence. The indented body of the for statement is executed once for each item in that sequence.

for i in np.arange(3):
    print(i)
0
1
2

It is instructive to imagine code that exactly replicates a for statement without the for statement. (This is called unrolling the loop.) A for statement simply replicates the code inside it, but before each iteration, it assigns a new value from the given sequence to the name we chose. For example, here is an unrolled version of the loop above:

i = np.arange(3)[0]
print(i)
i = np.arange(3)[1]
print(i)
i = np.arange(3)[2]
print(i)
0
1
2

Notice that the name i is arbitrary, just like any name we assign with =. For example, the following for loop works in just the same way as the for loop above:

for my_variable in np.arange(3):
    print(my_variable)
0
1
2
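The sequence after the word in does not have to come from np.arange. As a minimal sketch, assuming a made-up list of pet names, the loop below visits each string in turn and uses the loop variable inside the body:

```python
# Hypothetical example data: any sequence can follow the word `in`.
pets = ['Cat', 'Dog', 'Rabbit']

total_letters = 0
for pet in pets:
    # On each pass, `pet` is the next item from the list.
    print(pet, len(pet))
    total_letters = total_letters + len(pet)

print(total_letters)  # 12
```

Here the body does refer to the loop variable, so each iteration does slightly different work.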

In the next example, we use a for statement in a more realistic way: we print the results of five random choices:

for i in np.arange(5):
    print(np.random.choice(coin))
Heads
Tails
Tails
Heads
Heads

In this case, we simply perform exactly the same (random) action several times, so the code inside our for statement does not actually refer to the variable i.

Collecting results with lists

While the for statement above does simulate the results of five tosses of a coin, the results are simply printed and aren’t in a form that we can use for computation. Thus a typical use of a for statement is to build up a list of results, adding one item on each iteration.

We can use the append method of a list to do this. As you saw in the page on lists, append adds a value to the end of the list.

# A list with two strings
pets = ['Cat', 'Dog']
pets.append('Rabbit')
pets
['Cat', 'Dog', 'Rabbit']

Example: Counting the Number of Heads

We can now simulate five tosses of a coin and place the results into a list. We will start by creating an empty list and then appending the result of each toss.

# An empty list
tosses = []

for i in np.arange(5):
    tosses.append(np.random.choice(coin))

tosses
['Heads', 'Heads', 'Tails', 'Tails', 'Tails']

Let us rewrite the cell with the for statement unrolled:

# An empty list
tosses = []

i = np.arange(5)[0]
tosses.append(np.random.choice(coin))
i = np.arange(5)[1]
tosses.append(np.random.choice(coin))
i = np.arange(5)[2]
tosses.append(np.random.choice(coin))
i = np.arange(5)[3]
tosses.append(np.random.choice(coin))
i = np.arange(5)[4]
tosses.append(np.random.choice(coin))

tosses
['Tails', 'Tails', 'Heads', 'Heads', 'Heads']

We have captured the results in a list, but we want to give ourselves the ability to use array methods to do computations. To do this, we convert the list into an array:

toss_arr = np.array(tosses)

Now that we have an array, we can use np.count_nonzero to count the number of heads in the five tosses.

np.count_nonzero(toss_arr == 'Heads')
3
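It can help to look at the two steps of that count separately. The sketch below uses a fixed, made-up array of five tosses (an assumption, so the result does not depend on chance) and a helper name is_heads that does not appear elsewhere on this page:

```python
import numpy as np

# A fixed, made-up set of five tosses, so the count is always the same.
toss_arr = np.array(['Heads', 'Tails', 'Heads', 'Heads', 'Tails'])

# Comparing an array to a string gives one True/False value per element.
is_heads = toss_arr == 'Heads'
print(is_heads)

# True counts as nonzero, so this counts the tosses that were 'Heads'.
n_heads = np.count_nonzero(is_heads)
print(n_heads)  # 3
```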

Iteration is a powerful technique. For example, by running exactly the same code for 1000 tosses instead of 5, we can count the number of heads in 1000 tosses.

tosses = []

for i in np.arange(1000):
    tosses.append(np.random.choice(coin))

toss_arr = np.array(tosses)
np.count_nonzero(toss_arr == 'Heads')
482
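Because the tosses are random, the count changes each time the cell runs. One way to make a run repeatable, sketched here as an optional extra, is to set NumPy's random seed first. The seed value 2024 and the names n_tosses and p_heads are arbitrary choices for this illustration:

```python
import numpy as np

coin = np.array(['Heads', 'Tails'])

# Assumption: we fix the seed only so that re-running gives the same tosses.
np.random.seed(2024)

n_tosses = 1000
tosses = []
for i in np.arange(n_tosses):
    tosses.append(np.random.choice(coin))

toss_arr = np.array(tosses)
n_heads = np.count_nonzero(toss_arr == 'Heads')

# The proportion of heads is easier to compare across different numbers of tosses.
p_heads = n_heads / n_tosses
print(n_heads, p_heads)
```

Changing n_tosses is the only edit needed to simulate a different number of tosses.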

Example: Number of Heads in 100 Tosses

It is natural to expect that in 100 tosses of a coin, there will be 50 heads, give or take a few.

But how many is “a few”? What’s the chance of getting exactly 50 heads? Questions like these matter in data science not only because they are about interesting aspects of randomness, but also because they can be used in analyzing experiments where assignments to treatment and control groups are decided by the toss of a coin.

In this example we will simulate 10,000 repetitions of the following experiment:

  • Toss a coin 100 times and record the number of heads.

The histogram of our results will give us some insight into how many heads are likely.

As a preliminary, note that np.random.choice takes an optional second argument that specifies the number of choices to make. By default, the choices are made with replacement, meaning that every choice is made from the full array, so the chance of getting Heads is the same for each of the choices returned. Here is a simulation of 10 tosses of a coin:

np.random.choice(coin, 10)
array(['Heads', 'Tails', 'Heads', 'Tails', 'Tails', 'Tails', 'Tails',
       'Heads', 'Tails', 'Tails'], dtype='<U5')
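To see what "with replacement" means, it may help to compare it with the alternative. The sketch below uses the replace argument of np.random.choice; the variable names are made up for this illustration:

```python
import numpy as np

coin = np.array(['Heads', 'Tails'])

# With replacement (the default): every choice is made from the full array,
# so the same element can come up again and again.
with_repl = np.random.choice(coin, 10)
print(with_repl)

# Without replacement: an element cannot be chosen again once it has been picked,
# so two draws from a two-element array always give one of each.
without_repl = np.random.choice(coin, 2, replace=False)
print(without_repl)
```

Simulating coin tosses needs the default behavior, because one toss does not use up an outcome for the next.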

Now let’s study 100 tosses. We will start by creating an empty list called head_counts. Then, in each of the 10,000 repetitions, we will toss a coin 100 times, count the number of heads, and append the count to head_counts.

N = 10000

head_counts = []

for i in np.arange(N):
    tosses = np.random.choice(coin, 100)
    n_heads = np.count_nonzero(tosses == 'Heads')
    head_counts.append(n_heads)

head_counts
[58, 50, 47, 45, 56, 58, 49, 42, 52, 50, ...]

Here is a histogram of the data, with bins of width 1 centered at each value of the number of heads.

plt.hist(head_counts, bins=np.arange(30.5, 69.6, 1));

[Figure: histogram of the number of heads in 100 tosses, over 10,000 repetitions]

Not surprisingly, the histogram looks roughly symmetric around 50 heads. The bar at 50 holds about 8% of the 10,000 repetitions. (plt.hist shows counts by default, so that bar's height is about 800.) Since each bin is 1 unit wide and centered on a whole number, this is the same as saying that about 8% of the repetitions produced exactly 50 heads. That’s not a large percentage, but it is the largest compared to the percentage at every other number of heads.
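We can check the 8% figure in two ways. The sketch below repeats the simulation and then compares the simulated proportion of exactly 50 heads with the exact chance, assuming a fair coin and independent tosses. The names counts_arr, simulated and exact are made up for this illustration:

```python
import math
import numpy as np

coin = np.array(['Heads', 'Tails'])

N = 10000
head_counts = []
for i in np.arange(N):
    tosses = np.random.choice(coin, 100)
    head_counts.append(np.count_nonzero(tosses == 'Heads'))

# Proportion of repetitions that gave exactly 50 heads.
counts_arr = np.array(head_counts)
simulated = np.count_nonzero(counts_arr == 50) / N

# Exact chance: the number of ways to place 50 heads among 100 tosses,
# divided by the 2**100 equally likely sequences of tosses.
exact = math.comb(100, 50) / 2 ** 100

print(simulated, exact)
```

The exact value is close to 0.0796, and the simulated proportion should land near it, though it will vary from run to run.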

The histogram also shows that in almost all of the repetitions, the number of heads in 100 tosses was somewhere between 35 and 65. Indeed, the bulk of the repetitions produced numbers of heads in the range 45 to 55.
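These ranges can also be read off the simulated counts directly rather than from the picture. As a sketch, with the names in_45_55 and in_35_65 invented for this illustration, we combine two comparisons with & to find the proportion of repetitions in each range:

```python
import numpy as np

coin = np.array(['Heads', 'Tails'])

N = 10000
head_counts = []
for i in np.arange(N):
    tosses = np.random.choice(coin, 100)
    head_counts.append(np.count_nonzero(tosses == 'Heads'))

counts_arr = np.array(head_counts)

# Each comparison gives a Boolean array; & is True only where both are True.
in_45_55 = np.count_nonzero((counts_arr >= 45) & (counts_arr <= 55)) / N
in_35_65 = np.count_nonzero((counts_arr >= 35) & (counts_arr <= 65)) / N

print(in_45_55, in_35_65)
```

The parentheses around each comparison are needed because & is evaluated before >= and <=.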

While in theory the number of heads can be anywhere between 0 and 100, the simulation shows that the range of probable values is much smaller.

This is an instance of a more general phenomenon about the variability in coin tossing, as we will see later in the course.

Now see the for loop exercises.

This page has content from the Iteration notebook from the UC Berkeley course. See the Berkeley course section of the license.