Boolean arrays¶
# Import the array library
import numpy as np
We are building up to an answer to the three girls problem. At the end of the lists page, we found a way to simulate one family, and get a count.
Now we know about function arguments, and arrays, we can simplify the list version, like this:
# Make an array of four random integers that are either 0 or 1.
# 1 means a girl.
family = np.random.randint(0, 2, size=4)
family
array([1, 0, 0, 0])
# Add up the integers to count the number of girls.
count = np.sum(family)
count
1
Our interest in how whether of the count
value is equal to 3. We can look at
that number and write down “Yes” if the number is equal to 3 and “No”
otherwise, but we would like the computer to do that routine
work for us. We use comparison:
is_three = count == 3
is_three
False
True means our simulation found a family with three girls, and False means we found a family some other number of girls.
In a while, we are going to simulate a very large number of these families, but for now, let us simulate 5 families, in a somewhat laborious way:
# Make an array to store the counts for each family.
counts = np.zeros(5)
# Make five families, store the counts.
family = np.random.randint(0, 2, size=4)
counts[0] = np.sum(family)
# Second family
family = np.random.randint(0, 2, size=4)
counts[1] = np.sum(family)
# Third
family = np.random.randint(0, 2, size=4)
counts[2] = np.sum(family)
# Fourth
family = np.random.randint(0, 2, size=4)
counts[3] = np.sum(family)
# Fifth.
family = np.random.randint(0, 2, size=4)
counts[4] = np.sum(family)
# Show the counts
counts
array([3., 3., 1., 1., 2.])
Each value in counts
is the number of girls in one simulated family.
Now we have 5 numbers for which we want to ask the question - is this number equal to 3? We would like five corresponding True or False values.
Here is where arrays continue to work their magic - we can get this result with a single expression:
are_three = counts == 3
are_three
array([ True, True, False, False, False])
are_three
is an array with 5 elements, one for every element in the array we
compared, counts
.
are_three
is a Boolean array because it contains only Boolean (True, False) values.
We can see what kind of data the array contains by looking at the dtype
attribute of the array. Remember, an attribute is a value attached to another value. In this case it is a value attached to the are_three
value.
are_three.dtype
dtype('bool')
Each element in are_three
has the result of the comparison for the
corresponding element. The code above is equivalent to doing:
# Make an array of Boolean type (the "dtype" argument)
are_three_longhand = np.zeros(5, dtype=bool)
# Do the comparisons one by one.
are_three_longhand[0] = counts[0] == 3
are_three_longhand[1] = counts[1] == 3
are_three_longhand[2] = counts[2] == 3
are_three_longhand[3] = counts[3] == 3
are_three_longhand[4] = counts[4] == 3
# Show the result
are_three_longhand
array([ True, True, False, False, False])
Now we want to know how many of the counts
values are equal to 3. This is
the same as asking how many True values there are in are_three
(or
are_three_longhand
.
We can do this using the np.count_nonzero
function. It accepts an array as its argument, and returns the number of non-zero values in the array. It turns out that np.count_nonzero
treats True as non-zero, and False as zero, so np.count_nonzero
on a Boolean array counts the number of True values:
my_booleans = np.array([True, False, True])
np.count_nonzero(my_booleans)
2
To see the number of times we found 3 in counts
:
np.count_nonzero(are_three)
2