\(\newcommand{L}[1]{\| #1 \|}\newcommand{VL}[1]{\L{ \vec{#1} }}\newcommand{R}[1]{\operatorname{Re}\,(#1)}\newcommand{I}[1]{\operatorname{Im}\, (#1)}\)

Spread of a distribution with number of samples

Have a look back at Testing Brexit Proportions. We found that our sampling distribution of 1315 simulated voters with p=0.519 of voting Brexit was not compatible with the survey proportion of 0.41.

We’re going to be doing some plots, so we start of by getting the plotting libraries:

Hint

If running in the IPython console, consider running %matplotlib to enable interactive plots. If running in the Jupyter Notebook, use %matplotlib inline.

>>> #: The stuff you need for plotting a histogram
>>> import matplotlib.pyplot as plt

Let’s have a look again at the sampling distribution:

>>> #: The random module
>>> import random
>>> #: function to return a Leave or Remain voter
>>> def leave_or_remain():
...     # Return 1 for Leave, 0 for Remain
...     random_no = random.random()
...     if random_no < 0.519:
...         our_result = 1
...     else:
...         our_result = 0
...     return our_result

Here we get 1315 voters, and calculate a statistic, the proportion of Brexit voters:

>>> #: The statistic value from a single trial
>>> def one_proportion():
...     votes = []
...     for i in range(1315):
...         vote = leave_or_remain()
...         votes.append(vote)
...     brexits = sum(votes)
...     return brexits / len(votes)

Now we want the sampling distribution of this proportion:

>>> n_trials = 10000
>>> proportions = []
>>> for i in range(n_trials):
...     proportion = one_proportion()
...     proportions.append(proportion)

Plot the histogram of the sampling distribution you have just collected:

>>> #- plot the histogram of the sampling distribution
>>> plt.hist(proportions)
(...)

(png, hires.png, pdf)

_images/samples_and_spread-7.png

Now - go back to one_proportion above, and change 1315 to 20. Plot the sampling distribution again, by running the cells above. What do you see? Try 50, 100 and 1000 instead of 20.