Rajeev Raizada
Rajeev Raizada

Reputation: 43

Q: Expected number of coin tosses to get N heads in a row, in Python. My code gives answers that don't match published correct ones, but unsure why

I'm trying to write Python code to see how many coin tosses, on average, are required to get a sequences of N heads in a row.

The thing that I'm puzzled by is that the answers produced by my code don't match ones that are given online, e.g. here (and many other places) https://math.stackexchange.com/questions/364038/expected-number-of-coin-tosses-to-get-five-consecutive-heads

According to that, the expected number of tosses that I should need to get various numbers of heads in a row are: E(1) = 2, E(2) = 6, E(3) = 14, E(4) = 30, E(5) = 62. But I don't get those answers! For example, I get E(3) = 8, instead of 14. The code below runs to give that answer, but you can change n to test for other target numbers of heads in a row.

What is going wrong? Presumably there is some error in the logic of my code, but I confess that I can't figure out what it is.

You can see, run and make modified copies of my code here: https://trinket.io/python/17154b2cbd

Below is the code itself, outside of that runnable trinket.io page. Any help figuring out what's wrong with it would be greatly appreciated!

Many thanks,

Raj P.S. The closest related question that I could find was this one: Monte-Carlo Simulation of expected tosses for two consecutive heads in python However, as far as I can see, the code in that question does not actually test for two consecutive heads, but instead tests for a sequence that starts with a head and then at some later, possibly non-consecutive, time gets another head.

# Click here to run and/or modify this code:
# https://trinket.io/python/17154b2cbd

import random
# n is  the target number of heads in a row
# Change the value of n, for different target heads-sequences
n = 3  
possible_tosses = [ 'h', 't' ]
num_trials = 1000
target_seq = ['h' for i in range(0,n)]
toss_sequence = []
seq_lengths_rec = []

for trial_num in range(0,num_trials):

    if (trial_num % 100) == 0:
        print 'Trial num', trial_num, 'out of', num_trials
        # (The free version of trinket.io uses Python2)

    target_reached = 0
    toss_num = 0

    while target_reached == 0:

        toss_num += 1
        random.shuffle(possible_tosses)
        this_toss = possible_tosses[0]
        #print([toss_num, this_toss])
        toss_sequence.append(this_toss)
        last_n_tosses = toss_sequence[-n:]
        #print(last_n_tosses)
    if last_n_tosses == target_seq:
        #print('Reached target at toss', toss_num)
        target_reached = 1
        seq_lengths_rec.append(toss_num)

print 'Average', sum(seq_lengths_rec) / len(seq_lengths_rec)

Upvotes: 4

Views: 1603

Answers (2)

Sandipan Dey
Sandipan Dey

Reputation: 23101

We cam eliminate one additional loop by running each experiment long enough (ideally infinite) number of times, e.g., each time toss a coin n=1000 times. Now, it is likely that the sequence of 5 heads will appear in each such trial. If it does appear, we can call the trial as an effective trial, otherwise we can reject the trial.

In the end, we can take an average of number of tosses needed w.r.t. the number of effective trials (by LLN it will approximate the expected number of tosses). Consider the following code:

N = 100000 # total number of trials
n = 1000 # long enough sequence of tosses
k = 5 # k heads in a row
ntosses = []
pat = ''.join(['1']*k)
effective_trials = 0
for i in range(N): # num of trials
    seq = ''.join(map(str,random.choices(range(2),k=n))) # toss a coin n times (long enough times)
    if pat in seq:
        ntosses.append(seq.index(pat) + k)
        effective_trials += 1
print(effective_trials, sum(ntosses) / effective_trials)
# 100000 62.19919

Notice that the result may not be correct if n is small, since it tries to approximate infinite number of coin tosses (to find expected number of tosses to obtain 5 heads in a row, n=1000 is okay since actual expected value is 62).

Upvotes: 1

shayelk
shayelk

Reputation: 1646

You don't re-initialize toss_sequence for each experiment, so you start every experiment with a pre-existing sequence of heads, having a 1 in 2 chance of hitting the target sequence on the first try of each new experiment.

Initializing toss_sequence inside the outer loop will solve your problem:

import random
# n is  the target number of heads in a row
# Change the value of n, for different target heads-sequences
n = 4
possible_tosses = [ 'h', 't' ]
num_trials = 1000
target_seq = ['h' for i in range(0,n)]
seq_lengths_rec = []

for trial_num in range(0,num_trials):

    if (trial_num % 100) == 0:
        print('Trial num {} out of {}'.format(trial_num, num_trials))
        # (The free version of trinket.io uses Python2)

    target_reached = 0
    toss_num = 0
    toss_sequence = []

    while target_reached == 0:

        toss_num += 1
        random.shuffle(possible_tosses)
        this_toss = possible_tosses[0]
        #print([toss_num, this_toss])
        toss_sequence.append(this_toss)
        last_n_tosses = toss_sequence[-n:]
        #print(last_n_tosses)
        if last_n_tosses == target_seq:
            #print('Reached target at toss', toss_num)
            target_reached = 1
            seq_lengths_rec.append(toss_num)

print(sum(seq_lengths_rec) / len(seq_lengths_rec))

You can simplify your code a bit, and make it less error-prone:

import random
# n is  the target number of heads in a row
# Change the value of n, for different target heads-sequences
n = 3
possible_tosses = [ 'h', 't' ]
num_trials = 1000
seq_lengths_rec = []

for trial_num in range(0, num_trials):

    if (trial_num % 100) == 0:
        print('Trial num {} out of {}'.format(trial_num, num_trials))
        # (The free version of trinket.io uses Python2)

    heads_counter = 0
    toss_counter = 0

    while heads_counter < n:
        toss_counter += 1

        this_toss = random.choice(possible_tosses)

        if this_toss == 'h':
            heads_counter += 1
        else:
            heads_counter = 0
    seq_lengths_rec.append(toss_counter)


print(sum(seq_lengths_rec) / len(seq_lengths_rec))

Upvotes: 5

Related Questions