Reputation: 39
The Python code below represents a small 'game':
I know the expected answer is about 0.8 streaks per sequence, but I get about 1.6 and I cannot figure out what I am doing wrong. I have seen other solutions, but I would like to understand how I can make this specific code work.
Could you have a look at the code below and let me know what I am doing wrong?
import random
numberOfStreaks = 0
possib = ['H', 'T']
folge = ''
x = 0
while x < 10000:
    for i in range(100):
        folge = folge + str(random.choice(possib))
    numberOfStreaks = folge.count('TTTTTT') + folge.count('HHHHHH')
    x = x + 1
print(numberOfStreaks)
Upvotes: 0
Views: 975
Reputation: 71424
You're appending to folge each time through the x loop, so the 10000 runs aren't independent of one another: you don't have 10000 different sets of 100 tosses, you have a single set of 1000000 tosses (which is going to have slightly more streaks in it since you aren't "breaking" it after 100 tosses).
What you want to do is count the streaks for each set of 100 tosses, and then take the mean of all those counts:
from random import choice
from statistics import mean

def count_streaks(folge: str) -> int:
    return folge.count("TTTTTT") + folge.count("HHHHHH")

print(mean(
    count_streaks(''.join(
        choice("HT") for _ in range(100)
    ))
    for _ in range(10000)
))
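If you would rather keep the shape of the original loop, a minimal sketch of the same fix is to reset folge at the start of every experiment and keep a running total. The function name average_streaks_per_run, the total_streaks accumulator, and the seed parameter are illustrative and not part of the original code. Like the snippet above, it counts with str.count, so it measures non-overlapping matches per set of 100 tosses.

```python
import random

def average_streaks_per_run(runs=10_000, tosses=100, seed=None):
    """Average number of 6-in-a-row matches per independent run of tosses."""
    rng = random.Random(seed)  # seed is optional, only for repeatable runs
    total_streaks = 0
    for _ in range(runs):
        folge = ''  # start every experiment with an empty sequence
        for _ in range(tosses):
            folge += rng.choice('HT')
        # str.count counts non-overlapping occurrences
        total_streaks += folge.count('TTTTTT') + folge.count('HHHHHH')
    return total_streaks / runs

print(average_streaks_per_run())
```

The only behavioural change from the question's loop is that folge no longer carries over from one experiment to the next, and the counts are summed instead of overwritten.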
Upvotes: 1
Reputation: 13067
Since you "know" the answer you seek is ~= 0.8:
I believe you have misinterpreted the question. I suspect that the question you really want to answer is the (in)famous one from "Automate the Boring Stuff with Python" by Al Sweigart (emphasis mine):
If you flip a coin 100 times ...
... Write a program to find out how often a streak of six heads or a streak of six tails comes up in a randomly generated list of heads and tails. Your program breaks up the experiment into two parts: the first part generates a list of randomly selected 'heads' and 'tails' values, and the second part checks if there is a streak in it. Put all of this code in a loop that repeats the experiment 10,000 times so we can find out what percentage of the coin flips (experiments) contains a streak of six heads or tails in a row.
Part 1 (generate a list of randomly selected 'heads' and 'tails' values):
observations = "".join(random.choice("HT") for _ in range(100))
Part 2 (checks if there is a streak in it.):
has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
Part Do Loop (put code in a loop that repeats the experiment 10,000 times):
experimental_results = []
for _ in range(10_000):
    observations = "".join(random.choice("HT") for _ in range(100))
    has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
    experimental_results.append(has_streak)
Part Get Result (find percentage of the experiments that contain a streak):
print(sum(experimental_results)/len(experimental_results))
This should give you something close to:
0.8
Full Code:
import random

experimental_results = []
for _ in range(10_000):
    observations = "".join(random.choice("HT") for _ in range(100))
    has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
    experimental_results.append(has_streak)
print(sum(experimental_results)/len(experimental_results))
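As a cross-check on the simulation, the same probability can be computed exactly by counting the sequences that contain no streak. This is only a sketch under the assumption of a fair coin; the helper name prob_streak and its parameters are hypothetical. It tracks, flip by flip, how many streak-free sequences end in a run of each length, and I would expect the printed value to land close to the simulated 0.8.

```python
from fractions import Fraction

def prob_streak(n_flips=100, streak=6):
    """Exact chance of at least one run of `streak` equal flips (fair coin)."""
    # counts[k] = number of streak-free sequences whose final run has length k + 1
    counts = [0] * (streak - 1)
    counts[0] = 2  # a single flip: 'H' or 'T'
    for _ in range(n_flips - 1):
        new = [0] * (streak - 1)
        new[0] = sum(counts)          # next flip differs: the run restarts at 1
        for k in range(streak - 2):
            new[k + 1] = counts[k]    # next flip matches: the run grows by one
        counts = new
    no_streak = Fraction(sum(counts), 2 ** n_flips)
    return 1 - no_streak

print(float(prob_streak()))
```

Using Fraction keeps the arithmetic exact until the final conversion to float, which avoids rounding concerns with numbers as large as 2 ** 100.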
If, however, the question you seek to answer is:
On average, how many occurrences of at least 6 consecutive heads or tails are there in 100 flips of a coin?
Then we can count them up and average that like:
import random

def count_streaks(observations):
    streaks = 0
    streak_length = 1
    prior = observations[0]
    for current in observations[1:]:
        if prior == current:
            streak_length += 1
            if streak_length == 6:
                streaks += 1
        else:
            streak_length = 1
        prior = current
    return streaks

experimental_results = []
for _ in range(10_000):
    observations = [random.choice("HT") for _ in range(100)]
    observed_streaks = count_streaks(observations)
    experimental_results.append(observed_streaks)
print(sum(experimental_results)/len(experimental_results))
This will give you a result of about:
1.50
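That figure can be sanity-checked without simulation using linearity of expectation, assuming a fair coin. A run of at least 6 begins at a given position when the next 5 flips match the flip at that position and, unless it is the very first flip, the preceding flip differs. The helper name expected_runs is hypothetical, and the sketch assumes n_flips is at least streak.

```python
from fractions import Fraction

def expected_runs(n_flips=100, streak=6):
    """Expected number of maximal runs of length >= streak (fair coin)."""
    half = Fraction(1, 2)
    # Run starting at the first flip: the following streak - 1 flips must match it.
    first = half ** (streak - 1)
    # Run starting later: streak - 1 matches, and the previous flip must differ.
    later = half ** streak
    later_positions = n_flips - streak  # start indices 1 .. n_flips - streak
    return first + later_positions * later

print(expected_runs())  # 3/2
```

For 100 flips and streaks of 6 this is 1/32 + 94/64 = 3/2, which agrees with the simulated value.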
Note:
Your code uses folge.count('TTTTTT'). I believe this code, and any answer that uses a similar strategy, is likely (over the course of 10k experiments) to overestimate the answer, as ("H"*12).count("H"*6) is 2, not 1.
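To make the difference concrete, here is a small sketch that compares str.count with a run-based count built on itertools.groupby. The helper name count_runs is illustrative; it treats any unbroken run of 6 or more identical flips as a single streak.

```python
from itertools import groupby

def count_runs(flips, streak=6):
    """Count maximal runs of identical outcomes that are at least `streak` long."""
    return sum(1 for _, run in groupby(flips) if len(list(run)) >= streak)

twelve_heads = "H" * 12
print(twelve_heads.count("H" * 6))  # 2: two non-overlapping matches
print(count_runs(twelve_heads))     # 1: a single unbroken run
```

The two approaches only disagree when a run reaches 12 or more, which is rare in 100 flips, so the overestimate is small but systematic.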
For example:
This otherwise excellent answer by @samwise (Probability of streak of heads or tails in sequence of coin tossing) consistently generates results in the range of:
1.52
Upvotes: 1