xst

Reputation: 3036

MATLAB histogram displays extra values

I need to generate a probability histogram of the number of rolls before a sum of 7 occurs while rolling two dice. The experiment works properly, and over 10,000 iterations I get data that looks as you would expect. However, I am having a lot of trouble displaying this data in a histogram. The problem is that a large amount of extra data seems to be drawn on the histogram that is not present in the vector I pass to hist(). It shows up as a large number of infinitely large bins at large values on the x-axis.

Since the probability of rolling a sum of 7 is 6/36 = 1/6, it is typical for this to occur on one of the first few rolls. Here I have a row vector "rollbins", where the ith entry holds the number of experiments that required i rolls. After many iterations of the experiment, the first few elements of rollbins are large and each subsequent entry is smaller, until the 45th, which is usually zero.
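
For reference, here is a sketch of the distribution this describes, assuming independent rolls of two fair dice (the symbols N, k and p are introduced here for illustration and are not part of the code below). The number of rolls N needed to see the first sum of 7 is geometric with p = 1/6:

$$
P(N = k) = (1-p)^{k-1}\,p = \left(\tfrac{5}{6}\right)^{k-1}\cdot\tfrac{1}{6}, \quad k = 1, 2, \dots
\qquad
E[N] = \tfrac{1}{p} = 6
\qquad
P(N > 45) = \left(\tfrac{5}{6}\right)^{45} \approx 2.7 \times 10^{-4}
$$

So roughly one experiment in six should end on the first roll, and only about 3 in 10,000 should need more than 45 rolls.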

I've used the hist() function with a bins vector argument and, per this question, I've used xlim() to limit the display to 0-45 on the x-axis. However, the output is not limited with or without xlim().

Any help is greatly appreciated :)

iters = 1000;
% do not consider extreme results
maxrolls = 45;
% rollbins(i) is how many experiments occurred with i rolls
rollbins = zeros(1, maxrolls);

for r=1 : 1 : iters
    % roll dice until we get a sum of 7; note how many rolls it took
    sum = 0;
    % the amount of rolls the experiment takes
    rolls = 0;
    while sum ~= 7
        rolls = rolls + 1;
        % sum two rolls of a die (same as one roll of two dice)
        sum = floor( 6*rand(1) + 1 ) + floor( 6*rand(1) + 1 );
    end

    % assign if within the vector's limits; discards outliers
    if rolls < maxrolls
        rollbins(rolls) = rollbins(rolls) + 1;
    end
end

% 1,2,3...45
range = 1:1:maxrolls;
% limit the values on x-axis to 0-45
xlim([0 maxrolls]);
% the histogram shows more than 45 vertical bars
hist(rollbins, range)

Edit: the xlim() call should come after the hist() call. Leaving the semicolon off the last graphics function (ylim) enables these effects to take place.

hist(rollbins, range);
xlim([0 maxrolls-1]);
ylim([0 iters / 5])

However, I now realize that the bars are still much too short, and the bins appear at intervals of 0.1, not 1 as I'd expected.

Upvotes: 0

Views: 1883

Answers (3)

xst

Reputation: 3036

This was the solution I ended up with (I'm not too familiar with vectorizing quite yet):

iters = 10000;
% preallocation of experiments row vector, one element for every experiment
experiments = zeros(1,iters);
for i=1 : 1 : iters
    % roll dice until we get a sum of 7; note how many rolls it took
    % (named "total" so the built-in sum() function is not shadowed)
    total = 0;
    rolls = 0;
    while total ~= 7
        rolls = rolls + 1;
        total = floor(6*rand(1)+1) + floor(6*rand(1)+1);
    end

    % save the number of rolls this experiment took
    experiments(i) = rolls;
end

% bin centers 0..50; hist() counts any experiment longer than 50 rolls in the last bin
bins = 0:1:50;
hist(experiments, bins);
xlim([0 50]);
ylim([0 1750])
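
As a rough sanity check on the bar heights, the expected counts can be overlaid on the histogram. This is only a sketch, assuming fair dice and independent rolls, so that the roll count is geometric with p = 1/6; the names k, p and expected are introduced here for illustration.

```matlab
iters = 10000;                        % same number of experiments as above
k = 1:50;                             % roll counts to compare against
p = 1/6;                              % chance that two fair dice sum to 7
expected = iters * (1-p).^(k-1) * p;  % expected number of experiments ending on roll k

% draw the theoretical curve on top of the existing histogram
hold on
plot(k, expected, 'r.-')
hold off
```

The first expected value is iters/6, about 1667, which is consistent with the y-axis limit of 1750 chosen above.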

Upvotes: 0

Amro

Reputation: 124563

Here is how I would implement this simulation:

iters = 1000;               %# number of times to run simulation
maxrolls = 45;              %# max number of rolls to consider
numRolls = nan(iters,1);    %# store number of rolls in each run
for r=1:iters
    %# roll two dice "maxrolls" times, and compute the sums
    diceSums = sum(randi([1 6],[maxrolls 2]), 2);

    %# find the first occurrence of a sum of 7
    ind = find(diceSums==7, 1, 'first');

    %# record it if found (otherwise noted as NaN)
    if ~isempty(ind)
        numRolls(r) = ind;
    end
end

%# compute frequency of number of rolls, and show histogram
counts = histc(numRolls, 1:maxrolls);
bar(1:maxrolls, counts, 'histc'), grid on
xlabel('Number of dice rolls to get a sum of 7')
ylabel('Frequency')
xlim([1 maxrolls])

[Screenshot: histogram of frequency against the number of dice rolls to get a sum of 7]

If you're feeling a bit adventurous, here is a fully vectorized version of the big loop:

numRolls = cellfun(@(v) find(v,1,'first'), ...
    num2cell(sum(randi([1 6],[iters maxrolls 2]),3) == 7, 2), ...
    'UniformOutput',false);
numRolls(cellfun(@isempty,numRolls)) = {NaN};
numRolls = cell2mat(numRolls);
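
A possible alternative that avoids cell arrays, offered only as a sketch and not as part of the approach above: max() returns the index of the first maximum along a dimension, so applied to the 0/1 matrix of hits it gives the first roll that summed to 7. The names hits, found and idx are introduced here for illustration.

```matlab
iters = 1000;                % number of simulated experiments
maxrolls = 45;               % max number of rolls to consider

% hits(r,c) is true when roll c of experiment r summed to 7
hits = sum(randi([1 6], [iters maxrolls 2]), 3) == 7;

% along each row, max returns the first column holding the largest value;
% found is 1 if the row contains a hit at all, and 0 otherwise
[found, idx] = max(double(hits), [], 2);

numRolls = idx;
numRolls(found == 0) = NaN;  % no sum of 7 within maxrolls rolls
```

The result has the same shape as before, an iters-by-1 column vector with NaN marking experiments that never reached 7.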

Upvotes: 0

grantnz

Reputation: 7423

You are recording the frequency of each roll count, but you should just be recording the roll count itself and then letting hist compute and show the frequencies in a histogram.

Also, you would need to apply xlim after generating the histogram (not before).

rollbins = zeros(1, maxrolls);
numberofrolls = [];   % Initialise numberofrolls

and

if rolls < maxrolls
    rollbins(rolls) = rollbins(rolls) + 1;
    numberofrolls(end+1) = rolls;  % Record # of rolls
end

with

hist(numberofrolls);    % Generate histogram
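
The difference can be seen on a small example. This is a sketch with made-up values, not data from the question: hist() expects one entry per experiment, so handing it a vector of already-tallied frequencies makes it count the frequencies themselves. Frequencies that are already tallied can be drawn directly with bar().

```matlab
% illustrative values only: rollbins(i) = number of experiments that took i rolls
rollbins = [5 4 3 2 1];

% the same information as raw data: one entry per experiment
numberofrolls = [1 1 1 1 1 2 2 2 2 3 3 3 4 4 5];

% with an output argument, hist returns the counts instead of plotting
fromRaw   = hist(numberofrolls, 1:5);   % tallies the raw roll counts
fromTally = hist(rollbins, 1:5);        % tallies the frequencies themselves

disp(fromRaw)     % matches rollbins
disp(fromTally)   % each of the values 5,4,3,2,1 occurs once

% already-tallied frequencies can be plotted as they are
bar(1:5, rollbins)
```

Passing explicit bin centers such as 1:5 (or 1:maxrolls in the question) also gives one bar per roll count, instead of the 10 equal-width bins hist uses by default.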

Upvotes: 1
