KOB
KOB

Reputation: 4565

Plotting two sets of data on one histogram

I have just began using Octave and am attempting to simulate 10,000 outcomes of a Binomial Random Variable defined as:

X ~ Bi(5, 0.2)

I have graphed the result with the following function:

function x = generate_binomial_bernoulli(n,p,m)
  % generate Bi(n, p) outcomes m times

  x = zeros(1,m);       % allocate array for m simulations
  for i = 1:m           % iterate over m simulations
    successes = 0;      % count the number of successful trials per simualtion (0-5)
    for j = 1:n         % iterate through the n trials
      u = rand;         % generate random nuumber from 0-1
      if (u <= p)       % if random number is <= p
        successes++;    % count it as a success
      endif
    end
    x(i) = successes;   % store the number of successful trials in this simulation
  end

  alphabet_x=[0:n];     % create an array from 0 to n        
  hist(x,alphabet_x);   % plot graph     

end

I then call the function with generate_binomial_bernoulli(5, 0.2, 10000).

This is simulating 5 Bernoulli Trials, each with a probability of success of 0.2, repeating the 5 trials 10,000 times and graphing the distribution of the number of successes. This plot shows the empirical results of the simulation.

enter image description here

I have now also been asked to plot the theoretical results, which my best guess would be a normally distributed plot around 1 success on the x-axis (0.2 * 5 = 1).

  1. How could I create this plot, and display it on the same histogram?
  2. How could I correclty display my graph with the x-axis spanning just from 0 to 5, both axes labelled, and the two histograms colour-coded with a legend?

EDIT

Here is my current function where I have tried to plot the normalized / theoretical curve:

function x = generate_binomial_bernoulli(n,p,m)
  % generate Bi(n, p) outcomes m times

  emperical = zeros(1,m);             % allocate array for m simulations
  for i = 1:m                         % iterate over m simulations
    successes = 0;                    % count the number of successful trials per simualtion (0-5)
    for j = 1:n                       % iterate through the n trials
      u = rand;                       % generate random nuumber from 0-1
      if (u <= p)                     % if random number is <= p
        successes++;                  % count it as a success
      endif
    end
    emperical(i) = successes;         % store the number of successful trials in this simulation
  end

  close all;                          % close any existing graphs

  x_values = [0:n];                   % array of x-axis values        
  hist(emperical, x_values, "facecolor", "r"); % plot empirical data
  xlim([-0.5 (n + 0.5)]);             % set x-axis to allow for histogram bar widths

  hold on;                            % hold current graph

  mean = n * p;                       % theoretical mean
  norm = normpdf(x_values, mean, 1);  % normalised y values
  plot(x_values, norm, "color", "b"); % plot theoretical distribution

  legend('Emprical', 'Theoretical');   

end

As shown below, this curve only extends to a very low height along the y-axis, but I do not know how to span it across the entire dataset.

enter image description here

Upvotes: 0

Views: 557

Answers (1)

Yvon
Yvon

Reputation: 2993

Get histogram number counts and bins:

[counts,centers] = hist(x);

Normalize:

freq = counts/sum(counts);

Draw normalized histogram:

bar(centers,freq)

Upvotes: 0

Related Questions