d-b
d-b

Reputation: 971

My random algorithm is not random. How to improve it?

I have a 64 character long string/array with dots on all positions when I start. I want to insert 6 characters from another string in their original order but at random position replacing the dot at that position.

Also, the first character of the short string always needs to be first in the long string (that is completely non-random). So for example, if the short string is ABCDEF and (just to simplify) the long string is 10 characters long these would be valid results:

A..B.CD.EF
AB.C..DEF.

I wrote a little Java for this purpose:

    StringBuilder base = new StringBuilder("................................................................");
    int positionPrevChar = 0;       
    base.setCharAt(0, prefix.charAt(0));        
    for (int i = 1; i < prefix.length(); i++) {
        int rnd = new Random().nextInt(64 - positionPrevChar - prefix.length() + i + 1);
        rnd = rnd + positionPrevChar; //You are not allowed to place the char before previous char.
        base.setCharAt(rnd, prefix.charAt(i));
        positionPrevChar = rnd;
    }
    System.out.println(base);

The result looks like this (100 tries)

a......b..................c.....d...........................ef..
a..............................................b......c......df.
a.b...............c.............................d......e..f.....
a..................b..........................c.........d.e..f..
a............................b.................c........def.....
a..b........................................................cdef
a.................................................b........cdf..
a..........................................b..........c...de..f.
a..........................................................e.f..
a............b..............c................................df.
a....b...........................cd.............ef..............
a.........................................b.................d.ef
a..........................................................bef..
a.......................................................c.d...f.
a........................b...c..e.......................f.......
a.........................b..........................c.d.....ef.
a.............b..................c.......d....e.............f...
a.........................................................b.ef..
a...............................................b........cd.e..f
a.........................................................b.cef.
a....................................b.......c......d.....e...f.
a.............................................b...........c..def
a........................................b.....d.............e.f
a..........................................b....c.....de...f....
a...b...............................c.................d.......f.
a..............................................b.........d....f.
a...........b.......................c.................d.....ef..
a.........b.......................c.......d..............e..f...
a.................................................b..c..d..e.f..
a...................b..................................c.d....f.
a..................................b...........c........de.....f
a................................b..............c..........df...
a.....................b...............................d..e.f....
a................................................b.......c...ef.
a................................c.............d....f...........
a..............................................b......c.d...f...
a......b..................c................................d.f..
a..................................................b..c....de..f
a.............b.........c................d...............e...f..
a................................b...........d...........f......
a.......................................b............cd..e.f....
a...................................b.......c............d....f.
a.........................................b.............c.d..e.f
a.......b......................cd..........................ef...
a.b................c...........d.........e..........f...........
a......................b...c.......d............e.f.............
a.................b............c............................def.
a.................b..............c.....................d..e.f...
a.................................................b.cd......ef..
a....b.............................................d......e.f...
a.......................................................b.c.de.f
a..........................................................bcdf.
a........................................................b..e.f.
a.b....................................................c..d..f..
a....................................b............cd..........f.
a........................................................b..cef.
a............................b..............c.......d...e...f...
a......................b.............................c.d...e..f.
a..............b..........................c.............d..e...f
a..................................................c.........def
a..............................................b...c.........e.f
a..........b.................................c.............d.f..
a................................b.....c.d..e..f................
a......................................................b...c.e.f
a................b.c......................e............f........
a..............................................b...........cde.f
a............................b............c.....e....f..........
a.....................................b................c..d...ef
a............................b...d...........................f..
a.b.......................c..d............ef....................
a......b....................................c.d......e......f...
a..........b........c...............d......................f....
a....................................................b....d.ef..
b...........................c................................f..
a............................................b.........c...e...f
a.........b..............................................cd...ef
a....................................................b.....d.ef.
a....................b......c.....d.e..f........................
a..b............c........................d.....e...........f....
a.bc..................................d.........e..........f....
a........................................................b..def.
a.............................................b..........c...def
a....................b............................c.........f...
a.b.................c.............................d...........ef
a...................................b..........c......d.e....f..
a......b..................................c....d.........e.f....
a................b.........c.......................d....f.......
a.......................b...........c..d....................e..f
a..........b............................c.......d.....e.......f.
a.........................................b........c..e....f....
a.......................................................b..c.def
a........b..........c...........d.........e............f........
a.b.......c...d.............e............................f......
a...........................................b.c.......d.......ef
a..................................b.............c......d..e...f
a.........................................b.........c........ef.
a......b.............c...............d..............e.........f.
a...................b......................................d..f.
a.....................................................c......ef.
a.............................................b...c........d.f..

The problem is obvious: this distribution is not random in the way I want it. The dots that are at the end of the original string are replaced with characters far more often than the dots at the beginning (with exception for the leftmost dot, but that is on purpose).

I also do understand the problem - when creating a random position for B, I pick among 64 - 5 positions (I can't place it to the right of 59 because then then rest of the string won't fit). Say that B gets position 20, then, for the next letter, C, there are only 44 - 4 positions left. And if C gets position 15 out of these 44, then it is only 29 - 3 positions left for D.

To sum up: the positions to the left are underused and the positions to the right are overused. I would like all the usage frequency of all positions (except 0) to be approximately equal. How do I adjust my algorithm to achieve that?

Upvotes: 0

Views: 96

Answers (2)

pjs
pjs

Reputation: 19855

Since this was only tagged algorithm at the time I responded I chose to illustrate a solution in Ruby, annotated line-by-line. This implementation shuffles indices, picks off a set corresponding to how many characters there are, sorts that subset, and injects those characters into the selected random set of locations. Using shuffled indices guarantees the uniqueness of the locations.

# SETUP
char = %w(b c d e f)   # Create array of the characters whose location can vary
num_chars = char.length  # Remember how many there are
rows = 10
cols = 64
indices = (1...cols).to_a  # Create array of indices 1 to cols - 1

# GENERATE
rows.times do  # For each row you wish to generate...
  ary = Array.new(cols) {'.'}  # Create new array of periods
  ary[0] = 'a'  # All lines start with 'a' in the first location
  # Now generate a set of locations by shuffling the indices, picking
  # off the first num_chars elements, and sorting them
  location = indices.shuffle.slice(0, num_chars).sort
  # Jam the sequence of characters into the sequence of locations
  num_chars.times { |j| ary[location[j]] = char[j] }
  puts ary.join  # Print out the concatenation of the result
end

Sample output:

a.................bc...d............................e...f.......
a............b.........................c...d.......e..f.........
abc..............................................d......e...f...
a.b................c.......d.................e...........f......
a......b...........c.............d........e...f.................
a......b.........c..........d............e....f.................
a............b............c.d...........e............f..........
a............b......c........d.......e.......f..................
a...............bc................d..........e.................f
a..........b.c...............................d......e.......f...

Upvotes: 0

Matthew Hannigan
Matthew Hannigan

Reputation: 1577

You could preallocate potential positions with another character (say underscore) and shuffle.

Then replace the underscores by the letters.

Use Collections.shuffle to shuffle.

Upvotes: 3

Related Questions