amit

Reputation: 15

Can a random string be generated with a fixed frequency for each character?

I am trying to generate random strings in which the number of occurrences of each character is fixed. For example, G (a character in the string) cannot appear more than 4 times, S no more than 2 times, and so on. My code generates the strings, but it does not enforce the character frequencies.

Is there any way to fix this? Any help will really be appreciated.

My current code is as follows:

    #include <stdio.h>
    #include <stdlib.h>

    // The basic function here is like (rand() % range) + lower_bound
    int rand_int(int a, int b)
    {
        if (a > b)
            return (rand() % (a - b + 1)) + b;
        else if (b > a)
            return (rand() % (b - a + 1)) + a;
        else
            return a;
    }

    int main()
    {
        FILE *fptr;
        fptr = fopen("C:\\Users\\Desktop\\dataset.txt", "w");
        int a, b, r, i, j, k;
        char res;
        a = 65;
        b = 90;
        k = 0;
        char str[] = "ITLDGGCSSHLPLRCSVDSGCPGLRAGSVSCLPHGSIREGMECSRRHGVGIMGDRRDGSRDS"; // working string
        char str_all[] = "ITLDGGCSSHLPLRCSVDSGCPGLRAGSVSCLPHGSIREGMECSRRHGVGIMGDRRDGSRDS";
        char subs[] = "ACDEGHILMPRSTV";
        fprintf(fptr, "%s", str);
        fprintf(fptr, "%s", "\n");

        // value of j represents the number of strings to be displayed
        for (j = 0; j < 10000; j++)
        {
            // getting all the sequence strings
            // for changing the string at all positions
            for (i = 0; i < 62; i++)
            {
                r = rand_int(a, b);
                res = (char)r;
                str_all[i] = res;
            }

            // substituting all the not required characters with required ones
            for (i = 0; i < 62; i++)
            {
                if (str_all[i] == 'B' || str_all[i] == 'F' || str_all[i] == 'J' || str_all[i] == 'K' || str_all[i] == 'N' || str_all[i] == 'O' || str_all[i] == 'Q' || str_all[i] == 'U' || str_all[i] == 'W' || str_all[i] == 'X' || str_all[i] == 'Y' || str_all[i] == 'Z')
                {
                    str_all[i] = subs[k++];
                }
                if (k > 13)
                    k = 0;
            }
            //printf("\nModified String for all string positions \n%s",str_all);
            fprintf(fptr, "%s", str_all);
            fprintf(fptr, "%s", "\n");
        }
        fclose(fptr);
        return 0;
    }

Upvotes: 0

Views: 347

Answers (2)

JeremyP

Reputation: 86651

You can use the Fisher-Yates shuffle. Put every allowed character into an array, repeated as many times as it may appear. Let's say you are allowed only 4 G's and 2 S's and nothing else. Your array would start out like this:

    char array[] = { 'G', 'G', 'G', 'G', 'S', 'S' };

Then you apply the Fisher-Yates shuffle. It picks a random element from the whole array and swaps it with the last element. It then picks a random element from all but the last element and swaps it with the second-to-last element, then picks from all but the last two elements and swaps with the third-to-last element, and so on, until only the first element is left and there is nothing more to choose.

Something like this:

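    // N_ELEMENTS is the number of elements in array (6 in the example above)
    for (int i = 0 ; i < N_ELEMENTS - 1 ; i++)
    {
        int ceiling = N_ELEMENTS - i;               // size of the not-yet-shuffled part
        int choice = arc4random_uniform(ceiling);   // random index in [0, ceiling)
        // swap the chosen element into the last not-yet-shuffled slot
        char tmp = array[ceiling - 1];
        array[ceiling - 1] = array[choice];
        array[choice] = tmp;
    }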

Use the above at your own risk since it has undergone no testing or even compiling.

I use arc4random_uniform because simply taking the modulus of a random number to get a random number in a smaller range skews the result. arc4random_uniform does not suffer from that issue.

Upvotes: 1

John Zwinck

Reputation: 249093

You can simply create a fixed (constant) string with the available characters, and then shuffle a copy of it each time you need a random string. For how to shuffle an array in C, see the question "Shuffle array in C".

Upvotes: 2
