Reputation: 1525
For word games, it is often the frequency of letters in English vocabulary, regardless of word frequency, which is of more interest.
     Letter  Frequency  Ratio     Letter  Frequency  Ratio
 1.  E       11.1607%   56.88     M       3.0129%    15.36
 2.  A        8.4966%   43.31     H       3.0034%    15.31
 3.  R        7.5809%   38.64     G       2.4705%    12.59
 4.  I        7.5448%   38.45     B       2.0720%    10.56
 5.  O        7.1635%   36.51     F       1.8121%     9.24
 6.  T        6.9509%   35.43     Y       1.7779%     9.06
 7.  N        6.6544%   33.92     W       1.2899%     6.57
 8.  S        5.7351%   29.23     K       1.1016%     5.61
 9.  L        5.4893%   27.98     V       1.0074%     5.13
10.  C        4.5388%   23.13     X       0.2902%     1.48
11.  U        3.6308%   18.51     Z       0.2722%     1.39
12.  D        3.3844%   17.25     J       0.1965%     1.00
13.  P        3.1671%   16.14     Q       0.1962%     (1)
The third column for each letter gives its frequency relative to the least common letter (Q), which is taken as 1. The letter E is over 56 times more common than Q in forming individual English words.
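To make the ratio concrete, here is a small sketch of how those values can be reproduced from the percentages. The names `percent` and `ratioToQ` are illustrative and not part of the original data source; only a few letters are included.

```javascript
// Frequencies (percent) for a few letters, copied from the table above.
const percent = { E: 11.1607, M: 3.0129, J: 0.1965, Q: 0.1962 };

// Ratio of a letter's frequency to that of Q, rounded to two decimals.
function ratioToQ(letter) {
  return Math.round((percent[letter] / percent.Q) * 100) / 100;
}

console.log(ratioToQ("E")); // 56.88
```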
How can I build an algorithm in JavaScript such that if I generate, say, 100 letters, about 11-12% of them (i.e. 11-12 letters) will be E, and so on for the other letters?
Upvotes: 1
Views: 328
Reputation: 664484
Here's an algorithm:
Split the range [0, 1) into intervals, each matching a letter and having a size proportional to that letter's probability. E.g.
0 - 0.1116: E
0.1116 - 0.1966: A
…
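The interval idea can be sketched in JavaScript roughly as follows. This is a minimal sketch and not necessarily the exact code the answer had in mind: it assumes the remaining step is to draw a number with `Math.random()` and return the letter whose interval contains it. The names `FREQUENCIES`, `buildIntervals`, `letterFor` and `randomLetters` are made up for illustration, and the weights are the percentages from the question's table.

```javascript
// Letter frequencies (percent), copied from the table in the question.
const FREQUENCIES = {
  E: 11.1607, A: 8.4966, R: 7.5809, I: 7.5448, O: 7.1635, T: 6.9509,
  N: 6.6544, S: 5.7351, L: 5.4893, C: 4.5388, U: 3.6308, D: 3.3844,
  P: 3.1671, M: 3.0129, H: 3.0034, G: 2.4705, B: 2.0720, F: 1.8121,
  Y: 1.7779, W: 1.2899, K: 1.1016, V: 1.0074, X: 0.2902, Z: 0.2722,
  J: 0.1965, Q: 0.1962,
};

// Compute the upper bound of each letter's interval inside [0, 1).
// Weights are normalized by their sum, so they need not add up to exactly 100.
function buildIntervals(frequencies) {
  const entries = Object.entries(frequencies);
  const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
  let cumulative = 0;
  return entries.map(([letter, weight]) => {
    cumulative += weight / total;
    return { letter, upper: cumulative };
  });
}

const INTERVALS = buildIntervals(FREQUENCIES);

// Map a number r in [0, 1) to the letter whose interval contains it.
function letterFor(r, intervals = INTERVALS) {
  for (const { letter, upper } of intervals) {
    if (r < upper) return letter;
  }
  // Floating-point rounding can leave the last bound slightly below 1.
  return intervals[intervals.length - 1].letter;
}

// Generate `count` letters; `random` is injectable to make testing easier.
function randomLetters(count, random = Math.random) {
  return Array.from({ length: count }, () => letterFor(random()));
}

console.log(randomLetters(100).join(""));
```

Because each letter is drawn independently, a batch of 100 letters will contain about 11 E's on average, but the exact count varies from run to run. With only 26 intervals a linear scan is fine; a binary search over the upper bounds would be an option for much larger alphabets.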
Upvotes: 4