verbatim64x

Reputation: 51

Random text generator

What is the best way to generate a random string composed of alphabetic characters, up to 8 million characters long, that will be used to test string searching algorithms? Is Math.random still OK in terms of randomness and the statistical spread of characters? Any comment is appreciated; correct me if I'm wrong with my ideas.

Upvotes: 5

Views: 12675

Answers (4)

dlopezgonzalez

Reputation: 4297

This class from the Apache Commons Lang library does the job:

org.apache.commons.lang.RandomStringUtils

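You can use its "random" method, which takes the length and two flags (include letters, include numbers). The call below requests 5 characters, letters only: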

String s = org.apache.commons.lang.RandomStringUtils.random(5, true, false);

Upvotes: 0

Adamski

Reputation: 54725

It depends entirely on the purpose of generating this string. If you're generating strings in order to test the performance of a search algorithm then you may want to generate "English-like" text containing a distribution of words similar to a typical document.

One way to achieve this would be to build a Markov Chain, whereby for each state you generate a given word; e.g. "The" and then transition to a new state with a certain probability; e.g. "The" -> "first". You could auto-generate the Markov chain using a large body of sample text, such as the Brown Corpus.

Or even simpler, you could test your algorithm using a particular corpus (such as the Brown Corpus) rather than having to generate any samples yourself.
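The Markov chain idea above can be sketched roughly as follows. This is a minimal illustration, not a tuned text generator: the class name WordMarkovChain and its methods are hypothetical, the chain is first-order (each word depends only on the previous one), and the sample text is assumed to be supplied by the caller, for example the contents of a corpus file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper: a first-order, word-level Markov chain built from sample text.
final class WordMarkovChain {
    // For each word, every word that followed it in the sample. Duplicates are kept,
    // so frequent successors are chosen proportionally more often.
    private final Map<String, List<String>> successors = new HashMap<>();
    private final List<String> words = new ArrayList<>();

    WordMarkovChain(String sampleText) {
        for (String w : sampleText.trim().split("\\s+")) {
            words.add(w);
        }
        for (int i = 0; i + 1 < words.size(); i++) {
            successors.computeIfAbsent(words.get(i), k -> new ArrayList<>())
                      .add(words.get(i + 1));
        }
    }

    // Generates wordCount words separated by single spaces.
    String generate(int wordCount, Random rng) {
        StringBuilder sb = new StringBuilder();
        String current = words.get(rng.nextInt(words.size()));
        for (int i = 0; i < wordCount; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(current);
            List<String> next = successors.get(current);
            // Dead end (a word seen only at the very end of the sample): restart at random.
            current = (next == null)
                    ? words.get(rng.nextInt(words.size()))
                    : next.get(rng.nextInt(next.size()));
        }
        return sb.toString();
    }
}
```

Something like new WordMarkovChain(sampleText).generate(1_000_000, new Random()) would then produce roughly "English-like" output whose word frequencies follow the sample.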

Upvotes: 1

Joey

Reputation: 354864

Sure, why not? 8 MiB isn't that much, actually. Even bad PRNGs have periods of at least a few billion, and Java uses a 48-bit LCG. So yes, it should be OK.
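For illustration, here is a minimal sketch of generating such a string with java.util.Random and counting the letters to eyeball the spread. The class name RandomLetters, its method names, and the choice of a lowercase a-z alphabet are assumptions made for this example, not something stated in the question.

```java
import java.util.Random;

// Hypothetical helper: uniform random letters plus a simple frequency count.
final class RandomLetters {
    // Assumed alphabet: lowercase ASCII letters only.
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Builds a string of `length` characters drawn uniformly from ALPHABET.
    static String generate(int length, Random rng) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    // Counts how often each letter occurs in s (s must contain only a-z).
    static int[] letterCounts(String s) {
        int[] counts = new int[ALPHABET.length()];
        for (int i = 0; i < s.length(); i++) {
            counts[s.charAt(i) - 'a']++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = generate(8_000_000, new Random());
        int[] counts = letterCounts(text);
        for (int i = 0; i < counts.length; i++) {
            System.out.println(ALPHABET.charAt(i) + ": " + counts[i]);
        }
    }
}
```

With 8 million characters and 26 letters, each count should land near 8,000,000 / 26, roughly 307,692. Passing a fixed seed, such as new Random(42L), makes a test input reproducible across runs.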

Upvotes: 1
