Reputation: 10759
For testing, I need to quickly create large files of random text. I have one solution, taken from here and given below:
import random
import string

n = 1024 ** 2  # 1 MiB of text
# string.letters exists only in Python 2; string.ascii_letters works in both 2 and 3
chars = ''.join([random.choice(string.ascii_letters) for i in range(n)])
with open('textfile.txt', 'w+') as f:
    f.write(chars)
My problem is that this takes 653 ms to run, which is far too slow for my use case.
Is there a faster way to generate text files with random text?
Upvotes: 2
Views: 8126
Reputation: 402463
Create a NumPy array of letters (with numpy imported as np):
In [662]: letters = np.array(list(chr(ord('a') + i) for i in range(26))); letters
Out[662]:
array(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'],
dtype='<U1')
Use np.random.choice to draw n random letters from letters (equivalent to generating random indices between 0 and 25 and indexing letters with them):
np.random.choice(letters, n)
In [664]: n = 1024 ** 2
In [701]: %timeit np.random.choice(letters, n)
100 loops, best of 3: 15.1 ms per loop
Alternatively (note that np.fromstring is deprecated in newer NumPy releases):
In [705]: %timeit np.random.choice(np.fromstring(letters, dtype='<U1'), n)
100 loops, best of 3: 14.1 ms per loop
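The timings above cover only the sampling step. To get all the way to a file, the sampled array still has to be turned into text and written out. Here is a minimal sketch of one way to do that, not a benchmarked solution. It assumes NumPy 1.17 or later (for np.random.default_rng), and the helper name random_text_bytes is made up for illustration. It samples single-byte ASCII codes rather than '<U1' strings so the result can be written directly as bytes:

```python
import string

import numpy as np


def random_text_bytes(n, alphabet=string.ascii_letters):
    """Return n random characters from `alphabet` as a bytes object.

    Hypothetical helper for illustration; assumes `alphabet` is pure ASCII.
    """
    # View the alphabet as an array of single-byte character codes.
    codes = np.frombuffer(alphabet.encode("ascii"), dtype=np.uint8)
    # Draw all n characters in one vectorized call (no Python-level loop).
    rng = np.random.default_rng()
    sample = rng.choice(codes, size=n)
    # uint8 codes map one-to-one onto bytes of ASCII text.
    return sample.tobytes()


n = 1024 ** 2  # 1 MiB of text
payload = random_text_bytes(n)

# Binary mode avoids an extra encode step on write.
with open("textfile.txt", "wb") as f:
    f.write(payload)
```

If a str is needed instead of bytes, payload.decode("ascii") converts it. Timings will vary by machine and NumPy version, so it is worth re-running %timeit on the whole function rather than relying on the numbers above.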
Upvotes: 2