Reputation: 89
I need help generating a random text.
I've been given a text of 25k words, from this text_file.
I've calculated the probability of each single letter and each single word, to see which letters/words are used most.
Now I need to make another text of 500 letters. This text should follow the probabilities I calculated, and should be written using the letters I "found" in the first text.
It's like: Text1 -> compute the probability of the letters used, i.e. which letters appear most. Make text2 -> use the probabilities found from text1.
Hope you can help me, I'm new to Python.
Upvotes: 2
Views: 7050
Reputation: 3565
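The easiest approach is to pick characters at random from the 25k-word text itself. Each character is then drawn with a probability equal to its relative frequency in the original, so the result follows the same letter distribution (in expectation).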
import random
print(''.join(random.choice(original_text) for _ in range(500)))
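One way to check this claim is to compare character shares in the source and in the sample. The following is only a sketch: `sample_text`, `relative_frequencies` and the stand-in `original_text` are names made up for illustration, and the real 25k-word text would take the place of the stand-in string.

```python
import random
from collections import Counter


def sample_text(source, length=500):
    """Draw `length` characters from `source`, uniformly and with replacement."""
    return ''.join(random.choice(source) for _ in range(length))


def relative_frequencies(text):
    """Map each character to its share of the text (shares sum to 1)."""
    counts = Counter(text)
    return {ch: n / len(text) for ch, n in counts.items()}


# Stand-in for the real 25k-word text (assumption, for illustration only).
original_text = "the quick brown fox jumps over the lazy dog " * 100

sample = sample_text(original_text)
source_freq = relative_frequencies(original_text)
sample_freq = relative_frequencies(sample)

# Print the five most common source characters next to their sampled share.
for ch, share in sorted(source_freq.items(), key=lambda kv: -kv[1])[:5]:
    print(repr(ch), round(share, 3), round(sample_freq.get(ch, 0.0), 3))
```

With only 500 draws the two columns will be close but not identical; the match tightens as the sample grows.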
Upvotes: 4
Reputation: 89
I now have this:
def random_text():
    return ''.join(random.choice(text) for _ in range(500))
random_letters = []
for i in range(1):
    random_letter = random_text()
    random_letters.append(random_letter)
print(random_letters)
Now it only runs once. But I don't know how to encode the output text as UTF-8?
Upvotes: 0
Reputation: 358
You could do something like this:
import string
import random
def get_random_letter():
    # depends how you want to randomize getting your letter
    # (string.ascii_letters works in both Python 2 and 3; string.letters is Python 2 only)
    return random.choice(string.ascii_letters)
random_letters = []
for i in range(500):
    random_letter = get_random_letter()
    random_letters.append(random_letter)
with open("text.txt", 'w') as f:
    f.write("".join(random_letters))
You would change the "get_random_letter" definition to match your probability model and return the chosen character. (Depending on the model, you may not need to import random or string; they are only used here as an example.)
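The file-writing step above relies on the platform's default encoding. If the source text contains non-ASCII letters, passing an explicit encoding is one way to get UTF-8 output. This is a minimal Python 3 sketch; `write_random_text`, the stand-in `source` string and the temporary output path are assumptions for illustration. (In Python 2, `io.open` accepts the same `encoding` argument.)

```python
import os
import random
import tempfile


def write_random_text(source, path, length=500):
    """Sample `length` characters from `source` and save them as UTF-8."""
    text = ''.join(random.choice(source) for _ in range(length))
    # An explicit encoding avoids depending on the platform default.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)
    return text


# Stand-in source with non-ASCII letters (assumption, for illustration only).
source = "æøå blåbærgrød og rødgrød med fløde "
out_path = os.path.join(tempfile.mkdtemp(), "text.txt")
written = write_random_text(source, out_path)

# Reading the file back with the same encoding recovers the same text.
with open(out_path, encoding='utf-8') as f:
    read_back = f.read()
```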
Edit: To get the letter based on a certain weight you could use this:
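import random
inputs = ['e', 'f', 'g', 'h']
weights = [10, 30, 50, 10]
def get_random_letter(inputs, weights):
    # Pick a point between 0 and the total weight, then return the first
    # input whose cumulative weight exceeds that point.
    r = random.uniform(0, sum(weights))
    current_cutoff = 0
    for index in range(len(weights)):
        current_cutoff = current_cutoff + weights[index]
        if r < current_cutoff:
            return inputs[index]
    # r can equal the total weight in rare cases; fall back to the last input
    return inputs[-1]
print(get_random_letter(inputs, weights))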
which is derived from the post here: Returning a value at random based on a probability weights
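To tie the weighted pick back to the original question, the weights can be taken from letter counts in the source text. The sketch below swaps the cumulative-weight loop for the standard library's `random.choices` (Python 3.6+), which does the same kind of weighted draw. `letter_weights`, `generate` and the stand-in `text` are hypothetical names, and counting only alphabetic characters is an assumption about what "letters" should mean here.

```python
import random
from collections import Counter


def letter_weights(text):
    """Return (letters, weights), where each weight is that letter's count."""
    # Assumption: only alphabetic characters count as "letters".
    counts = Counter(ch for ch in text if ch.isalpha())
    letters = sorted(counts)
    weights = [counts[ch] for ch in letters]
    return letters, weights


def generate(text, length=500, seed=None):
    """Build a string of `length` letters following the source frequencies."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    letters, weights = letter_weights(text)
    return ''.join(rng.choices(letters, weights=weights, k=length))


# Stand-in for the real 25k-word text (assumption, for illustration only).
text = "the quick brown fox jumps over the lazy dog " * 100
letters, weights = letter_weights(text)
result = generate(text, length=500, seed=1)
print(result[:60])
```

Keeping the counts as weights, rather than normalizing them to probabilities, is enough here because `random.choices` only uses their relative sizes.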
Upvotes: 0