Reputation: 401
Hi, I'm a newbie with Python and I want to read a file string by string. The file has the following structure:
semilla
n_galleria t_espera t_llegada
p_ticket t_servicio
n_colosso min_colosso max_colosso
n_prisionero m_prisionero miu_prisionero sigma_prisionero
n_david
p_decision n_orcagna miu_orcagna sigma_orcagna
n_libreria p_libreria min_libreria max_libreria
So far I only have this:
f = open("/tmp/entrada.txt")
g = open("/tmp/salida.txt", "w+")
for linea in f.readlines():
    line = linea.split(' ')
f.close()
g.close()
By the way, every field in the file structure is the name of a variable. That is, I first want to store a variable called "semilla" holding the value found at that position in entrada.txt.
Upvotes: 0
Views: 140
Reputation: 1268
This is a naive solution, but it is easy to follow:
tokens = []
with open("/tmp/entrada.txt") as f:
    for linea in f:
        # split() with no argument also drops the trailing newline
        line_content = linea.split()
        for token in line_content:
            tokens.append(token)
print(tokens)
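The question also asks for each value to be stored under its field name. Here is a minimal sketch of one way to do that, assuming every line of entrada.txt holds exactly the fields listed in the question, separated by whitespace. The names `FIELDS` and `parse_entrada` and the sample numbers are made up for illustration, and a dict is used instead of creating real variables:

```python
import io

# Field names copied from the file structure in the question, one list per line.
FIELDS = [
    ["semilla"],
    ["n_galleria", "t_espera", "t_llegada"],
    ["p_ticket", "t_servicio"],
    ["n_colosso", "min_colosso", "max_colosso"],
    ["n_prisionero", "m_prisionero", "miu_prisionero", "sigma_prisionero"],
    ["n_david"],
    ["p_decision", "n_orcagna", "miu_orcagna", "sigma_orcagna"],
    ["n_libreria", "p_libreria", "min_libreria", "max_libreria"],
]


def parse_entrada(stream):
    """Return a dict mapping each field name to the raw string on its line."""
    valores = {}
    for nombres, linea in zip(FIELDS, stream):
        tokens = linea.split()
        if len(tokens) != len(nombres):
            raise ValueError("expected %d values, got %d" % (len(nombres), len(tokens)))
        valores.update(zip(nombres, tokens))
    return valores


# Hypothetical sample input; a real run would pass open("/tmp/entrada.txt").
ejemplo = io.StringIO(
    "42\n"
    "3 1.5 2.0\n"
    "0.25 4.0\n"
    "2 1 5\n"
    "4 10 0.5 0.1\n"
    "7\n"
    "0.6 3 2.0 0.4\n"
    "5 0.8 1 9\n"
)
valores = parse_entrada(ejemplo)
print(valores["semilla"])
```

The values stay as strings here; converting them with int() or float() is left to the caller, since the question does not say which type each field has.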
Upvotes: 1
Reputation: 1422
For word tokenization, it is best to use the nltk module, which will handle word separators of any kind. So you can do this:
import nltk
# word_tokenize needs the NLTK tokenizer data (see nltk.download)
with open("/tmp/entrada.txt") as f:
    text = f.read()
# return the list of words
words = nltk.word_tokenize(text)
This should be more robust for most kinds of text that you have.
Upvotes: 0
Reputation: 5059
If by "word" you mean every substring separated from the rest of the text by two spaces, you can iterate over them like this:
for word in f.read().split('  '):  # two-space delimiter, as described above
    do_something_to_string(word)  # placeholder for your own processing
There's no need to read the file by line if you don't actually need to parse it by line.
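If the fields may be separated by single spaces, several spaces, or newlines, calling split() with no argument is usually the simpler choice. Here is a small sketch of the difference, using a made-up sample string (`texto`) rather than the real file:

```python
# Hypothetical sample: two lines, with a double space before the last value.
texto = "42\n3 1.5  2.0\n"

# Splitting on one literal space keeps newlines inside the tokens and
# yields an empty string where two spaces are adjacent.
por_espacio = texto.split(' ')

# split() with no argument splits on any run of whitespace,
# including newlines, and never yields empty strings.
por_blancos = texto.split()

print(por_espacio)
print(por_blancos)
```

With the no-argument form, the whole file can be read at once with f.read().split() and no per-line cleanup is needed.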
Upvotes: 0