Reputation: 2038
Is there a Pythonic way of reading, say, mixed integer and char input without reading the whole input at once and without worrying about line breaks? For example, I have a file with whitespace-separated data, of which I only know that there are x integers, then y chars, and then z more integers. I don't want to assume anything about line breaks.
I mean something as mindless as the following in C++:
...
int i, buf;
char cbuf;
vector<int> X, Z;
vector<char> Y;  // Y holds the chars
for (i = 0; i < x; i++) {
    cin >> buf;
    X.push_back(buf);
}
for (i = 0; i < y; i++) {
    cin >> cbuf;
    Y.push_back(cbuf);
}
for (i = 0; i < z; i++) {
    cin >> buf;
    Z.push_back(buf);
}
EDIT: I forgot to say that I'd like it to behave well under live input from the console as well, i.e. there should be no need to press Ctrl+D before getting tokens, and the function should be able to return them as soon as a line has been entered. :)
Upvotes: 2
Views: 3786
Reputation: 391972
How's this? Building on heikogerlach's excellent read_tokens.
def read_tokens(f):
    for line in f:
        for token in line.split():
            yield token
We can do things like the following to pick up 6 numbers, 7 characters and 6 numbers.
# data is an open file (or any iterable of lines)
fi = read_tokens(data)
x = [int(next(fi)) for i in range(6)]
y = [next(fi) for i in range(7)]
z = [int(next(fi)) for i in range(6)]
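The three comprehensions share one pattern, so they can be factored into a helper. This is a minimal sketch, assuming Python 3; `take` is a hypothetical name that is not part of the answer above, and `io.StringIO` stands in for the real file:

```python
import io
from itertools import islice

def read_tokens(f):
    """Yield whitespace-separated tokens from a file-like object, line by line."""
    for line in f:
        for token in line.split():
            yield token

def take(tokens, n, convert=str):
    """Consume the next n tokens (hypothetical helper), converting each one."""
    return [convert(t) for t in islice(tokens, n)]

# io.StringIO stands in for an open file; line breaks fall at arbitrary places.
data = io.StringIO("1 2 3\n4 5 6 a b\nc d e f g 9 8\n7 6 5 4\n")
fi = read_tokens(data)
x = take(fi, 6, int)   # 6 integers
y = take(fi, 7)        # 7 single-character tokens
z = take(fi, 6, int)   # 6 more integers
print(x, y, z)
```

Note that `islice` silently returns fewer than n items if the input runs out, so a length check may be worth adding if short input should be an error.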
Upvotes: 0
Reputation:
How about a small generator function that returns a stream of tokens and behaves like cin:
def read_tokens(f):
    # Yield tokens one at a time; line breaks count as ordinary whitespace.
    for line in f:
        for token in line.split():
            yield token
x = y = z = 5  # for simplicity: 5 ints, 5 char tokens, 5 ints
f = open('data.txt', 'r')
tokens = read_tokens(f)
X = []
for i in range(x):
    X.append(int(next(tokens)))
Y = []
for i in range(y):
    Y.append(next(tokens))
Z = []
for i in range(z):
    Z.append(int(next(tokens)))
f.close()
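For live console input, one caveat: in Python 2, iterating over a file object with a for loop uses a hidden read-ahead buffer, so tokens typed at the console may not be delivered as soon as a line is entered. A possible workaround, sketched below, is to drive the generator with `readline()` instead. The name `read_tokens_live` is hypothetical, and `io.StringIO` stands in for the console here so the example runs without user input:

```python
import io

def read_tokens_live(f):
    """Yield tokens as soon as each line is available.

    iter(f.readline, '') calls f.readline() repeatedly and stops when it
    returns '' (end of input), so nothing is read beyond the current line.
    """
    for line in iter(f.readline, ''):
        for token in line.split():
            yield token

# Stand-in for console input; for real use, pass sys.stdin instead
# (after `import sys`): tokens = read_tokens_live(sys.stdin)
demo = io.StringIO("1 2\n3 a b\n")
tokens = read_tokens_live(demo)
first = [int(next(tokens)) for _ in range(3)]  # spans two input lines
rest = list(tokens)                            # whatever remains
print(first, rest)
```

The explicit `readline()` form works the same way in Python 2 and Python 3, which is the main reason to prefer it when the input may be interactive.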
Upvotes: 7
Reputation: 7666
If you don't want to read in a whole line at a time, you might want to try something like this:
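def read_tokens(file):
    token = []
    while True:
        c = file.read(1)  # read a single character
        if c == '':  # end of input
            if token:
                yield ''.join(token)
            return  # ends the generator; raising StopIteration here is an error in Python 3.7+
        if c.isspace():
            if token:  # skip runs of whitespace instead of yielding empty tokens
                yield ''.join(token)
                token = []
        else:
            token.append(c)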
That should generate each whitespace-delimited token in the file, reading one character at a time. From there you should be able to convert the tokens to whatever type they should be. The whitespace handling can probably be improved, too.
Upvotes: 3
Reputation: 391972
Like this?
>>> data = "1 2 3 4 5 6 abcdefg 9 8 7 6 5 4 3"
For example, we might get this with data= someFile.read()
>>> fields = data.split()
>>> x = list(map(int, fields[:6]))
>>> y = fields[6]
>>> z = list(map(int, fields[7:]))
Results
>>> x
[1, 2, 3, 4, 5, 6]
>>> y
'abcdefg'
>>> z
[9, 8, 7, 6, 5, 4, 3]
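If the chars arrive as separate tokens, as in the question, the same slicing idea can be driven by the three counts. This is a rough sketch: `split_fields` is a hypothetical helper, and like the session above it reads the whole input at once, so it does not suit live console input.

```python
def split_fields(data, x, y, z):
    """Split whitespace-separated text into x ints, y tokens and z ints."""
    fields = data.split()  # line breaks are treated like any other whitespace
    if len(fields) < x + y + z:
        raise ValueError("expected at least %d tokens, got %d"
                         % (x + y + z, len(fields)))
    X = [int(t) for t in fields[:x]]
    Y = fields[x:x + y]
    Z = [int(t) for t in fields[x + y:x + y + z]]
    return X, Y, Z

X, Y, Z = split_fields("1 2 3\na b\n7 8 9 10", 3, 2, 4)
print(X, Y, Z)
```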
Upvotes: 2