Artur Gajowy

Reputation: 2038

Python: C++-like stream input

Is there a Pythonic way of reading, say, mixed integer and char input without reading the whole input at once and without worrying about line breaks? For example, I have a file with whitespace-separated data, of which I only know that there are x integers, then y chars, and then z more integers. I don't want to assume anything about line breaks.

I mean something as mindless as the following in C++:

...

int i, buf;
char cbuf;
vector<int> X, Z;
vector<char> Y;

for (i = 0; i < x; i++) {
    cin >> buf;
    X.push_back(buf);
}

for (i = 0; i < y; i++) {
    cin >> cbuf;
    Y.push_back(cbuf);
}

for (i = 0; i < z; i++) {
    cin >> buf;
    Z.push_back(buf);
}

EDIT: I forgot to say that I'd like it to behave well under live input from the console as well, i.e. there should be no need to press Ctrl+D before getting tokens, and the function should be able to return them as soon as a line has been entered. :)

Upvotes: 2

Views: 3786

Answers (4)

S.Lott

Reputation: 391972

How's this? Building on heikogerlach's excellent read_tokens.

def read_tokens(f):
    for line in f:
        for token in line.split():
            yield token

We can do things like the following to pick up 6 numbers, 7 characters and 6 numbers.

fi = read_tokens(data)  # data: an open file or any other iterable of lines
x = [int(next(fi)) for _ in range(6)]
y = [next(fi) for _ in range(7)]
z = [int(next(fi)) for _ in range(6)]
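
To avoid repeating the comprehension for every group, the "pick up n tokens" step could be wrapped in a small helper built on itertools.islice. This is only a sketch: take is a made-up name, and the io.StringIO object stands in for a real file whose line breaks fall in arbitrary places.

```python
import io
from itertools import islice

def read_tokens(f):
    """Yield whitespace-separated tokens from f, reading one line at a time."""
    for line in f:
        for token in line.split():
            yield token

def take(tokens, n, convert=str):
    """Consume the next n tokens and convert each one (hypothetical helper)."""
    return [convert(t) for t in islice(tokens, n)]

# Sample input for illustration; the line breaks are deliberately irregular.
data = io.StringIO("1 2 3\n4 5 6 a b\nc d e f g 9 8\n7 6 5 4\n")
tokens = read_tokens(data)
x = take(tokens, 6, int)  # 6 integers
y = take(tokens, 7)       # 7 single-character tokens
z = take(tokens, 6, int)  # 6 more integers
```

Note that islice stops quietly if the input runs out, so take returns a shorter list rather than raising; check len() of the result if a short read should count as an error.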

Upvotes: 0

unbeknown

Reputation:

How about a small generator function that returns a stream of tokens and behaves like cin:

def read_tokens(f):
    # Yield whitespace-separated tokens one at a time, reading f line by line.
    for line in f:
        for token in line.split():
            yield token

x = y = z = 5  # for simplicity: 5 ints, 5 char tokens, 5 ints
with open('data.txt', 'r') as f:
    tokens = read_tokens(f)
    X = []
    for i in range(x):
        X.append(int(next(tokens)))
    Y = []
    for i in range(y):
        Y.append(next(tokens))
    Z = []
    for i in range(z):
        Z.append(int(next(tokens)))
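
Regarding the live-console requirement in the question's EDIT: in Python 2, iterating over a file object with a for loop goes through a hidden read-ahead buffer, so tokens typed at the console may not show up as soon as Enter is pressed. One possible workaround is to call readline() explicitly through the two-argument form of iter(). The sketch below assumes that approach; read_tokens_live is a made-up name, and an io.StringIO object stands in for the console so the example runs unattended.

```python
import io
import sys

def read_tokens_live(f=None):
    """Yield whitespace-separated tokens, fetching one line at a time."""
    if f is None:
        f = sys.stdin  # assumption: console input is the default source
    # iter(callable, sentinel) keeps calling f.readline() until it returns '' (EOF).
    for line in iter(f.readline, ''):
        yield from line.split()

# StringIO stands in for the console here; blank lines and indentation are skipped.
demo = io.StringIO("1 2\n\n  x y\n3\n")
tokens = read_tokens_live(demo)
first = int(next(tokens))  # available as soon as the first line has been read
rest = list(tokens)
```

Calling read_tokens_live() with no argument reads from sys.stdin, which should hand back tokens after each entered line without waiting for Ctrl+D.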

Upvotes: 7

Autoplectic

Reputation: 7666

If you don't want to read in a whole line at a time, you might want to try something like this:

def read_tokens(f):
    token = []
    while True:
        c = f.read(1)  # '' signals end of file
        if c and not c.isspace():
            token.append(c)
            continue
        # Whitespace or EOF: emit the pending token, skipping empty ones
        # produced by runs of consecutive whitespace.
        if token:
            yield ''.join(token)
            token = []
        if c == '':
            return  # a plain return ends the generator

That should generate each whitespace-delimited token in the file, reading one character at a time. From there you should be able to convert them to whatever type they should be. The whitespace can probably be taken care of better, too.

Upvotes: 3

S.Lott

Reputation: 391972

Like this?

>>> data = "1 2 3 4 5 6 abcdefg 9 8 7 6 5 4 3"

For example, we might get this with data = someFile.read().

>>> fields = data.split()
>>> x = list(map(int, fields[:6]))
>>> y = fields[6]
>>> z = list(map(int, fields[7:]))

Results

>>> x
[1, 2, 3, 4, 5, 6]
>>> y
'abcdefg'
>>> z
[9, 8, 7, 6, 5, 4, 3]

Upvotes: 2
