AWE

Reputation: 4135

The pythonic way of printing a value

This probably measures how pythonic you are. I'm playing around trying to learn Python, so I'm not close to being pythonic enough yet. The infile is a dummy patriline, and I want to print a list of father-son pairs.

infile:

haffi jolli dkkdk lkskkk lkslll sdkjl kljdsfl klsdlj sdklja asldjkl

code:

def main():
    infile = open('C:\Users\Notandi\Desktop\patriline.txt', 'r')
    line = infile.readline()               
    tmpstr = line.split('\t')
    for i in tmpstr[::2]:
        print i, '\t', i + 1
    infile.close()
main()

The issue is i + 1; I want to print out two strings in every line. Is this clear?

Upvotes: 3

Views: 286

Answers (4)

stein

Reputation: 180

I'd use the with statement here, which you need to import if you're using an older version of Python (2.5):

from __future__ import with_statement

For the actual code, if you can afford to load the whole file into memory twice (i.e., it's pretty small), I would do this:

def main():
    with open('C:\Users\Notandi\Desktop\patriline.txt', 'r') as f:
        strings = f.read().split('\t')
    for father, son in zip(strings, strings[1:]):
        print "%s \t %s" % (father, son)
main()

That way you avoid printing a line for the childless leaf at the end, without much overhead, which I think is what you were asking for(?)
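To make the pairing concrete, here is a small sketch of what zipping a list with its own tail produces for part of the sample line. The name father_son_pairs is only illustrative, and the words are assumed to be already split into a list; the function returns the pairs instead of printing them so it runs the same on Python 2 and 3.

```python
def father_son_pairs(names):
    """Pair every name with the one that follows it (overlapping pairs)."""
    # zip stops at the shorter input, so the last name never appears as a father
    return list(zip(names, names[1:]))

sample = "haffi\tjolli\tdkkdk\tlkskkk".split("\t")
pairs = father_son_pairs(sample)
# pairs == [('haffi', 'jolli'), ('jolli', 'dkkdk'), ('dkkdk', 'lkskkk')]
```

Every name except the first and last shows up twice: once as a son and once as a father.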

As a bit of a tangent: if the file is really big, you may not want to load the whole thing into memory, in which case you may need a generator. You probably don't need to do this if you're actually printing everything out, but in case this is some simplified version of the problem, this is how I would approach making a generator to split the file:

class reader_and_split():
    def __init__(self, fname, delim='\t'):
        self.fname = fname
        self.delim = delim
    def __enter__(self):
        self.file = open(self.fname, 'r')
        return self.word_generator()
    def __exit__(self, type, value, traceback):
        self.file.close()
    def word_generator(self):
        current = []
        while True:
            char = self.file.read(1)
            if char == self.delim:
                yield ''.join(current)
                current = []
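            elif not char:
                # End of file: emit the last word when there is no trailing delimiter
                if current:
                    yield ''.join(current)
                break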
            else:
                current.append(char)

The value of a generator is that you don't load the entire contents of the file into memory before running the split on it, which can be expensive for very, very large files. For simplicity, this implementation only allows a single-character delimiter. To parse everything out, all you need to do is use the generator; a quick and dirty way to do this is:

with reader_and_split(fileloc) as f:
    previous = f.next()
    for word in f:
        print "%s \t %s" % (previous, word)
        previous = word
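If reading one character at a time turns out to be slow, the same idea can probably be sketched as a plain generator function that reads fixed-size chunks instead. This is only a sketch under assumptions: split_stream, consecutive_pairs and chunk_size are names made up for illustration, and the input is any file-like object with a read() method (an in-memory io.StringIO stands in for the real file here).

```python
import io

def split_stream(stream, delim='\t', chunk_size=4096):
    """Yield delimiter-separated words from a file-like object, chunk by chunk."""
    leftover = ''
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parts = (leftover + chunk).split(delim)
        # The last piece may be cut off mid-word, so hold it back for the next chunk
        leftover = parts.pop()
        for part in parts:
            yield part
    if leftover:
        yield leftover

def consecutive_pairs(words):
    """Yield (previous, current) for each word after the first."""
    words = iter(words)
    previous = next(words, None)
    for word in words:
        yield previous, word
        previous = word

# A tiny chunk size forces words to straddle chunk boundaries
demo = io.StringIO(u"haffi\tjolli\tdkkdk")
result = list(consecutive_pairs(split_stream(demo, chunk_size=4)))
```

Only one chunk plus the held-back fragment is in memory at any time, so the memory use no longer depends on the file size.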

Upvotes: 2

rxmnnxfpvg

Reputation: 30993

You can be more pythonic in both your file reading and printing. Try this:

def main():
    with open('C:\Users\Notandi\Desktop\patriline.txt', 'r') as f:
        strings = f.readline().split('\t')
    for i, word in enumerate(strings):
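        print "{} \t {}".format(word, "".join(strings[i+1:i+2]))
main()

Using the slice strings[i+1:i+2] ensures an IndexError isn't thrown when reaching for the i+1th index at the end of the list; the slice returns an empty list ([]) instead. Joining the slice turns it back into a string (an empty one for the last word), so no list brackets end up in the output.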
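A quick check of the slicing behaviour this relies on, using a made-up three-word list: indexing past the end raises, while slicing past the end just gives back an empty list.

```python
strings = ['haffi', 'jolli', 'dkkdk']

inside = strings[1:2]       # a one-element list: ['jolli']
past_end = strings[3:4]     # slicing past the end gives [] rather than raising

try:
    strings[3]              # plain indexing past the end raises IndexError
    raised = False
except IndexError:
    raised = True
```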

Upvotes: 1

Mark Ransom

Reputation: 308206

Here's one clean way to do it. It has the benefit of not crashing when fed an odd number of items, but of course you may prefer an exception for that case.

def main():
    with open('C:\Users\Notandi\Desktop\patriline.txt', 'r') as infile:
        line = infile.readline()
        previous = None
        for i in line.split('\t'):
            if previous is None:
                previous = i
            else:
                print previous, '\t', i
                previous = None
main()

I won't make any claims that this is pythonic, though.
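To see the odd-count behaviour without a file, the same toggle logic can be wrapped in a function that returns the pairs instead of printing them. This is just a sketch; non_overlapping_pairs is a hypothetical name.

```python
def non_overlapping_pairs(words):
    """Group words two at a time; a trailing unpaired word is dropped."""
    pairs = []
    previous = None
    for word in words:
        if previous is None:
            # First word of a pair: remember it and wait for its partner
            previous = word
        else:
            pairs.append((previous, word))
            previous = None
    return pairs

even = non_overlapping_pairs(['a', 'b', 'c', 'd'])
odd = non_overlapping_pairs(['a', 'b', 'c'])   # 'c' has no partner and is dropped
```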

Upvotes: 0

Katriel

Reputation: 123662

You are getting confused between the words in the split string and their indices. For example, the first word is "haffi" but the first index is 0.

To iterate over both the indices and their corresponding words, use enumerate:

for i, word in enumerate(tmpstr[:-1]):  # stop before the last word so tmpstr[i+1] always exists
    print word, tmpstr[i+1]

Of course, this looks messy. A better way is to just iterate over pairs of strings. There are many ways to do this; here's one.

def pairs(it):
    it = iter(it)
    for element in it:
        yield element, next(it)

for word1, word2 in pairs(tmpstr):
    print word1, word2
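One caveat, as far as I know: on Python 3.7 and later, a StopIteration escaping from next(it) inside a generator is turned into a RuntimeError (PEP 479), so the generator above would fail there on an odd number of words instead of stopping quietly. A sketch of an alternative that behaves the same on both versions is to zip the iterator with itself; pairs_zip is just an illustrative name.

```python
def pairs_zip(iterable):
    """Return non-overlapping pairs; a trailing unpaired element is dropped."""
    it = iter(iterable)
    # Both arguments are the same iterator, so each pair consumes two elements
    return zip(it, it)

words = ['haffi', 'jolli', 'dkkdk', 'lkskkk', 'lkslll']
result = list(pairs_zip(words))
```

With five words, the last one has no partner and is left out of the result.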

Upvotes: 6
