Stev0

Reputation: 605

Python Obtain count of lines in multiples of 64

I don't understand this well enough to Google it properly. I'm trying to iterate over a list that contains the lines of an input file. I keep track of the line number for each line for error-logging purposes.

I would like to write the results of my loop to an output file. I have added a newline character in my list.append call, and it works well for determining whether something is wrong with one of the lines in the file. Each iteration writes to a new line.

After every block of 64 lines, I would like to write two newline characters so that the blocks are distinguishable in the output file. Here is what I have so far:

import sys

fname = sys.argv[1]
list = []
output = "hashes.txt"
with open(fname) as f:
    content = f.readlines()
    num_line = 0
    for line in content:
        if line:    
            num_line += 1
            line = line.split(',')
            try:
                //if num_line == 64??? Not Sure how to iterate in blocks of 64\\
                list.append(line[1] + "\n\n")
            except Exception, ex:
                print("Problem on line", line, num_line)

with open(output, 'w') as w:
    w.writelines(list)

Upvotes: 0

Views: 67

Answers (3)

Cyphase

Reputation: 12022

This script should do the same thing as what you want yours to do, except it's a lot cleaner IMHO:

import sys

from itertools import zip_longest


# From itertools recipes:
# https://docs.python.org/3/library/itertools.html
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def main(outfile_path, infile_path, group_size):
    # The output file must be opened in write mode
    with open(infile_path) as infile, open(outfile_path, 'w') as outfile:
        # Filter out lines with zero non-whitespace characters
        nonempty_lines = (line for line in infile if line.strip())

        # Filter out lines that don't have a second value
        splittable_lines = (line for line in nonempty_lines if ',' in line)

        # Get second values from lines that have one
        all_values = (line.split(',')[1] for line in splittable_lines)

        # Filter out empty second values
        nonempty_values = (value for value in all_values if value)

        # Create output lines
        output_lines = ('{}\n'.format(value) for value in nonempty_values)

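        for group_of_output_lines in grouper(output_lines, group_size):
            # The last group is padded with None, which writelines() rejects
            outfile.writelines(
                line for line in group_of_output_lines if line is not None)
            outfile.write('\n')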


if __name__ == '__main__':
    main(outfile_path='hashes.txt', infile_path=sys.argv[1], group_size=64)

grouper() returns an iterator that yields tuples of n items from iterable, padding the last tuple with fillvalue if it comes up short. We use it to group the output lines in blocks of 64.
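To see what grouper() produces on a short input, here is a small sketch. The drop_padding helper is hypothetical and not part of the script above; it strips the None padding from the final group, which matters because writelines() only accepts strings.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def drop_padding(group):
    """Hypothetical helper: remove the None padding from a group."""
    return [item for item in group if item is not None]


# Seven items in groups of three: the last tuple is padded with None.
groups = list(grouper("ABCDEFG", 3))

# Strip the padding before handing a group to writelines().
cleaned = [drop_padding(group) for group in groups]
```

With a group size of 64, only the final group can contain padding, and only when the number of values is not a multiple of 64.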

main() is pretty well-commented, so I won't explain it here unless someone finds something to be unclear.

Upvotes: 0

Burhan Khalid

Reputation: 174708

Unless you are going to process the lines later, you can read and write at the same time without storing the lines. Also, list is a poor choice for a variable name because it shadows the built-in list type.

Your try/except is also broader than it needs to be: the only error expected here is an IndexError when a line has no second field. Try this version of your code:

import sys

fname = sys.argv[1]
# list = [] -- not needed
output = "hashes.txt"
with open(fname) as f, open(output, 'w') as out:
    num_line = 0
    for line in f:
        if line.strip():    
            num_line += 1
            bits = line.strip().split(',')
            try:
                output_line = bits[1]
            except IndexError:
                print("Problem on line", line, num_line)
                continue # skip the rest of the loop,
                         # go to the next line
            if not num_line % 64:
                out.write('{}\n\n'.format(output_line))
            else:
                out.write('{}\n'.format(output_line))    

Upvotes: 0

TigerhawkT3

Reputation: 49330

In this line:

//if num_line == 64??? Not Sure how to iterate in blocks of 64\\

You are looking for this:

if not num_line % 64:

When the remainder of the line number divided by 64 is zero, it will go into that if block.
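As a minimal sketch of that check in isolation: the function name with_block_breaks and the tiny block size of 2 are only for illustration, and are not from your code. It builds the output lines in memory so the effect is easy to inspect.

```python
def with_block_breaks(values, block_size=64):
    """Return output lines, with an extra newline after each full block."""
    out = []
    for num_line, value in enumerate(values, start=1):
        if not num_line % block_size:
            # num_line is a multiple of block_size: end the block
            out.append(value + "\n\n")
        else:
            out.append(value + "\n")
    return out


# A block size of 2 keeps the demonstration short.
lines = with_block_breaks(["a", "b", "c", "d", "e"], block_size=2)
```

Here the second and fourth entries end in two newlines, so a blank line follows every second value in the output.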

Oh, and you want # for Python comments, not //.

And as Cyphase mentioned, you'll want if line.strip(): instead of just if line:, because linefeeds count as a character.
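A quick sketch of the difference, using made-up sample lines: a line read from a file keeps its trailing "\n", so a visually blank line is still a non-empty (truthy) string until it is stripped.

```python
# Made-up sample input: two data lines and two blank-looking lines.
raw_lines = ["a,1\n", "\n", "   \n", "b,2\n"]

# `if line:` keeps everything, because "\n" is a non-empty string.
kept_plain = [line for line in raw_lines if line]

# `if line.strip():` drops lines that contain only whitespace.
kept_stripped = [line for line in raw_lines if line.strip()]
```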

Upvotes: 2
