Reputation: 605
I don't understand this well enough to Google it properly. I'm trying to iterate over a list that contains the lines of an input file. I keep track of each line's number for error-logging purposes.
I would like to write the results of my loop to an output file. I added the newline character in my list.append call, and that works great for determining whether something is wrong with one of the lines in the file. After each iteration it writes to a new line.
After every block of 64 lines, I would like to write two newline characters so that the blocks are distinguishable in the output file. Here is what I have so far.
import sys
fname = sys.argv[1]
list = []
output = "hashes.txt"
with open(fname) as f:
    content = f.readlines()
num_line = 0
for line in content:
    if line:
        num_line += 1
        line = line.split(',')
        try:
            //if num_line == 64??? Not Sure how to iterate in blocks of 64\\
            list.append(line[1] + "\n\n")
        except Exception, ex:
            print("Problem on line", line, num_line)
with open(output, 'w') as w:
    w.writelines(list)
Upvotes: 0
Views: 67
Reputation: 12022
This script should do what you want yours to do, and it's a lot cleaner IMHO:
import sys
from itertools import zip_longest

# From itertools recipes:
# https://docs.python.org/3/library/itertools.html
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def main(outfile_path, infile_path, group_size):
    # Open the output file for writing ('w'); the default mode is read-only
    with open(infile_path) as infile, open(outfile_path, 'w') as outfile:
        # Filter out lines with zero non-whitespace characters
        nonempty_lines = (line for line in infile if line.strip())
        # Filter out lines that don't have a second value
        splittable_lines = (line for line in nonempty_lines if ',' in line)
        # Get second values from lines that have one, minus surrounding whitespace
        all_values = (line.split(',')[1].strip() for line in splittable_lines)
        # Filter out empty second values
        nonempty_values = (value for value in all_values if value)
        # Create output lines
        output_lines = ('{}\n'.format(value) for value in nonempty_values)
        # Pad the last group with '' so writelines() never sees None
        for group_of_output_lines in grouper(output_lines, group_size, fillvalue=''):
            outfile.writelines(group_of_output_lines)
            outfile.write('\n')

if __name__ == '__main__':
    main(outfile_path='hashes.txt', infile_path=sys.argv[1], group_size=64)
grouper() returns an iterator that yields tuples of n items from iterable, which we use here to group the output lines 64 at a time.
main() is pretty well-commented, so I won't explain it here unless someone finds something to be unclear.
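To make the grouping behaviour concrete, here is a small sketch that uses a group size of 3 instead of 64. The sample list is made up for the demo, and passing fillvalue='' is an assumption of this sketch: it keeps the padded last group safe to hand to writelines(), which the default None padding would not be.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks."""
    # n references to the SAME iterator, so zip_longest pulls n items per tuple
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Hypothetical sample data: five already-formatted output lines
sample = ['a\n', 'b\n', 'c\n', 'd\n', 'e\n']
groups = list(grouper(sample, 3, fillvalue=''))
print(groups)
# [('a\n', 'b\n', 'c\n'), ('d\n', 'e\n', '')]
```

The last tuple is padded to full length, which is why the choice of fill value matters when the groups are written straight to a file.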
Upvotes: 0
Reputation: 174708
Unless you are going to process the lines later, you can read and write at the same time, without having to store the lines. Also, list is a poor choice for a variable name, as it shadows the built-in list() type.
Your try/except also catches every Exception, when the only error to expect here is an IndexError from a line with no second field. Try this version of your code:
import sys
fname = sys.argv[1]
# list = [] -- not needed
output = "hashes.txt"
with open(fname) as f, open(output, 'w') as out:
    num_line = 0
    for line in f:
        if line.strip():
            num_line += 1
            bits = line.strip().split(',')
            try:
                output_line = bits[1]
            except IndexError:
                print("Problem on line", line, num_line)
                continue  # skip the rest of the loop,
                          # go to the next line
            if not num_line % 64:
                out.write('{}\n\n'.format(output_line))
            else:
                out.write('{}\n'.format(output_line))
Upvotes: 0
Reputation: 49330
In this line:
//if num_line == 64??? Not Sure how to iterate in blocks of 64\\
You are looking for this:
if not num_line % 64:
When the remainder of the line number divided by 64 is zero, execution goes into that if block.
Oh, and you want # for Python comments, not //.
And as Cyphase mentioned, you'll want if line.strip(): instead of just if line:, because a line that contains only a linefeed is still a non-empty string and therefore truthy.
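Here is a minimal sketch of the modulo check in isolation. The function name write_in_blocks and the sample values are hypothetical, and it uses a block size of 2 rather than 64 and an in-memory buffer instead of a real file so the result is easy to inspect.

```python
import io

def write_in_blocks(values, out, block_size=64):
    """Write one value per line, adding a blank line after each full block."""
    # Start counting at 1 so that count % block_size == 0 marks the end of a block
    for count, value in enumerate(values, start=1):
        out.write('{}\n'.format(value))
        if count % block_size == 0:
            out.write('\n')

buffer = io.StringIO()
write_in_blocks(['a', 'b', 'c', 'd', 'e'], buffer, block_size=2)
result = buffer.getvalue()
print(repr(result))
# 'a\nb\n\nc\nd\n\ne\n'
```

Any writable file object can be passed as out, so the same function should work with open('hashes.txt', 'w').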
Upvotes: 2