Reputation: 376
I am trying to write a program that (among other things), prints an input file ('table1.txt') into the format seen below.
Where diff is the difference between the 3rd and 4th value of each line.
I think I have the basic idea figured out and I tried doing:
f = open("table1.txt",'r')
for aline in f:
values = aline.split(',')
print('Team:',values[0],', Points:',values[1],', Diff:',values[2]-values[3],'Goals:',values[2])
f.close()
But it results in a operand type error. I think I just have to change the way I iterate over the items in the file but I do not know how.
Upvotes: 2
Views: 34076
Reputation: 54223
You should definitely use the csv
module for this. It allows you to iterate through rows and values (essentially building a fancy "list of lists"). csv
also has a DictWriter
object that would work well to spit this data into a file, but actually displaying it is a little different. Let's look at building the csv first.
import csv
import operator
with open('path/to/file.txt') as inf,
open('path/to/output.csv', 'wb') as outf:
reader = sorted(csv.reader(inf), key=operator.itemgetter(1)
# sort the original data by the `points` column
header = ['Team', 'Points', 'Diff', 'Goals']
writer = csv.DictWriter(outf, fieldnames=header)
writer.writeheader() # writes in the fieldnames
for row in reader:
if not len(row) == 4:
break # This is probably not a useful row
teamname, points, home_g, away_g = row
writer.writerow({'Team': teamname,
'Points': points,
'Diff': home_g - away_g,
'Goals': "{:>2} : {:2}".format(home_g, away_g)
})
This should give you a csv file (at path/to/output.csv
) that has the data in the format you requested. At this point it's pretty easy to just pull the data and run print
statements to display it. We can use string templating to do this nicely.
import itertools
row_template = """\
{{0:{idx_length}}}{{<1:{teamname_length}}}{{>2:{point_length}}}{{>3:{diff_length}}}{{=4:{goals_length}}}"""
with open('path/to/output.csv') as inf: # same filename we used before
reader = csv.reader(inf) # no need to sort it this time!
pre_process, reader = itertools.tee(reader)
# we need to get max lengths for each column to build our table, so
# we will need to iterate through twice!
columns = zip(*pre_process) # this is magic
col_widths = {k: len(max(col, key=len)) for k,col in zip(
['teamname_length', 'point_length', 'diff_length', 'goals_length'],
columns)}
It's worth stopping here to look at this magic. I won't go too far into the columns = zip(*pre_process)
magic idiom, other than noting that it turns rows of columns into columns of rows. In other words
zip(*[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
becomes
[[1, 4, 7],
[2, 5, 8],
[3, 6, 9]]
after that we're just using a dictionary comprehension to build {'team_length': value, 'point_length': ...}
etc that we can feed into our templater to make the field widths the right size.
We also need idx_length
in that dictionary! We can only calculate that by doing len(rows) // 10
. Unfortunately we've exhausted our iterator and we don't have any more data. This calls for a redesign! I actually didn't plan this out well, but seeing how these things happen in the course of coding is good illustration.
import itertools
row_template = """\
{{0:{idx_length}}}{{<1:{teamname_length}}}{{>2:{point_length}}}{{>3:{diff_length}}}{{=4:{goals_length}}}"""
with open('path/to/output.csv') as inf: # same filename we used before
reader = csv.reader(inf)
pre_process, reader = itertools.tee(reader)
# fun with pre-processing for field length!
columns = zip(*pre_process)
keys = ['teamname_length', 'point_length', 'diff_length', 'goals_length']
col_widths = {k:0 for k in keys}
for key, column in zip(keys, columns):
col_widths['idx_length'] = max([col_widths['idx_length'], len(column) // 10 + 1])
col_widths[key] = max((col_widths[key],max([len(c) for c in column)))
col_widths['idx_length'] += 1 # to account for the trailing period
row_format = row_template.format(**col_widths)
# puts those field widths in place
header = next(reader)
print(row_format("", *header)) # no number in the header!
for idx, row in enumerate(reader, start=1): # let's do it!
print(row_format("{}.".format(idx), *row))
But let's not forget that Python has an extensive selection of 3rd party modules. One does exactly what you need. tabulate
will take well-formed tabular data and spit out a pretty printed ascii table for it. Exactly what you're trying to do
Install it from pypi from command line
$ pip install tabulate
Then import in your display file and print.
import tabulate
with open('path/to/output.csv') as inf:
print(tabulate(inf, headers="firstrow"))
Or skipping straight from input to print:
import csv
import operator
import tabulate
with open('path/to/file.txt') as inf:
reader = sorted(csv.reader(inf), key=operator.itemgetter(1))
headers = next(reader)
print(tabulate([(row[0], row[1], row[2]-row[3],
"{:>2} : {:2}".format(row[2], row[3])) for row in reader],
headers=headers))
Upvotes: 10
Reputation: 1978
Just tried, and the CSV module does not care if you load a txt file.
I would use the with open ... as
method, since it's cleaner and you don't have to close f afterwards.
import csv
with open("importCSV.txt",'r') as f:
rowReader = csv.reader(f, delimiter=',')
#next(rowReader) -use this if your txt file has a header strings as column names
for values in rowReader:
print'Team:',values[0],', Points:',values[1],', Diff:',int(values[2])-int(values[3]),'Goals:',values[2]
Upvotes: 2
Reputation: 720
try casting values[2] and values[3] into int when performing subtraction:
', Diff:', int(values[2])-int(values[3])
Upvotes: 2