Newbie
Newbie

Reputation: 376

Print csv input file into a table of columns/rows

I am trying to write a program that (among other things), prints an input file ('table1.txt') into the format seen below.

enter image description here

Where diff is the difference between the 3rd and 4th value of each line.

I think I have the basic idea figured out and I tried doing:

f = open("table1.txt",'r')

for aline in f:
    values = aline.split(',')
    print('Team:',values[0],', Points:',values[1],', Diff:',values[2]-values[3],'Goals:',values[2]) 

f.close()

But it results in a operand type error. I think I just have to change the way I iterate over the items in the file but I do not know how.

Upvotes: 2

Views: 34076

Answers (3)

Adam Smith
Adam Smith

Reputation: 54223

You should definitely use the csv module for this. It allows you to iterate through rows and values (essentially building a fancy "list of lists"). csv also has a DictWriter object that would work well to spit this data into a file, but actually displaying it is a little different. Let's look at building the csv first.

import csv
import operator

with open('path/to/file.txt') as inf,
     open('path/to/output.csv', 'wb') as outf:
    reader = sorted(csv.reader(inf), key=operator.itemgetter(1)
    # sort the original data by the `points` column

    header = ['Team', 'Points', 'Diff', 'Goals']
    writer = csv.DictWriter(outf, fieldnames=header)

    writer.writeheader()  # writes in the fieldnames
    for row in reader:
        if not len(row) == 4:
            break  # This is probably not a useful row
        teamname, points, home_g, away_g = row
        writer.writerow({'Team': teamname,
                         'Points': points,
                         'Diff': home_g - away_g,
                         'Goals': "{:>2} : {:2}".format(home_g, away_g)
                         })

This should give you a csv file (at path/to/output.csv) that has the data in the format you requested. At this point it's pretty easy to just pull the data and run print statements to display it. We can use string templating to do this nicely.

import itertools

row_template = """\
{{0:{idx_length}}}{{<1:{teamname_length}}}{{>2:{point_length}}}{{>3:{diff_length}}}{{=4:{goals_length}}}"""

with open('path/to/output.csv') as inf:  # same filename we used before
    reader = csv.reader(inf)  # no need to sort it this time!
    pre_process, reader = itertools.tee(reader)
    # we need to get max lengths for each column to build our table, so
    # we will need to iterate through twice!
    columns = zip(*pre_process)  # this is magic
    col_widths = {k: len(max(col, key=len)) for k,col in zip(
        ['teamname_length', 'point_length', 'diff_length', 'goals_length'],
        columns)}

It's worth stopping here to look at this magic. I won't go too far into the columns = zip(*pre_process) magic idiom, other than noting that it turns rows of columns into columns of rows. In other words

zip(*[[1, 2, 3],
      [4, 5, 6],
      [7, 8, 9]])

becomes

     [[1, 4, 7],
      [2, 5, 8],
      [3, 6, 9]]

after that we're just using a dictionary comprehension to build {'team_length': value, 'point_length': ...} etc that we can feed into our templater to make the field widths the right size.

But wait!

We also need idx_length in that dictionary! We can only calculate that by doing len(rows) // 10. Unfortunately we've exhausted our iterator and we don't have any more data. This calls for a redesign! I actually didn't plan this out well, but seeing how these things happen in the course of coding is good illustration.

import itertools

row_template = """\
{{0:{idx_length}}}{{<1:{teamname_length}}}{{>2:{point_length}}}{{>3:{diff_length}}}{{=4:{goals_length}}}"""

with open('path/to/output.csv') as inf:  # same filename we used before
    reader = csv.reader(inf)
    pre_process, reader = itertools.tee(reader)

    # fun with pre-processing for field length!
    columns = zip(*pre_process)

    keys = ['teamname_length', 'point_length', 'diff_length', 'goals_length']
    col_widths = {k:0 for k in keys}
    for key, column in zip(keys, columns):
        col_widths['idx_length'] = max([col_widths['idx_length'], len(column) // 10 + 1])
        col_widths[key] = max((col_widths[key],max([len(c) for c in column)))

    col_widths['idx_length'] += 1  # to account for the trailing period

    row_format = row_template.format(**col_widths)
    # puts those field widths in place

    header = next(reader)
    print(row_format("", *header))  # no number in the header!
    for idx, row in enumerate(reader, start=1):  # let's do it!
        print(row_format("{}.".format(idx), *row))

Batteries (almost) included

But let's not forget that Python has an extensive selection of 3rd party modules. One does exactly what you need. tabulate will take well-formed tabular data and spit out a pretty printed ascii table for it. Exactly what you're trying to do

Install it from pypi from command line

$ pip install tabulate

Then import in your display file and print.

import tabulate

with open('path/to/output.csv') as inf:
    print(tabulate(inf, headers="firstrow"))

Or skipping straight from input to print:

import csv
import operator
import tabulate

with open('path/to/file.txt') as inf:
    reader = sorted(csv.reader(inf), key=operator.itemgetter(1))
    headers = next(reader)
    print(tabulate([(row[0], row[1], row[2]-row[3],
                     "{:>2} : {:2}".format(row[2], row[3])) for row in reader],
                   headers=headers))

Upvotes: 10

JasTonAChair
JasTonAChair

Reputation: 1978

Just tried, and the CSV module does not care if you load a txt file.

I would use the with open ... as method, since it's cleaner and you don't have to close f afterwards.

import csv

with open("importCSV.txt",'r') as f:
    rowReader = csv.reader(f, delimiter=',')
    #next(rowReader)  -use this if your txt file has a header strings as column names
    for values in rowReader:
        print'Team:',values[0],', Points:',values[1],', Diff:',int(values[2])-int(values[3]),'Goals:',values[2] 

Upvotes: 2

Alex M
Alex M

Reputation: 720

try casting values[2] and values[3] into int when performing subtraction:

', Diff:', int(values[2])-int(values[3])

Upvotes: 2

Related Questions