geekchic
geekchic

Reputation: 2431

Align column names with data in a CSV file using Python

This is the code I am using to write data to a .csv file.

with open('temp.csv', 'a') as fp:
        a = csv.writer(fp, delimiter='\t')
        data = [['faceXpos','faceYpos','faceHeight','faceWidth','imageViewHeight','imageViewWidth','tshirtXpos', 'tshirtYpos','tshirtWidth','tshirtHeight'],
                [faceXpos, faceYpos, faceHeight, faceWidth, imageViewHeight, imageViewWidth, tshirtXpos, tshirtYpos, tshirtWidth, tshirtHeight]]
        a.writerows(data)

The output looks like so:

faceXpos    faceYpos    faceHeight  faceWidth   imageViewHeight imageViewWidth  tshirtXpos  tshirtYpos  tshirtWidth tshirtHeight
118 432 84  84  568 320 13.0    136 294.0   346.0
faceXpos    faceYpos    faceHeight  faceWidth   imageViewHeight imageViewWidth  tshirtXpos  tshirtYpos  tshirtWidth tshirtHeight
117.4   433.81  82.35999    82.36   568 320 14.45   134.19  288.26  340.09

How do I align it so that the data under each column is perfectly aligned in a way that is easier to read? Desired output: (even having the data at the center of a column would be fine)

  faceXpos  faceYpos    faceHeight  faceWidth   imageViewHeight imageViewWidth  tshirtXpos  tshirtYpos  tshirtWidth tshirtHeight
  118       432         84          84          568             320             13.0        136         294.0       346.0

Upvotes: 3

Views: 12572

Answers (3)

E.Z.
E.Z.

Reputation: 6661

First of all what you want is a "fixed-width file", and not a CSV.

There is a module called prettytable that could help you with that:

from prettytable import PrettyTable

# Initialize the object passing the table headers
t = PrettyTable(['A', 'B', 'C'])

t.align='l'  # align left 
t.border=False

t.add_row([111,222,333])
t.add_row([444,555,666])
t.add_row(['longer text',777,'even longer text here'])

print str(t)

Output:

 A            B    C
 111          222  333
 444          555  666
 longer text  777  even longer text here

p.s. In case you need sorting, use t.sortby as detail here

Upvotes: 8

JesterEE
JesterEE

Reputation: 156

Since prettytable has not been updated in a while, I wanted to add a nod to tabulate. I personally like this one a lot better because it plays nicely with the data science packages numpy and pandas so you can construct your data as you wish and get a table representation without much fuss.

Here's an example with numpy:

data = numpy.array([[1.,2.,3.,4.],[5.,6.,7.,8.],"here's some loooooooooooooong text".split(' ')]).T
table = tabulate.tabulate(data, tablefmt='fancy_grid', floatfmt=['0.2E', '0.4e', ''])
print(table)

╒══════════╤════════════╤═══════════════════╕
│ 1.00E+00 │ 5.0000e+00 │ here's            │
├──────────┼────────────┼───────────────────┤
│ 2.00E+00 │ 6.0000e+00 │ some              │
├──────────┼────────────┼───────────────────┤
│ 3.00E+00 │ 7.0000e+00 │ loooooooooooooong │
├──────────┼────────────┼───────────────────┤
│ 4.00E+00 │ 8.0000e+00 │ text              │
╘══════════╧════════════╧═══════════════════╛

and one with pandas:

dates = pandas.date_range('20180101', periods=4)
data = numpy.vstack([numpy.random.randn(4,4), numpy.array("here's some loooooooooooooong text".split(' '))]).T
df = pandas.DataFrame(data, index=dates, columns=list(' ABCD'))  # Blank first column for the index column
df.index = df.index.strftime('%Y-%m-%d')
df.index.name = 'dates'
df = df.reindex(['B', 'D', 'A', 'C'], axis=1)
# table = df.to_string()  # Default pandas string representation
table = tabulate.tabulate(df, tablefmt='presto', headers=[df.index.name] + df.columns.tolist(),
                          floatfmt=['', '0.4f', '', '', '0.2e'], numalign='right', stralign='center')
print(table)

   dates    |       B |         D         |                   A |         C
------------+---------+-------------------+---------------------+-----------
 2018-01-01 |  0.8080 |      here's       |  1.5430201221283801 |  8.81e-01
 2018-01-02 | -1.0354 |       some        | -1.0642628831039138 | -2.29e-01
 2018-01-03 |  1.6243 | loooooooooooooong | -0.8030183690980672 |  6.67e-01
 2018-01-04 |  0.4356 |       text        |  -1.957887025379132 | -1.37e+00

I also use it with the tablefmt='plain' keyword argument to get a fixed width formatting for a space separated values (ssv) file for easy viewing in a text editor.

Also notable is astropy.io.ascii, but in many regards, it is not as easy to use or as customizable as tabulate.

astropy.io.ascii.write(astropy.io.ascii.read(df.to_csv(), format='csv', header_start=0), format='fixed_width', formats={'C': '0.2e', 'B': '0.4f'})

|      dates |       B |                 D |               A |         C |
| 2018-01-01 |  0.8080 |            here's |   1.54302012213 |  8.81e-01 |
| 2018-01-02 | -1.0354 |              some |   -1.0642628831 | -2.29e-01 |
| 2018-01-03 |  1.6243 | loooooooooooooong | -0.803018369098 |  6.67e-01 |
| 2018-01-04 |  0.4356 |              text |  -1.95788702538 | -1.37e+00 |

Upvotes: 4

Skineffect
Skineffect

Reputation: 339

this is a bit tricky, because the CSV writer does not directly support structured and formatted output. It would be the easiest, if you write your own output routine. for example:

fp = open('temp.csv', 'a')
labelLine = list()
valueLine = list()
for label, value in zip(*data):  # unzips the list, so each elements (label and value) get extracted pairwise
    padding = max(len(str(label)), len(str(value)))  # what is longer, the label or the value?
    labelLine.append('{0:<{1}}'.format(label, padding))  # generate a string with the variable whitespace padding
    valueLine.append('{0:<{1}}'.format(value, padding))  # generate a string with the variable whitespace padding
# now write everything to the file:
fp.write('\t'.join(labelLine) + '\n')
fp.write('\t'.join(valueLine) + '\n')
fp.close()

This gives the the following output:

faceXpos    faceYpos    faceHeight  faceWidth   imageViewHeight imageViewWidth  tshirtXpos  tshirtYpos  tshirtWidth tshirtHeight
118         432         84          84          568             320             13.0        136         294.0       346.0       

However, you have to be carefull, because now your data is still \t seperated, but your CSV reader has to strip the extra whitespace. You could replace the '\t'.join by a ' '.join, if you are okay with plain whitespace seperation.

I hope that helps :) Best regards

Upvotes: 2

Related Questions