Adding new data column to a csv file in python

Question

I am trying to do two things...

Add some 'metadata' columns to the file
Create a dictionary where the file header is keyed to the value

Here is a sample of the file...

date,id,type,ccy,mtm,base,20151015,20151016,20151019,20151020,20151021,20151022
20161209,118,book,cad,-2493980,0,3845,1902,-1130,6052,-5594,-1419
20161209,118A188,bond,cad,-407954,0,5,11,5,23,3,-7
20161209,118A193072,bond,cad,111227,0,-296,-155,73,-429,410,95
20161209,118A217,bond,cad,4058,0,0,0,0,-2,1,0

For 1) I can add columns to the header but the append to them seems to be appending across the header rather than under the header...I am adding to the header like...

import os import csv

data = '/home/usr/local/bin/dev/remote/xx-yyy-zzz-118-sample.csv'

file_path = (os.path.dirname(data))
portfolio = ((os.path.basename(data)).strip().split('-')[3])

with open(data, 'r') as f:
    data = []

    r = csv.reader(f, delimiter = ',')

    header = next(r)
    header.append('portfolio')
    portfolio_col = next(r)
    header.append('file_path')
    file_path_col = next(r)
    print(header)

Old header...

['date', 'id', 'type', 'ccy', 'mtm', 'base', '20151015', '20151016', '20151019', '20151020', '20151021', '20151022']

New header...

['date', 'id', 'type', 'ccy', 'mtm', 'base', '20151015', '20151016', '20151019', '20151020', '20151021', '20151022', 'portfolio', 'file_path']

I am trying to append the column like this but it is not working as I want...

for row in f:
      portfolio_col.append(portfolio)
     file_path_col.append(file_path)
     data.append(portfolio_col)
print(data)

It is iterating (sort of) through the rows but not the way I intended...

[['20161209', '118', 'book', 'cad', '-2493980', '0', '3845', '1902', '-1130', '6052', '-5594', '-1419', '118', '118'], ['20161209', '118', 'book', 'cad', '-2493980', '0', '3845', '1902', '-1130', '6052', '-5594', '-1419', '118', '118']]

I am also struggling to create a dictionary on the header keys but this question is already long enough that I will likely ask another separate question...

Separately I was trying to do this but I am having issues with the header as a list...

with open(filename,'r') as f:
    header=f.readline().strip().split(',')    
    data = []
    for line in f:

        values=line.strip().split(',')
        row=dict()
        for i,h in enumerate(header):
            row[h]=values[i]

        data.append(row)

PabTorre · Accepted Answer

You can approach the problem with pandas. that way adding a column to the file and appending become trivial tasks.

import os 
import pandas as pd
data = '../remote/xx-yyy-zzz-118-sample.csv'

df = pd.read_csv(data)
file_path = (os.path.dirname(data))
portfolio = ((os.path.basename(data)).strip().split('-')[3])

df['file_path'] = file_path
df['portfolio'] = portfolio
df.to_csv(data)
print(df)

       date          id  type  ccy      mtm  base  20151015  20151016  \
0  20161209         118  book  cad -2493980     0      3845      1902   
1  20161209     118A188  bond  cad  -407954     0         5        11   
2  20161209  118A193072  bond  cad   111227     0      -296      -155   
3  20161209     118A217  bond  cad     4058     0         0         0   

   20151019  20151020  20151021  20151022  file_path portfolio  
0     -1130      6052     -5594     -1419  ../remote       118  
1         5        23         3        -7  ../remote       118  
2        73      -429       410        95  ../remote       118  
3         0        -2         1         0  ../remote       118

Adding new data column to a csv file in python

Answers (2)

Related Questions