Mackina
Mackina

Reputation: 83

Inserting a new column into a csv?

I realize this may be marked as a duplicate, however I would still appreciate a clear and concise response. I have spent quite a bit of time trying to use some of the code from some of the responses that were already given in other similar questions to no avail.

I have a csv file that may vary in length from time to time. The program that I have written in python will open the csv file, read it and make changes to certain fields of data. The program I have written works wonderfully and I am near the end of this project. The only issue I am facing is that I need to add two new columns. One of them has the header 'Institution' and will always contain the same data, the number 18 and another with the header 'File Type'. The 'File Type' header will always have the number 16 below it on every row.

In example:

Document Type|Name|Tax ID|Loan Number|Institution|Full Path|File Type|
 Some Info   |blah|123456|123456abcde|    18     |C:\blah  |   16    |

etc, etc, ...

So basically for every row of data that needs converted, I will always need the program to take my csv file, and insert the two new columns, Institution and File Type with the data 18 and 16, respectively, on each row that happens to exist within the csv. Make sense? So if there is 150 rows, I will need it to put it in 150 times. If the document converts 45 rows, then 45.

Here is the program that I have written and it works perfectly, other than the column part. I am not sure how to go about writing the part to add the columns.

def newDate():
    file = open("needs_convertedTest.csv","r")

    newFile = []
    for line in file:
        data = line.split(",")
        newDate = data[5][0:2]+data[5][3:5]+data[5][8:10]
        newLine = "%s,%s,%s,%s,%s,%s,%s,%s,%s" % (data[0],data[1],data[2],data[3],data[4],newDate,"18",data[6],"16")
        newFile.append(newLine)

    file = open("needs_convertedTest.csv","w")

    for line in newFile:
        file.write(line)

    file.close()

So here I am showing just the code I edited to include the new columns.

Here is the error I am getting.

Traceback (most recent call last):
File "C:\Users\delgadom.sbt\Desktop\CSV Editor\CSVDocEditor.py", line 154, in <module>
updateDocType(),newDate(),newSSN(),filePath(),newName(),newHeader()
File "C:\Users\delgadom.sbt\Desktop\CSV Editor\CSVDocEditor.py", line 21, in updateDocType
newLine = "%s,%s,%s,%s,%s,%s,%s" %(docType, data[1], data[2], data[3], data[4], data[5],"18",data[6],"16")
TypeError: not all arguments converted during string formatting

I am using Python 3.6.1.

I sincerely appreciate any and all responses.

Thank you, Max

Upvotes: 2

Views: 722

Answers (1)

A. Entuluva
A. Entuluva

Reputation: 729

It looks like your problem is trying to access data[8] when data only has seven entries. I assume you probably want data[6] instead.

As an aside, using pandas makes things quite easy:

import pandas as pd
data = pd.read_csv("needs_convertedTest.csv")
data["Institution"] = 18
data["File Type"] = 16
data.to_csv("output_file_name.csv", index=False)

Upvotes: 2

Related Questions