Reputation: 1211
I have 2 csv files. I need to replace a column in one file with a column from the other file but they have to stay sorted according to an ID column.
Here's an example:
file1:
ID, transect, 90mdist
1, a, 10,
2, b, 20,
3, c, 30,
file2:
ID, transect, 90mdist
1, a, 50
2, b, 70
3, c, 90
basically I created a new file with the correct 90mdist and I need to insert it into the old file but it has to line up with the same ID #.
It's my understanding that Python treats csv files as a string. so I can either use a dictionary or convert the data into a list and then change it? which way is best?
Any help would be greatly appreciated!!
Upvotes: 5
Views: 9401
Reputation: 2617
Once you have your csv lists, one easy way to replace a column in one matrix with another would be to transpose the matrices, replace the row, and then transpose back your edited matrix. Here is an example with your data:
csv1 = [['1', 'a', '10'], ['2', 'b', '20'], ['3', 'c', '30']]
csv2 = [['1', 'a', '50'], ['2', 'b', '70'], ['3', 'c', '90']]
# transpose in Python is zip(*myData)
transposedCSV1, transposedCSV2 = zip(*csv1), zip(*csv2)
print transposedCSV1
>>> [['1', '2', '3'], ['a', 'b', 'c'], ['10', '20', '30']]
csv1 = transposedCSV1[:2] + [transposedCSV2[2]]
print csv1
>>> [['1', '2', '3'], ['a', 'b', 'c'], ['50', '70', '90']]
csv1 = zip(*csv1)
print csv1
>>> [['1', 'a', '50'], ['2', 'b', '70'], ['3', 'c', '90']]
Upvotes: 0
Reputation: 49013
Try this:
from __future__ import with_statement
import csv
def twiddle_csv(file1, file2):
def mess_with_record(record):
record['90mdist'] = 2 * int(record['90mdist']) + 30
with open(file1, "r") as fin:
with open(file2, "w") as fout:
fields = ['ID', 'transect', '90mdist']
reader = csv.DictReader(fin, fieldnames=fields)
writer = csv.DictWriter(fout, fieldnames=fields)
fout.write(",".join(fields) + '\n')
reader.next() # Skip the column header
for record in reader:
mess_with_record(record)
writer.writerow(record)
if __name__ == '__main__':
twiddle_csv('file1', 'file2')
A couple of caveats:
Upvotes: 2
Reputation: 13244
If you're only doing this as a one-off, why bother with Python at all? Excel or OpenOffice Calc will open the two CSV files for you, then you can just cut and paste the column from one to the other.
If the two lists of IDs are not exactly the same then a simple VB macro would do it for you.
Upvotes: -1
Reputation: 41813
The CSV Module in the Python Library is what you need here.
It allows you to read and write CSV files, treating lines a tuples or lists of items.
Just read in the file with the corrected values, store the in a dictionary keyed with the line's ID.
Then read in the second file, replacing the relevant column with the data from the dict and write out to a third file.
Done.
Upvotes: 7