Reputation: 5173
I use this code below to combine all csv files : below each file has 10,000 rows :
billing_report_2014-02-01.csv billing_report_2014-02-02.csv :
fout=open("out.csv","a")
for num in range(1,10):
print num
for line in open("billing_report_2014-02-0"+str(num)+".csv"):
fout.write(line)
for num in range(10,29):
print num
for line in open("billing_report_2014-02-"+str(num)+".csv"):
fout.write(line)
fout.close()
but now I want to add new date column to the out.csv file how can I add date column and have value of "2014-02-01" to every row that I append billing_report_2014-02-01 to out.csv, and value of "2014-02-02" to every row that I append billing_report_2014-02-02 to out.csv how can I approach this ?
Upvotes: 1
Views: 2445
Reputation: 762
I think you can just add the date at the end:
for line in open("billing_report_2014-02-0"+str(num)+".csv"):
fout.write(line+',DATE INFORMATION')
I am presuming your CSV is really comma separated, if it is tab separted the characters should be \t
you could also use an intermediate step by changing line:
line = line + ', DATE INFORMATION'
as you are trying to add the file name date just add it per variable:
line = line + ', 2014-02-0'+ str(num//10)
you could use the replace function if it is always the ",LLC" string expression, see the example below
>>> string = "100, 90101, California, Example company,LLC, other data"
>>> string.replace(',LLC',';LLC')
'100, 90101, California, Example company;LLC, other data'
>>>
putting it all together and trying to bring some of the inspiration from @Jon CLements in as well (KUDOS!):
def combine_and_add_date(year, month, startday, endday, replace_dict):
fout=open("out.csv","a")
for num in range(startday,endday+1):
daynum = str(num)
if len(daynum) ==1:
daynum = '0'+daynum
date_info = str(year+'-'month+'-'+daynum)
source_name = 'billing_report_'+date_info+'.csv'
for line in open(source_name):
for key in replace_dict:
line.replace(key,replact_dict[key])
fout.write(line+','+date_info)
fout.close()
I hope this works and you should (hopefully I am a newb...) use it like this, note the dictionary is designed to allow you to make all kinds of replacements
combine_and_add_date("2014","02",1,28, {',LLC': ';LLC', ',PLC':';PLC'})
fingers crossed
Upvotes: 2
Reputation: 142156
List the filenames you want to work on, then take the data from that, build a generator over the input file that removes trailing new lines, and adds a new field with the date... eg:
filenames = [
'billing_report_2014-02-01.csv',
'billing_report_2014-02-02.csv'
]
with open('out.csv', 'w') as fout:
for filename in filenames:
to_append = filename.rpartition('_')[2].partition('.')[0]
with open(filename) as fin:
fout.writelines('{},{}\n'.format(line.rstrip(),to_append) for line in fin)
Upvotes: 3