Reputation: 33
I need to merge multiple CSV files into a single CSV file. I tried Googling and found some information on CSVWriter and SuperCSV, but I couldn't work out how to use them for this.
All the CSV files will contain the same columns / headers.
To make this clearer:
I'm fetching 10,000 records from a database and creating 10 CSV files (MyCSV_1-1000.csv, MyCSV_1001-2000.csv, MyCSV_2001-3000 and so on), each containing 1,000 records. Now I need to merge all of these files into the first one, so that MyCSV_1-1000.csv contains all the records, i.e. 1-10,000 (before merging it contains only records 1-1,000).
Can someone help me with this?
I would like to do this in Java, or with any utility / framework that works with Java.
Upvotes: 3
Views: 5715
Reputation: 13582
The right language depends on the problem at hand.
If the datasets need cleaning before the merge, Python has very good libraries for that, and I would suggest using pandas.
If the datasets all have the same structure, loop over them and apply steps such as:
• Removing unnecessary rows
    df = df.drop(df.index[[0, 1, 2]])  # drop() returns a new frame without the first 3 rows
• Transposing the dataframe
    transpose_dataframe = df.transpose()
• And more.
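The cleaning loop described above could be sketched roughly as follows. This is only an illustration, assuming pandas is installed: the `clean` and `clean_and_merge` helpers and the in-memory sample data are hypothetical, and dropping the first three rows simply stands in for whatever cleaning the real data needs.

```python
import io

import pandas as pd


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: drop the first 3 rows of a frame."""
    return df.drop(df.index[[0, 1, 2]])


def clean_and_merge(sources) -> pd.DataFrame:
    """Read each CSV source, clean it, and stack the results into one frame."""
    frames = [clean(pd.read_csv(src)) for src in sources]
    # ignore_index=True renumbers the rows of the merged frame from 0
    return pd.concat(frames, ignore_index=True)


# Two in-memory CSV files with the same header stand in for files on disk
csv_a = io.StringIO("id,name\n1,a\n2,b\n3,c\n4,d\n")
csv_b = io.StringIO("id,name\n5,e\n6,f\n7,g\n8,h\n")

merged = clean_and_merge([csv_a, csv_b])
print(merged)
```

With real files, the list passed to `clean_and_merge` would hold file paths instead, and `merged.to_csv("Output.csv", index=False)` would write the result with a single header row.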
Once the cleaning is done, the merge can also be done in Python. In my case, Maverick's answer produced some stray characters and didn't merge the files properly, so I used the following instead:
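import csv
import glob
import os

Dir = r"C:\Users\name\Desktop\DataDirectory"
Avg_Dir = r"C:\Users\name\Desktop\Output"

# Sort so the files are merged in a predictable order
csv_file_list = sorted(glob.glob(os.path.join(Dir, '*.csv')))
print(csv_file_list)

with open(os.path.join(Avg_Dir, 'Output.csv'), 'w', newline='') as f:
    wf = csv.writer(f, lineterminator='\n')
    for i, file in enumerate(csv_file_list):
        with open(file, 'r', newline='') as r:
            rr = csv.reader(r)
            header = next(rr, None)  # read this file's header row
            if i == 0 and header is not None:
                wf.writerow(header)  # keep the header from the first file only
            for row in rr:
                wf.writerow(row)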
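The question asks for the records to end up in the first file rather than in a new one. A minimal sketch of that variant is below. The `append_into_first` helper is a hypothetical name, and the sketch assumes that every file has the same header row and that the first file already ends with a newline.

```python
import csv


def append_into_first(paths):
    """Append the data rows of paths[1:] to paths[0], leaving its header as is."""
    target, *others = paths
    # Mode 'a' appends to the end of the first file instead of overwriting it
    with open(target, 'a', newline='') as out:
        writer = csv.writer(out, lineterminator='\n')
        for path in others:
            with open(path, 'r', newline='') as src:
                reader = csv.reader(src)
                next(reader, None)  # skip this file's header row
                writer.writerows(reader)
```

For the files in the question, the call would look like `append_into_first(["MyCSV_1-1000.csv", "MyCSV_1001-2000.csv"])`, with the remaining files added to the list in order. Because the first file is modified in place, it may be worth keeping a backup until the result has been checked.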
Upvotes: 0
Reputation: 1434
Merging records from multiple CSV files into one is simple. If the CSV files are all in the same directory, you can run the command below from cmd.
D:\Files>copy *.csv Merged.csv
This creates a Merged.csv file in the same directory containing the records from all the CSV files. Note that each file is copied whole, so the header row of every file appears in the merged output.
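For comparison, a similar whole-file concatenation can be sketched in Python while keeping only the first header row. This is an illustration, not a drop-in replacement for the command: `concat_csv` is a hypothetical helper, and it assumes every file starts with a one-line header and is small enough to read into memory.

```python
import glob
import os


def concat_csv(directory, output_name="Merged.csv"):
    """Concatenate every *.csv in directory into output_name, keeping one header."""
    output_path = os.path.join(directory, output_name)
    # Exclude the output file so a previous run's result is not merged into itself
    inputs = sorted(
        p for p in glob.glob(os.path.join(directory, "*.csv"))
        if os.path.abspath(p) != os.path.abspath(output_path)
    )
    with open(output_path, "wb") as out:
        for i, path in enumerate(inputs):
            with open(path, "rb") as src:
                header = src.readline()
                if i == 0:
                    out.write(header)  # keep the first file's header only
                data = src.read()
                out.write(data)
                # Make sure the next file starts on a fresh line
                if data and not data.endswith(b"\n"):
                    out.write(b"\n")
    return output_path
```

Working on raw bytes avoids any re-encoding of the data, at the cost of not validating that the files really share the same columns.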
Upvotes: 4