DaftMythic
DaftMythic

Reputation: 21

Combine CSV files in windows (.cmd or .bat file preferably)

I may at various times have .csv files I need to combine. They have the same headers and column layout. I just need a simple way to combine them in Windows 7. The user may not always have excel installed.

A .cmd macro would be great, but the ones I found online don't work.

The best i've got so far is this:

"open a command window ("cmd.exe") and type the following two lines (no brackets)

cd "Desktop\[csv-files]"
type *.csv > my-new-file.csv"

Where the files to be combined are in Desktop\[csv-files].

BUT - it seems to create duplicates (or in some case triplicates) of the combined entries. For instance I have 2 files I tested with 23 and 26 unique entries respectivly. I got out a file with 100 entries and at least one entry repeated 3 times.

Right now the .csv files I am testing are only ~25 entries long, but in time they could be thousands or more.

Upvotes: 1

Views: 13865

Answers (1)

James L.
James L.

Reputation: 9471

Sounds like you have an issue with using *.csv and redirecting the output to a .csv file in the same folder. DOS seems to be finding the my-new-file.csv file because of the *.csv and is typing it into itself... You could use a different output filename extension until after the type command finishes, then you could rename the output file... Something like:

cd "Desktop\[csv-files]"
type *.csv > my-new-file.txt
ren my-new-file.txt my-new-file.csv

You can also skip the header of each file after the first, so that you don't end up with file headers throughout the middle of the output file. Try the following:

@echo off

setlocal ENABLEDELAYEDEXPANSION

set cnt=1
cd "Desktop\[csv-files]"

for %%i in (*.csv) do (
  if !cnt!==1 (
    for /f "delims=" %%j in ('type "%%i"') do echo %%j >> my-new-file.txt
  ) else (
    for /f "skip=1 delims=" %%j in ('type "%%i"') do echo %%j >> my-new-file.txt
  )
  set /a cnt+=1
)

endlocal

ren my-new-file.txt my-new-file.csv

Explanation:

I used ENABLEDELAYEDEXPANSION to make sure the cnt variable is properly evaluated. When delayed expansion is enabled, you use ! to distinguish variables instead of %. So to evaluate the cnt variable, you use !cnt! instead of %cnt%. Delaying expansion makes it wait to evaluate the value of cnt until the moment that it is used. Sometimes, but not always, if you use %cnt%, it will equal a value from a previous iteration. If you enable delayed expansion and use !cnt!, it will always evaluate the correct current value.

By setting cnt to 1, we can run different code for the 1st .csv file that is processed. The code includes all lines from the 1st .csv file, but skips the first line of all subsequent .csv files.

I used a nested for loop. The outer for cycles through all .csv files in the current folder. The inner for loop executes the type "%%i" command, where %%i is the name of the .csv file. Each line of the file is processed individually as %%j, which is passed to the echo %%j command. echo would normally print the value for %%j to the command prompt window. However, you can redirect the output to a file using > or >>. The > redirector overwrites the output file with the new value. The >> redirector appends the new value to the output file. Since each line of each file, and each file is being processed individually, we must use the >> redirector to push all content into a single file.

When using the for /f command, the output is broken into individual parts using the specified delimiter. The default delimiter is a space. If I didn't include "delims=", then the text This is fun would be broken into the following:

%%j = This
%%k = is
%%l = fun

We want to process the whole line from the .csv file all-at-once. By setting the delimiter to nothing ("delims="), the whole line can be processed using %%j.

For more specific help about how the for command works, type for /? at a command prompt.

endlocal reverts the environment to its state at the point where setlocal was used. Any variables you declared are removed, and extensions are set back to their prior value.

Upvotes: 3

Related Questions