Reputation: 21
I may at various times have .csv files I need to combine. They have the same headers and column layout. I just need a simple way to combine them in Windows 7. The user may not always have excel installed.
A .cmd macro would be great, but the ones I found online don't work.
The best i've got so far is this:
"open a command window ("cmd.exe") and type the following two lines (no brackets)
cd "Desktop\[csv-files]"
type *.csv > my-new-file.csv"
Where the files to be combined are in Desktop\[csv-files].
BUT - it seems to create duplicates (or in some case triplicates) of the combined entries. For instance I have 2 files I tested with 23 and 26 unique entries respectivly. I got out a file with 100 entries and at least one entry repeated 3 times.
Right now the .csv files I am testing are only ~25 entries long, but in time they could be thousands or more.
Upvotes: 1
Views: 13865
Reputation: 9471
Sounds like you have an issue with using *.csv
and redirecting the output to a .csv
file in the same folder. DOS seems to be finding the my-new-file.csv
file because of the *.csv
and is typing it into itself... You could use a different output filename extension until after the type
command finishes, then you could rename the output file... Something like:
cd "Desktop\[csv-files]"
type *.csv > my-new-file.txt
ren my-new-file.txt my-new-file.csv
You can also skip the header of each file after the first, so that you don't end up with file headers throughout the middle of the output file. Try the following:
@echo off
setlocal ENABLEDELAYEDEXPANSION
set cnt=1
cd "Desktop\[csv-files]"
for %%i in (*.csv) do (
if !cnt!==1 (
for /f "delims=" %%j in ('type "%%i"') do echo %%j >> my-new-file.txt
) else (
for /f "skip=1 delims=" %%j in ('type "%%i"') do echo %%j >> my-new-file.txt
)
set /a cnt+=1
)
endlocal
ren my-new-file.txt my-new-file.csv
Explanation:
I used ENABLEDELAYEDEXPANSION
to make sure the cnt
variable is properly evaluated. When delayed expansion is enabled, you use !
to distinguish variables instead of %
. So to evaluate the cnt
variable, you use !cnt!
instead of %cnt%
. Delaying expansion makes it wait to evaluate the value of cnt
until the moment that it is used. Sometimes, but not always, if you use %cnt%
, it will equal a value from a previous iteration. If you enable delayed expansion and use !cnt!
, it will always evaluate the correct current value.
By setting cnt
to 1
, we can run different code for the 1st .csv
file that is processed. The code includes all lines from the 1st .csv
file, but skips the first line of all subsequent .csv
files.
I used a nested for
loop. The outer for
cycles through all .csv
files in the current folder. The inner for
loop executes the type "%%i"
command, where %%i
is the name of the .csv
file. Each line of the file is processed individually as %%j
, which is passed to the echo %%j
command. echo
would normally print the value for %%j
to the command prompt window. However, you can redirect the output to a file using >
or >>
. The >
redirector overwrites the output file with the new value. The >>
redirector appends the new value to the output file. Since each line of each file, and each file is being processed individually, we must use the >>
redirector to push all content into a single file.
When using the for /f
command, the output is broken into individual parts using the specified delimiter. The default delimiter is a space. If I didn't include "delims="
, then the text This is fun
would be broken into the following:
%%j = This
%%k = is
%%l = fun
We want to process the whole line from the .csv
file all-at-once. By setting the delimiter to nothing ("delims="
), the whole line can be processed using %%j
.
For more specific help about how the for
command works, type for /?
at a command prompt.
endlocal
reverts the environment to its state at the point where setlocal
was used. Any variables you declared are removed, and extensions are set back to their prior value.
Upvotes: 3