Reputation: 95
let me describe my problem. I have a csv file exported from excel with large amount of data. The file has a title in the first row and column headings in the second. I need to extract only two columns (2nd and 3rd) from that file, put them to 1 column and send the output to another file.
Example:
Title
colA , colB , colC , colD ,...
abc , def , ghi , jkl ,...
abc , def , ghi , jkl ,...
abc , def , ghi , jkl ,...
abc , def , ghi , jkl ,...
The thing is, that the csv parser fails when it meets the row containing a string with - ( ) @ characters. (the loop treats them as delimiter I think, so it gives me an out of range error each time).
Here is what I already have.
@Echo off & setlocal EnableExtensions
setLocal EnableDelayedExpansion
REM creating and clearing files
copy /y NUL C:\list1.csv >NUL
copy /y NUL C:\list1_tmp.csv >NUL
copy /y NUL C:\exportedColumns.csv >NUL
copy /y NUL C:\Result.txt >NUL
set Result=C:\Result.txt
set Source=C:\sourcelist.csv
set list1=C:\list1.csv
set list1_tmp=C:\list1_tmp.csv
set expCol=C:\exportedColumns.csv
REM skip 1st two lines from source file and put to output file list1
for /f "skip=2 delims=*" %%a in (%Source%) do (echo %%a >>%list1%)
REM shorten each line to 500 chars and put it to new file
for /f "tokens=* delims=" %%a in ("%list1%") do (
set s=%%a
set s=%s:~0,500%
echo.%s% >> "%list1_tmp%"
)
REM ^^^^^^^^^^^ this is not working. It puts only 1 space to the output file
rem Parsing the csv file
rem Process the file:
call :ProcessFile < %list1_tmp%
exit /B
:ProcessFile
set /P line=
:nextLine
set line=:EOF
set /P line=
if "!line!" == ":EOF" goto :EOF
set i=0
for %%e in (%line%) do (
set /A i+=1
for %%i in (!i!) do (
if %%i==1 echo %%~e >> %expCol%
if %%i==2 echo %%~e >> %expCol%
)
if %%i==3 goto nextLine
REM I don't want it to process all the columns
)
goto nextLine
I'd like to ask you to look at this and help me to get 2 columns into one and put the output to 1 file.
I'd be extremely grateful.
Upvotes: 3
Views: 4138
Reputation: 7179
How about this?
for /f "skip=2 tokens=2,3 delims=, " %i in (input.csv) do echo %i%j >> output.csv
Edit:
To replace / with newlines, you could try this:
@echo off
for /f "skip=2 tokens=2,3 delims=, " %%i in (test.csv) do call :replace %%i%%%j
goto :eof
:replace
set string=%*
For /f "tokens=1,* delims=/" %%a in ('echo %string%') Do (
echo.%%a
If not "%%b"=="" call :replace %%b)
For an input of:
title
colA , colB , colC , colD ,...
abc , def , g\hi , jkl ,...
the above would output:
defg
hi
Upvotes: 2
Reputation: 24466
It just so happens that I've been playing around with accessing CSV files as ADODB recordsets this morning. My code might be useful to you. As it is, the script loops through every .csv
file in the current directory, displaying column = value
for each row.
The JScript should be pretty easy to modify to combine the columns as you want. And since this is a batch / JScript hybrid, you can take your pick whether you want to create a Scripting.FileSystemObject
object or just redirect the output of the cscript
line to generate your new .csv
file.
Here's the code for csv.bat
. *shrug* This is not intended to be the end answer, but a suggested alternate path to try.
@if (@a==@b) @end /*
:: batch portion
@echo off
setlocal
:: force 32-bit environment for ODBC drivers
if exist "%windir%\syswow64\cmd.exe" (set "cmd=%windir%\syswow64\cmd.exe") else set "cmd=cmd.exe"
for /r %%I in (*.csv) do (
echo Processing %%~nxI:
echo;
%cmd% /c cscript /nologo /e:jscript "%~f0" "%%~dpI" "%%~nxI"
echo;
)
goto :EOF
:: JScript portion */
var conn = new ActiveXObject("ADODB.Connection");
var rs = new ActiveXObject("ADODB.Recordset");
var dsn = "Driver={Microsoft Text Driver (*.txt; *.csv)};"
+ "Dbq=" + WSH.Arguments(0) + ";"
+ "Extensions=asc,csv,tab,txt;";
try { conn.Open(dsn); }
catch(e) {
// If the Microsoft Text Driver didn't work,
// try the MS Jet 4.0 provider instead.
var dsn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+ WSH.Arguments(0)
+ ";Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
try { conn.Open(dsn); }
catch(e) {
// If that didn't work either, then give up.
WSH.Echo("Unable to create ADODB connection.");
WSH.Quit(1);
}
}
rs.Open("SELECT * from " + WSH.Arguments(1), conn, 2, 4);
while (!rs.EOF) {
for (var i=0; i<rs.Fields.Count; i++) {
WSH.Echo(rs.Fields(i).Name + ' = ' + rs.Fields(i));
}
rs.MoveNext;
}
rs.Close();
conn.Close();
Upvotes: 0
Reputation: 6657
One of the problems you've mentioned is in the line for %%e in (%line%) do ...
, when %line%
contains a special character such as (
, which naturally will mess up the interpreter.
You may be able to avoid that by using string substitution in order to put quotes around each column. For example (I'm skipping some of your code to just focus on the part with the problem):
:ProcessFile
set /P line=
:nextLine
for %%e in ("%line:,=" "%") do (
echo %%~e
)
goto nextLine
Notice this part: "%line:,=" "%"
. That is replacing all commas with " "
, as well as adding "
at the beginning and end of the line.
So if a particular line we're processing looks like this:
abc, def (foo), ghi
The for
will be expanded to this:
for %%e in ("abc" "def (foo)" "ghi") do ...
Everything is nicely contained in quotes so the (
doesn't interfere. Of course if you have quotes in a particular column, that will interfere...
The next line, where I use %%e
, I make it %%~e
in order to strip the quotes.
Upvotes: 0