Black Mamba
Black Mamba

Reputation: 245

Organize rows, columns and values in a .csv file with batch script

I made this script bellow to organize my .csv file.

My original 1.csv file is this:

Central de Relacionamento;4002 5472 (todas as localidades);0800 570 8472 (exceto capitais)
Ouvidoria;0800 570 2288 (todas as localidades);Atendimento de segunda a sexta, das 8 às 18h. Exceto Feriados Nacionais.

Estabelecimento;Previsão de pagamento;Bandeira;Forma de pagamento;Quantidade de transações;Valor bruto;Valor líquido;
1050596258;02/05/2019;Elo;Crédito parcelado loja;1;R$ 37,05;R$ 35,90;
1050596258;02/05/2019;Elo;Débito à vista;1;R$ 15,90;R$ 15,67;
1050596258;06/05/2019;Elo;Crédito parcelado loja;1;R$ 19,98;R$ 19,36;
1050596258;06/05/2019;Elo;Débito à vista;3;R$ 277,40;R$ 273,37;
1050596258;06/05/2019;Visa;Crédito parcelado loja;1;R$ 27,46;R$ 25,95;
1050596258;08/05/2019;Elo;Débito à vista;1;R$ 69,90;R$ 68,89;
1050596258;13/05/2019;Elo;Débito à vista;5;R$ 608,60;R$ 599,78;
1050596258;17/05/2019;Elo;Crédito parcelado loja;1;R$ 16,63;R$ 16,11;
1050596258;20/05/2019;Elo;Crédito parcelado loja;1;R$ 27,95;R$ 27,08;
1050596258;21/05/2019;Elo;Crédito à vista;1;R$ 95,10;R$ 93,10;
1050596258;27/05/2019;Elo;Crédito à vista;1;R$ 55,70;R$ 54,53;
1050596258;29/05/2019;Elo;Crédito parcelado loja;1;R$ 24,40;R$ 23,64;
1050596258;10/06/2019;Elo;Crédito parcelado loja;1;R$ 30,88;R$ 30,00;
1050596258;13/06/2019;Elo;Crédito à vista;1;R$ 39,60;R$ 38,77;
1050596258;28/06/2019;Elo;Crédito parcelado loja;0;R$ 24,40;R$ 23,64;
1050596258;09/07/2019;Elo;Crédito parcelado loja;0;R$ 30,86;R$ 29,98;
1050596258;29/07/2019;Elo;Crédito parcelado loja;0;R$ 24,40;R$ 23,64;
1050596258;08/08/2019;Elo;Crédito parcelado loja;0;R$ 30,86;R$ 29,98;

And i want to change it to this:

Previsão de pagamento   Bandeira     Valor líquido
02/05/2019              Elo          R$ 35,90
02/05/2019              Elo          R$ 15,67
06/05/2019              Elo          R$ 19,36
06/05/2019              Elo          R$ 273,37
06/05/2019              Vis          R$ 25,95
08/05/2019              Elo          R$ 68,89
13/05/2019              Elo          R$ 599,78
17/05/2019              Elo          R$ 16,11
20/05/2019              Elo          R$ 27,08
21/05/2019              Elo          R$ 93,10
27/05/2019              Elo          R$ 54,53
29/05/2019              Elo          R$ 23,64
10/06/2019              Elo          R$ 30,00
13/06/2019              Elo          R$ 38,77
28/06/2019              Elo          R$ 23,64
09/07/2019              Elo          R$ 29,98
29/07/2019              Elo          R$ 23,64
08/08/2019              Elo          R$ 29,98

So i made this script bellow, but the result is that:

Previsão de pagamento~Bandeira~Valor líquido
02/05/2019~Elo~R$ 35,90
02/05/2019~Elo~R$ 15,67
06/05/2019~Elo~R$ 19,36
06/05/2019~Elo~R$ 273,37
06/05/2019~Visa~R$ 25,95
08/05/2019~Elo~R$ 68,89
13/05/2019~Elo~R$ 599,78
17/05/2019~Elo~R$ 16,11
20/05/2019~Elo~R$ 27,08
21/05/2019~Elo~R$ 93,10
27/05/2019~Elo~R$ 54,53
29/05/2019~Elo~R$ 23,64
10/06/2019~Elo~R$ 30,00
13/06/2019~Elo~R$ 38,77
28/06/2019~Elo~R$ 23,64
09/07/2019~Elo~R$ 29,98
29/07/2019~Elo~R$ 23,64
08/08/2019~Elo~R$ 29,98

Follow my code:

for /f  "skip=1 tokens=2 delims=" %%a in ("3.txt") do (
     sort /+27 < "3.txt" > "4.txt"
  )

I guess the & for /f in the last line of the code isn't right. Someone can help me?

Thank you


Edit 1:

Thank you guys! Your help was very useful.

I changed my script to that bellow. I wanted now to sort it now by the Bandeira column, but i can't skip the first line Previsão de pagamento Bandeira Valor líquido.

    @echo off
    setlocal EnableDelayedExpansion

    (for /f "skip=2 tokens=2,3,7 delims=;" %%a in (1.csv) do @echo %%a~ %%b~ %%c~)>2.csv

     SET "spaces=                                                            "
    (for /f "tokens=1,2,3 usebackq delims=~" %%i in ("2.csv") DO CALL :FORMAT "%%i" 25  "%%j"  14 "%%k") > 3.txt

    del /f /s /q 2.csv >nul
    GOTO :sorting

    :FORMAT
    SET "line="
    SET /a length=0
    :formlp
    IF "%2"=="" ECHO %line%%~1&GOTO :EOF
    SET /a length+=%2
    SET "line=%line%%~1%spaces%"
    CALL SET "line=%%line:~0,%length%%%"
    shift&shift&GOTO formlp

    :sorting

for /f  "skip=1 tokens=2 delims=" %%a in ("3.txt") do (
         sort /+27 < "3.txt" > "4.txt"
      )
pause

Upvotes: 0

Views: 1423

Answers (4)

Magoo
Magoo

Reputation: 79982

SET "spaces=                                                            "
(for /f "tokens=1,2,3 usebackq delims=~" %%i in ("2.csv") DO CALL :FORMAT "%%i" 25  "%%j"  14 "%%k") > 3.csv
GOTO :EOF

:FORMAT
SET "line="
SET /a length=0
:formlp
IF "%2"=="" ECHO %line%%~1&GOTO :EOF
SET /a length+=%2
SET "line=%line%%~1%spaces%"
CALL SET "line=%%line:~0,%length%%%"
shift&shift&GOTO formlp

Replace the last line of code with the above.

Given that you have 2.csv established by your original code (but I've no idea why you need to complicate matters by using non-standard syntax) then this code will:

establish spaces as a string of many spaces.

read 2.csv, using ~ as a delimiter and setting tokens 1,2 and 3, then pass 3 "pairs" of parameters being "value" and field length to the :format subroutine (the last "pair" does not have a field length)

The :format routine simply appends each parameter, stripped of quotes to line, then appends a large number of spaces to the result, and the call set limits the accumulated field to the accumulated length, so each pair is right-padded as required.

On the last "pair", %2 is missing and hence empty, so the accumulated line is output together with the %1 parameter (-quotes) and the subroutine is exited.


For the sorting issue:

First, create the header in 3.csv, formatting to the columns required

(for /f "usebackqskip=2tokens=2,3,7 delims=;" %%i in ("1.csv") do CALL :FORMAT "%%i" 25  "%%j"  14 "%%k"&GOTO doneheader) > 3.csv
:doneheader

Then sort the content of csv.2 and append to the header already in csv.3

(for /f "tokens=1,2,3 delims=~" %%i in ('sort /+11 "2.csv"') DO CALL :FORMAT "%%i" 25  "%%j"  14 "%%k") >> 3.csv

(note no usebackq as conventional syntax is used, and >> to append in place of > to create)

Upvotes: 1

Stephan
Stephan

Reputation: 56155

@echo off
setlocal enabledelayedexpansion

(for /f "skip=2 tokens=2,3,7 delims=;" %%a in (1.csv) do (
  set "a=%%a                      "
  set "b=%%b          "
  for /f "tokens=1,*" %%m in ("%%c") do (set "c=%%m" & set "d=   %%n")
  echo !a:~0,22!!b:~0,10!!c!!d:~-7!
))>2.csv
<2.csv set /p header=
(echo %header%
more +1 2.csv |sort /+32) >3.csv

Output:

Previsπo de pagamento Bandeira  Valorlφquido
02/05/2019            Elo       R$  15,67
17/05/2019            Elo       R$  16,11
06/05/2019            Elo       R$  19,36
28/06/2019            Elo       R$  23,64
29/05/2019            Elo       R$  23,64
29/07/2019            Elo       R$  23,64
06/05/2019            Visa      R$  25,95
20/05/2019            Elo       R$  27,08
08/08/2019            Elo       R$  29,98
09/07/2019            Elo       R$  29,98
10/06/2019            Elo       R$  30,00
02/05/2019            Elo       R$  35,90
13/06/2019            Elo       R$  38,77
27/05/2019            Elo       R$  54,53
08/05/2019            Elo       R$  68,89
21/05/2019            Elo       R$  93,10
06/05/2019            Elo       R$ 273,37
13/05/2019            Elo       R$ 599,78

Upvotes: 0

aschipfl
aschipfl

Reputation: 34899

Here is a possible approach to achieve what you want – see all the explanatory rem remarks in the code:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_FILE=%~1"   & rem // (input file; `%~1` is first command line argument)
set /A "_PADL=24" & rem // (number of spaces to pad the left token 2 with)
set /A "_PADM=13" & rem // (number of spaces to pad the mid token 3 with)
set /A "_PADR=0"  & rem // (number of spaces to pad the right token 7 with)
                    rem // (`0` defines not to pad with spaces to the right)

rem // Store current code page, then change it to maintain extended characters:
for /F "tokens=2 delims=:" %%P in ('chcp') do set "$CP=%%P"
> nul chcp 437

rem // Build string of padding spaces:
set /A "NUM=_PADL|_PADM|_PADR"
set "SPC=" & setlocal EnableDelayedExpansion
for /L %%S in (1,1,%NUM%) do set "SPC=!SPC! "
endlocal & set "SPC=%SPC%"

rem // Read input file, extract certain tokens and output them right-space-padded:
for /F "usebackq skip=3 tokens=2,3,7 delims=; eol=;" %%A in ("%_FILE%") do (
    set "LEFT=%%A" & set "MID=%%B" & set "RIGHT=%%C"
    setlocal EnableDelayedExpansion
    if %_PADL% gtr 0 set "LEFT=!LEFT!%SPC%" & set "LEFT=!LEFT:~,%_PADL%!"
    if %_PADM% gtr 0 set "MID=!MID!%SPC%" & set "MID=!MID:~,%_PADM%!"
    if %_PADR% gtr 0 set "RIGHT=!RIGHT!%SPC%" & set "RIGHT=!RIGHT:~,%_PADR%!"
    echo(!LEFT!!MID!!RIGHT!
    endlocal
)

rem // Restore original code page:
> nul chcp %$CP%

endlocal
exit /B

Given the script is saved as rearrange-data.bat and the input file is called 1.csv, run the script as follows:

rearrange-data.bat "1.csv"

To write the output data to a file called 2.csv rather than to display it in the console, use this command line:

rearrange-data.bat "1.csv" > "2.csv"

Upvotes: 0

dbenham
dbenham

Reputation: 130819

This is a great opportunity to use my JREPL.BAT regular expression command line utility. It is pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward. And it is much faster than any "pure" batch solution.

jrepl "[^;]*;([^;]*);([^;]*);(?:[^;]*;){3}([^;]*);" "$txt=rpad($1,24)+rpad($2,13)+$3" /jmatchq /f input.csv /o output.txt

The first 3 lines do not match the search expression, so the /JMATCHQ option excludes them from the output. But if you did happen to have leading lines that matched, but wanted to exclude them, then you can explicitly exclude them with the /EXC option.

jrepl "[^;]*;([^;]*);([^;]*);(?:[^;]*;){3}([^;]*);" "$txt=rpad($1,24)+rpad($2,13)+$3" /jmatchq /exc 1:3 /f input.csv /o output.txt

If I were forced to solve this with pure batch, then I would use

@echo off
setlocal enableDelayedExpansion
set "sp=                                                        "
>"output.txt" (
  for /f "usebackq skip=3 tokens=2,3,7 delims=;" %%A in ("input.csv") do (
    set "C1=%%A%sp%"
    set "C2=%%B%sp%"
    echo !C1:~0,24!!C2:~0,13!%%C
  )
)

But the above code has the following restrictions:

  • There must not be any ! within the data. This is easily solved by toggling delayed expansion on and off within the loop.
  • There must not be any empty columns, except for possibly the last column. This is extremely difficult to solve with pure batch.

Other restrictions that could come up if you try to generalize this batch technique:

  • Each input line must be <= 8191 bytes long. Not really solvable with pure batch.
  • You cannot access columns 32 or beyond. This can be solved using pure batch with a lot of obtuse, advanced code.
  • Quoted column values must not contain the ; delimiter. This is extremely difficult to solve with pure batch.

Upvotes: 0

Related Questions