user1622089

Reputation: 17

Parsing first row of csv into sql table using batch file

I have some CSV files. For each one I need to open the file, read its first line, turn that line into the column definitions of a temporary SQL table, and then load the data into that table, as follows:

Read the lines of the CSV and for each line:
Break it into fields, create one temporary SQL table, and insert those fields into a row of the database table.

I tried something like the script shown further down.

The script is divided into 4 parts: file initialization, file creation, processing, and copying data. Everything works fine, except that fil.sql ends up containing:

                       CREATE TEMP TABLE temtab(
                        firstcolumn character varying (255),
                         secondcolumn character varying (255),
                            lastcolumn character varying (255),
                            );
                         \COPY temtab from bio.csv WITH DELIMITER ; csv HEADER

What I want instead is no comma after the last column:

                       CREATE TEMP TABLE temtab (
                       firstcolumn character varying (255),
                       secondcolumn character varying (255),
                         lastcolumn character varying (255)
                           );
                   \COPY temtab from bio.csv WITH DELIMITER ; csv HEADER







            @echo off
            ::setlocal enabledelayedexpansion
            REM Assigning dir to current directory
              SET dir=%CD%
              REM Defining database name
               SET dbname=****
               REM Defining Host name
                 SET host=****
               REM Defining user
                 SET user=****
                 REM Defining Port
         SET port=****
               REM SQL file where query is to be executed
                 SET sqfile=fil.sql

           SET fi=bio.csv
                 call:fileinitialization

                call:filecreation 

                call:proces

                  call:copydata
            goto:eof

         :fileinitialization
          REM Assigning name of temporary table 
                  SET tabnam=temtab
             REM  Setting delimiter to variable delim
             SET delim=;
        REM Declaring variable numfields to store index of variable names array  
    set numFields=0
    echo para setted
    set fi=bio.csv
    SET tex=text
    SET com=,
                GOTO:EOF

           :filecreation 
          REM Setting create temporary table command with table name tabnam
             SET creat=CREATE TEMP TABLE %tabnam%
             echo %creat%

                     GOTO:EOF 

                :proces
                REM Executing loop for each file in current directory
          echo %creat%>fil.sql
        REM Read the lines of the CSV file
        For /F  "eol==" %%A in (bio.csv) Do ( set "line=%%A" 


                REM check if index of array is 0
                     if !numFields! equ 0  (
                      REM First line, Store in array name
                         for %%B in (!line: ^=!) do (
                echo %%B character varying (255^),>>fil.sql   

                        set /A numFields+=1
                    set name[!numFields!]=%%B
                    ) ) )





            GOTO:EOF

             :copydata
           echo \COPY %tabnam% from %fi% WITH DELIMITER %delim% csv HEADER
            echo \COPY %tabnam% from %fi% WITH DELIMITER %delim% csv HEADER;>>fil.sql
                  GOTO:EOF  
              ::endlocal
                     Pause

Upvotes: 1

Views: 3534

Answers (1)

Aacini

Reputation: 67216

Although I don't know the format of SQL tables, I can show you how to read a CSV file. The Batch file below reads all lines from the file. It first takes the field names from the first line (the CSV header) and creates an array of variable names (eliminating possible spaces in field names); then it reads the rest of the lines and assigns each field value to its corresponding Batch variable.

ProcessCSV.BAT:

@echo off
rem General-purpose CSV file reader program
rem Antonio Perez Ayala

setlocal EnableDelayedExpansion
set numFields=0
rem Read the lines of the CSV file
for /F "delims=" %%a in (CSVfile.csv) do (
   set "line=%%a"
   if !numFields! equ 0 (
      rem It is the first line: break it into an array of field names (removing spaces)
      for %%b in (!line: ^=!) do (
         set /A numFields+=1
         set name[!numFields!]=%%b
      )
   ) else (
      rem Replace spaces by Ascii-128 (to avoid split values that may have spaces)
      set "line=!line: =Ç!"
      rem Insert any char. at beginning of each field, and separate fields with spaces
      set i=0
      for %%b in (X!line:^,^= X!) do (
         set "field=%%b"
         rem Recover spaces in this field, if any
         set "field=!field:Ç= !"
         rem And assign it to corresponding variable (removing first character)
         set /A i+=1
         for %%i in (!i!) do set "!name[%%i]!=!field:~1!"
      )
      rem At this point all variables have the values of current record.
      rem They may be accessed explicitly (ie, from example CSVfile.csv):
      echo/
      echo Record of !FirstName! !LastName!
      rem ... or implicitly via the NAME array:
      for /L %%i in (3,1,!numFields!) do (
         for %%b in (!name[%%i]!) do echo    %%b: !%%b!
      )
   )
)

CSVfile.csv:

First Name,Last Name,Address,Postal Code,Company,Departament,Floor,Phone,Mobile
John,Smith,123 Fake Street,45612,SomeCo,Accounting,4,123-555-5555,123-555-5556
Jane,Doe,123 Fake Street,,SomeCo,,4,123-555-5555,123-555-5556

output:

Record of John Smith
   Address: 123 Fake Street
   PostalCode: 45612
   Company: SomeCo
   Departament: Accounting
   Floor: 4
   Phone: 123-555-5555
   Mobile: 123-555-5556

Record of Jane Doe
   Address: 123 Fake Street
   PostalCode:
   Company: SomeCo
   Departament:
   Floor: 4
   Phone: 123-555-5555
   Mobile: 123-555-5556

Please be aware that this program uses several advanced Batch techniques. I suggest that you get help on every command you don't completely understand (e.g. SET /?) and read it carefully. If you still have questions about this program after that, just post them as an edit to your original question.

The most complex part of this program is the one that assigns empty strings to variables when the corresponding field is empty (two commas side by side); if the file has no empty fields, the program can be somewhat simpler. Also, this program (like most Batch solutions) may give erroneous results if certain special Batch characters, such as !, appear in the file. Most of these characters can be handled, if required, with certain modifications to the program.
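To illustrate why the extra leading character matters: a FOR set treats a run of consecutive commas as a single separator, so an empty field would simply vanish. The following is a minimal sketch of the same trick in isolation, using a made-up record without spaces (the sample data and the bracketed output format are assumptions for illustration only):

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical record with two empty fields and no spaces
set "line=Jane,Doe,,SomeCo,,4"
set i=0
rem Each comma is replaced by a space plus X, so every field starts with X
for %%b in (X!line:^,^= X!) do (
   set "field=%%b"
   set /A i+=1
   rem Drop the leading X to recover the real value, which may be empty
   echo Field !i!: [!field:~1!]
)
```

With this input the loop should run six times, and the third and fifth fields should show up as empty brackets instead of being skipped.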

EDIT: Modified version for when no empty fields exist

@echo off
rem CSV file reader program when no empty fields exist
rem Antonio Perez Ayala

setlocal EnableDelayedExpansion
set numFields=0
rem Read the lines of the CSV file
for /F "delims=" %%a in (CSVfile.csv) do (
   set "line=%%a"
   if !numFields! equ 0 (
      rem It is the first line: break it into an array of field names (removing spaces)
      for %%b in (!line: ^=!) do (
         set /A numFields+=1
         set name[!numFields!]=%%b
      )
   ) else (
      rem Replace spaces by Ascii-128 (to avoid split values that may have spaces)
      set "line=!line: =Ç!"
      rem Separate fields (using comma as standard Batch separator)
      set i=0
      for %%b in (!line!) do (
         set "field=%%b"
         rem Assign this field to corresponding variable, recovering spaces
         set /A i+=1
         for %%i in (!i!) do set "!name[%%i]!=!field:Ç= !"
      )
      rem At this point all variables have the values of current record.
      rem They may be accessed explicitly (ie, from example CSVfile.csv):
      echo/
      echo Record of !FirstName! !LastName!
      rem ... or implicitly via the NAME array:
      for /L %%i in (3,1,!numFields!) do (
         for %%b in (!name[%%i]!) do echo    %%b: !%%b!
      )
   )
)

Please note that the standard separators in FOR sets are comma, semicolon and equal-sign, besides spaces:

for %a in (one two,three;four=five) do echo %a

The previous program replaces spaces with another character and uses commas to separate fields. However, if a line may contain semicolons or equal-signs, the fields will be split at those points; in that case these characters must be replaced with other ones before the FOR and restored afterwards, in the same way as the space.
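As a rough sketch of that idea for semicolons, assuming the character # never occurs in the data (both the placeholder and the sample line are hypothetical):

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical line whose second field contains a semicolon
set "line=John,a;b,Smith"
rem Protect semicolons with a placeholder before the FOR splits the line
set "line=!line:;=#!"
for %%b in (!line!) do (
   set "field=%%b"
   rem Restore the semicolons in this field
   echo [!field:#=;!]
)
```

Note that, as far as I know, the !var:old=new! syntax cannot take = as the text to search for, so equal-signs would need a different approach than this simple replacement.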

EDIT: Modifications for new request (eliminate last comma)

Eliminating the last comma is not trivial, although it is not too complex either. I hope my method is easy to understand; it relies on the fact that the SET /P command shows its text (the input prompt) with NO new line at the end. Note that the format is SET /P =text>>out<NUL. The <NUL part is needed so that SET /P does NOT wait for input; don't leave spaces before the < (nor before the >>). However, I think this behaviour may NOT work the same way in Windows Vista and later versions. If the method doesn't work for you, then it must be modified again...
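A minimal sketch of that behaviour on its own, assuming a scratch file named out.txt (a hypothetical name chosen for this demonstration):

```batch
@echo off
rem Write "abc" with no new line at the end; NUL keeps SET /P from waiting
set /P =abc>out.txt<NUL
rem ECHO appends its text and then ends the line
echo def>>out.txt
type out.txt
```

If SET /P behaves as described on your Windows version, both pieces end up on the same line of out.txt.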

I also went ahead and included some remarks about the parts that I think are still missing in your code, namely the processing of several files.

:proces
REM Executing loop for each file in current directory
REM This may be done with a FOR loop:
::for %%F in (*.csv) do (
    REM The file name is given by %%F. In this case, the fileinitialization part
    REM must be done here, for example:
    set numFields=0
    echo %creat%>fil.sql
    REM Read the lines of the CSV file
    For /F "eol==" %%A in (bio.csv) Do ( 
        set "line=%%A" 
        REM check if index of array is 0
        if !numFields! equ 0  (
            REM First line, Store in array name
            for %%B in (!line: ^=!) do (
                REM Note that I changed the place of the ECHO command
                set /A numFields+=1
                set name[!numFields!]=%%B
                if !numFields! equ 1 (
                    REM First field: show it with NO comma and NO NEW LINE
                    set /P =%%B (text^)>>%sqfile%<NUL
                ) else (
                    REM Next fields: complete the comma of previous field, WITH NEW LINE
                    echo ,>>%sqfile%
                    REM ... and show this field with NO comma and NO NEW LINE (again)
                    set /P =%%B (text^)>>%sqfile%<NUL
                )
            )
            REM Insert the new line of last field (that have NOT comma :-)
            echo/>>%sqfile%
        )
    )
::)
GOTO:EOF

:copydata

I strongly encourage you to keep my previous format: indent by 4 columns inside each block of code enclosed in parentheses, and place each closing parenthesis in the same column as the opening command, FOR or IF. This format will help you to easily locate errors caused by mismatched parentheses in large programs.
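If the SET /P method does not work on a given system, one possible alternative (a different technique from the one used in :proces above) is to hold each field back until the next one arrives, so that only fields followed by another field get a comma. This is a sketch under stated assumptions: the header line is hard-coded instead of being read from bio.csv, and demo.sql is a hypothetical output name.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical header line standing in for the first row of bio.csv
set "line=first column;second column;last column"
rem demo.sql is an assumed output name used only for this sketch
echo CREATE TEMP TABLE temtab (>demo.sql
set "prev="
for %%B in (!line: ^=!) do (
    rem A further field exists, so the pending one gets its comma now
    if defined prev echo !prev! character varying (255^),>>demo.sql
    set "prev=%%B"
)
rem The last pending field is written without a comma
echo !prev! character varying (255^)>>demo.sql
echo ^);>>demo.sql
type demo.sql
```

The design choice here is that the comma decision is postponed by one iteration, so the loop never needs to know in advance which field is the last one.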

Antonio

Upvotes: 1
