station

Reputation: 7145

How to write a batch script in Windows to loop over files, find a pattern, and replace it

I have to write a batch script that loops over files and replaces part of their content. Here is some sample data from one of the files.

1068        1181408                    META       METADATA   20150618201505211
20400693                                                                                                                                                                                                                                                                                                                                    400693    
30H13UC          23       00
4010 618114915
4020 3
4030 0455
4040 400
4050 0029
4070 ROck
4080 XX SMALL
4090 Worley Stone

Now I need to find the number starting with 20 and replace the digits from the 3rd position onward with 10101.

E.g., in this file the first number starting with 20 is on the 2nd line, right after the line beginning with 1068.

20400693 -> 2010101

The same replacement is also needed at the 340th position in the same line, where the number is 400693:

400693 -> 10101

This pattern may or may not occur multiple times in the same file.

Now I can loop over the files like this:

for /r %i in (*)

But how do I write the replacement part?

Upvotes: 0

Views: 1656

Answers (4)

dbenham

Reputation: 130819

Your spec is a bit imprecise: the position of the 40 string is not as stated, and you don't say whether the spacing of the replacement line matters.

Given the tags on your question, I think you will be interested in my JREPL.BAT regular expression text processing utility. It is pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward.

This first solution simply replaces the digits following 20 and 40 with the new string, disregarding original string length. So the position of the 40 string may change (does change in your example).

@echo off
for /r %%F in (*) do call jrepl "^(1068 .*\n20)\d+( +40)\d+ *$" "$110101$210101" /m /f "%%F" /o -
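
To see what that search pattern captures, here is a minimal sketch of the same regex in plain JavaScript. It is not part of JREPL: `replaceLoose` and the shortened sample text are made up for illustration, and the `g` flag is an assumption so that every occurrence in a file is handled. A callback is used for the replacement so there is no doubt about where `$1` ends and the literal `10101` begins.

```javascript
// Sketch only: mirrors the JREPL search pattern as a JavaScript regex.
// "m" stands in for JREPL's /m option; "g" is assumed so all matches are replaced.
var LOOSE_PATTERN = /^(1068 .*\n20)\d+( +40)\d+ *$/gm;

// replaceLoose is a hypothetical helper, not part of JREPL.
function replaceLoose(text) {
  return text.replace(LOOSE_PATTERN, function (match, head, gap) {
    // head = the 1068 line plus the leading "20"; gap = the spaces plus "40"
    return head + "10101" + gap + "10101";
  });
}

// Shortened, made-up sample with the same shape as the data in the question.
var sample = "1068  1181408  META\n20400693     400693    \n30H13UC";
var patched = replaceLoose(sample);
// The new digits are one character shorter, so the 40 field moves one column left.
```

Because the trailing spaces are matched outside the capture groups, they are dropped from the rewritten line.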

Here is a more complicated solution that preserves the position of the 40 string (position 332 in your example)

@echo off
for /r %%F in (*) do call jrepl "^(1068 .*\n20)(\d+ +)40\d+ *$" "$1+'10101'+Array($2.length-5+1).join(' ')+4010101" /m /j /f "%%F" /o -
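
The padding arithmetic in that /j expression can be sketched in plain JavaScript as follows. `replaceKeepingColumn` and the sample are hypothetical, and the sketch assumes the old field (digits plus following spaces) is at least as wide as the 5-character replacement; otherwise the `Array` call would fail.

```javascript
// Sketch only: same idea as the /j replacement expression, as a callback.
// replaceKeepingColumn is a hypothetical helper, not part of JREPL.
function replaceKeepingColumn(text) {
  var pattern = /^(1068 .*\n20)(\d+ +)40\d+ *$/gm;
  return text.replace(pattern, function (match, head, oldField) {
    var newDigits = "10101";
    // Array(n + 1).join(" ") yields n spaces; n is the width the new digits lack.
    var padding = new Array(oldField.length - newDigits.length + 1).join(" ");
    return head + newDigits + padding + "4010101";
  });
}

// Made-up sample: the second number starts at offset 11 of the second line.
var keepSample = "1068 X\n20400693   400693  ";
var keepResult = replaceKeepingColumn(keepSample);
```

The new digits plus padding occupy exactly the width of the old field, so the 40 string stays at the same offset.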

This final solution assumes the line is formatted with fixed width, and both the 20 and 40 numbers have maximum length of 10. This solution preserves both the position of the numbers, and the total length of the line:

@echo off
for /r %%F in (*) do call jrepl "^(1068 .*\n20)\d+ *( {322}40)\d+ *$" "$110101   $210101   " /m /f "%%F" /o -

Upvotes: 1

Aacini

Reputation: 67216

The method below assumes that there are no empty lines in the files. This limitation can be fixed if needed.

@echo off
setlocal EnableDelayedExpansion

rem Set working values
set "find=20"
set "replace=10101"

rem Process all files in current folder and below it
for /F "delims=" %%a in ('dir /A-D /S /B *.*') do (

   rem Read this file via redirected input
   rem and create a .tmp extension copy of it via redirected output
   < "%%a" (

      rem Read the first line
      set /P "line="
      set lastLine=1

      rem Find the number of the lines that start with "20"
      for /F "tokens=1,2 delims=: " %%b in ('findstr /N "^%find%" "%%a"') do (

         rem Copy the lines before this one
         set /A lines=%%b-lastLine, lastLine=%%b
         for /L %%i in (1,1,!lines!) do set /P "line=!line!" & echo/

         rem Process this line as desired:
         rem Get the first token in this line
         set "token=%%c"
         rem Get the pattern to replace removing "20" from beginning of the token
         rem and replace it in the entire line
         for /F %%d in ("!token:*%find%=!") do set "line=!line:%%d=%replace%!"

      )

      rem Copy the last replaced line
      echo !line!

      rem Copy the rest of lines after the last replaced one
      findstr "^"

   ) > "%%~Na.tmp"

   rem Replace the original file by the processed one
   move /Y "%%~Na.tmp" "%%a" > NUL

)
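
The per-line step of that script (take the first token, strip the leading "20", then replace what remains everywhere in the line) can be sketched in JavaScript like this. It is an illustration rather than a translation of the batch code; `replaceTokenEverywhere` is a hypothetical name, and unlike the batch `!line:old=new!` substitution this sketch is case-sensitive.

```javascript
// Sketch only: replace the digits that follow the prefix wherever they occur in the line.
// replaceTokenEverywhere is a hypothetical helper name.
function replaceTokenEverywhere(line, find, replacement) {
  if (line.indexOf(find) !== 0) {
    return line;                              // line does not start with the prefix
  }
  var token = line.split(" ")[0];             // first space-delimited token, e.g. "20400693"
  var target = token.substring(find.length);  // token without the prefix, e.g. "400693"
  if (target === "") {
    return line;                              // nothing follows the prefix
  }
  return line.split(target).join(replacement); // replace every occurrence
}

var tokenResult = replaceTokenEverywhere("20400693     400693   ", "20", "10101");
var untouched = replaceTokenEverywhere("4010 618114915", "20", "10101");
```

Note that this replaces the second number with 10101, as the question asks, and does not pad the line back to its original width.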

Upvotes: 0

Magoo

Reputation: 80023

@ECHO OFF
SETLOCAL
:: The directory to look for data files and to place processed files
SET "sourcedir=U:\sourcedir\t w o"
SET "destdir=U:\destdir"
:: the start of the line, and length-to-match
SET "replaceinlines=20"
SET /a lengthofmatch=2
:: Replacement text, length-to-replace, column-for secondary-replacement
SET "replaceby=10101"
SET /a replacelength=6
SET /a replacecolumn=332
:: Replace-only-if-match ?
SET "replaceifmatch=Y"
:: calculate length of second-segment-to-preserve and its start-position
SET /a seg2start=replacelength+lengthofmatch
SET /a seg2=replacecolumn-seg2start
SET /a seg3start=replacecolumn+replacelength
::
FOR /f "tokens=1*delims=" %%a IN (
  'dir /b /a-d "%sourcedir%\*" '
  ) DO (
 FOR /f "usebackqdelims=" %%x IN ("%sourcedir%\%%a") DO SET "line=%%x"&call:process
) >"%destdir%\%%a"

GOTO :EOF

:process
:: does the start-of-line match?
CALL SET "startofline=%%line:~0,%lengthofmatch%%%"
IF "%startofline%" neq "%replaceinlines%" GOTO report
:: matched start-of-line; pick up data-to-replace
CALL SET "data1=%%line:~%lengthofmatch%,%replacelength%%%"
CALL SET "data2=%%line:~%replacecolumn%,%replacelength%%%"
::
:: Not sure about this - replace-both-regardless or replace-if-data-matches
::
IF "%replaceifmatch%"=="Y" IF "%data1%" neq "%data2%" GOTO report
CALL SET "line=%startofline%%replaceby%%%line:~%seg2start%,%seg2%%%%replaceby%%%line:~%seg3start%%%"

:report
ECHO(%line%
GOTO :eof

You would need to change the settings of sourcedir and destdir to suit your circumstances. The script produces a new file with the same filename as the source in the destination directory. U: is my test drive.

Patching your supplied data yielded the target 400693 at column 332, not 340 as claimed.

The pattern to match at the start of the lines is placed in replaceinlines and its length in lengthofmatch.

The length of the text to be replaced is 6 (replacelength), but your replacement string (replaceby) has length 5??

I look at the line as having 4 segments: the first is the 20 and the following 6 characters; the second is the space between that and the second 'to-be-replaced' string; the third is that second string itself; the last (which I named seg3 but should be seg4) is the part that follows the second 'to-be-replaced' string.

You don't say whether the replacement is to take place only if the two 'to-be-replaced' strings match or regardless, so I supplied a switch replaceifmatch - Y means "if the two match, replace both". Setting replaceifmatch to something else will replace regardless.

Beyond that, it's a simple matter of calculating the column-positions and lengths from the data provided and using call set to apply the calculated values to the strings of interest.
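
As a rough illustration of that column arithmetic, here is the same segment slicing sketched in JavaScript. `patchFixedColumns` and its option names are hypothetical, the sample line is made up with the second number at offset 12 instead of 332, and `column` is treated as a zero-based offset in the same way as the batch `%line:~n,m%` substring syntax.

```javascript
// Sketch only: rebuild the line from its segments around two fixed-position fields.
// patchFixedColumns and the option names are hypothetical.
function patchFixedColumns(line, opts) {
  var prefixLen = opts.prefix.length;
  if (line.substring(0, prefixLen) !== opts.prefix) {
    return line;                                   // start of line does not match
  }
  var seg2start = prefixLen + opts.replaceLength;   // first char after the first field
  var seg3start = opts.column + opts.replaceLength; // first char after the second field
  var data1 = line.substring(prefixLen, seg2start);
  var data2 = line.substring(opts.column, seg3start);
  if (opts.onlyIfMatch && data1 !== data2) {
    return line;                                   // the two fields differ, leave as-is
  }
  return opts.prefix + opts.replacement +
         line.substring(seg2start, opts.column) +  // gap between the two fields
         opts.replacement +
         line.substring(seg3start);                // remainder of the line
}

var settings = { prefix: "20", replaceLength: 6, column: 12, replacement: "10101", onlyIfMatch: true };
var fixedResult = patchFixedColumns("20400693    400693  ", settings);
var fixedSkipped = patchFixedColumns("20400693    999999  ", settings);
```

Since the replacement is one character shorter than the text it replaces, everything after each replaced field shifts left by one character.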

Upvotes: 1

Wiktor Stribiżew

Reputation: 626802

You can use Windows Script Host to get what you want.

Create a file called, say, "1.wsf", and copy/paste the following:

<job>
    <script language="JavaScript">
        var fso = new ActiveXObject("Scripting.FileSystemObject");  
        var files = new Enumerator(fso.getFolder(".").files);
        var count = 0;
        for (; !files.atEnd(); files.moveNext())
        {
            var file = ""+files.item(); // make it string
            if (!file.match(/.*\.txt$/i))
                { continue; } // skip anything that is not a .txt file (including this script)
            //WScript.echo("Replacing in " + file);
            var f1 = fso.OpenTextFile(file, 1);
            var text = f1.ReadAll();
            f1.close();
            var lines = text.split("\r\n");
            for (var i = 0; i < lines.length; i++)
            {
              var m = lines[i].match(/^20(\d+)/);
              if (m)
              {
                lines[i] = lines[i].replace(new RegExp(m[1], "g"), '10101');
                //WScript.echo("Replaced in " + lines[i]);
              }
            }
            var f2 = fso.OpenTextFile(file, 2);
            f2.Write(lines.join("\r\n"));
            f2.close();
            count++; // one more file rewritten
        }
        WScript.echo("Replaced "+count+" files");
    </script>
</job>

Then copy this file into the folder with the TXT files and run it. It processes each TXT file: if a line starts with 20, the adjoining digits are captured into Group 1, and every occurrence of that digit sequence on the line is replaced with 10101.

Then, the file is re-written with the updated contents.

Upvotes: 1
