Reputation: 3
I am trying to remove entries from a vertical report that looks like this.
report start : hi good morning
report (1234) hi
10/10/2013
line unequal
good morning hi good morning (123:)
20131212020202312312********
report start : hi good evening
report (1234) hi
10/10/2013
good evening hi good evening (123:)
20131212020202312312********
report start : hi good morning
report (1234) hi
10/10/2013
good evening hi good evening (123:)
20131212020202312312********
I am trying to remove complete entries where "evening" is present and "morning" is not. In short, the report should end up like this:
report start : hi good morning
report (1234) hi
10/10/2013
line unequal
good morning hi good morning (123:)
20131212020202312312********
report start : hi good morning
report (1234) hi
10/10/2013
good evening hi good evening (123:)
20131212020202312312********
I had thought about concatenating each entry into a single line ending with the series of asterisks, which are always the same length, and then using findstr to remove entries. But how do I reconstruct the entire report afterwards? It must return to a vertical format. To add to the complexity, the lines have various indentations in the txt file.
I have been unable to use "*" as a delimiter, and therefore cannot use a for /f loop to concatenate. This is how far I've gotten.
Thanks
Upvotes: 0
Views: 124
Reputation: 70923
One more option, in this case using an intermediate temporary file.
@echo off
setlocal enableextensions disabledelayedexpansion
:: configure and clean output/temporary files
set "inputFile=inputFile.txt"
set "outputFile=outputFile.txt"
set "tempFile=%temp%\%~nx0.tmp"
break>"%tempFile%"
break>"%outputFile%"
:: retrieve end of section lines
for /f "tokens=1 delims=:" %%a in ('findstr /n /l /e /c:"****" "%inputFile%"') do set "_sect.%%a=1"
:: extract each section and test for inclusion in output file
for /f "tokens=1,* delims=:" %%a in ('findstr /n "^" "%inputFile%"') do (
    echo(%%b>>"%tempFile%"
    if defined _sect.%%a (
        find /i "morning" "%tempFile%" >nul && ( type "%tempFile%">>"%outputFile%" )
        break>"%tempFile%"
    )
)
:: clean and exit
del /q "%tempFile%" 2>nul
endlocal
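For comparison, here is the same buffer-and-flush idea as a minimal Python sketch. It is not a translation of the batch file: it applies the rule as stated in the question (drop a section only when it contains "evening" and no "morning"), whereas the script above keeps any section that contains "morning". The function name `keep_sections` and the assumption that a section ends on a line whose text ends with "****" are mine.

```python
from typing import Iterable, Iterator, List


def keep_sections(lines: Iterable[str]) -> Iterator[str]:
    """Yield every line except those of sections with "evening" but no "morning"."""
    section: List[str] = []
    for line in lines:
        section.append(line)
        # Assumption: a section ends on a line whose text ends with "****".
        if line.rstrip().endswith("****"):
            has_evening = any("evening" in s.lower() for s in section)
            has_morning = any("morning" in s.lower() for s in section)
            if not (has_evening and not has_morning):
                # Lines are emitted untouched, so indentation is preserved.
                yield from section
            section = []
    # Trailing lines without a terminator are passed through unchanged.
    yield from section
```

Since the lines are never joined, the report stays in its vertical format. With an open file it could be used as `out.writelines(keep_sections(src))`.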
Upvotes: 0
Reputation: 130819
Regular expressions can be your friend :) A tool like awk or sed could work well - free Windows ports are available.
I have written REPL.BAT - a hybrid JScript/batch utility that performs a regex search and replace on stdin and writes the results to stdout. It is pure script that runs natively on any Windows machine from XP onward. Full documentation is embedded within the script.
Assuming REPL.BAT is in your current directory, or better yet, somewhere within your PATH, then all you need is the following:
type source.txt|repl "^report start :(?:[\s\S](?!morning))*?evening(?:[\s\S](?!morning))*?^\d*\*{8}\r?\n" "" m >output.txt
The above uses the M option to enable searches across multiple lines, which requires loading the entire source file into memory. That might become problematic with really large input files. But this is still better than a pure batch solution using FOR /F, since that command also buffers the entire source file in memory.
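If you want to try the multi-line regex idea outside of REPL.BAT, here is a rough Python sketch. It is a variant of the pattern above, not the same regex: the lookahead is placed before each consumed character, and the extra `\*{8}` alternative is my own addition so that a match cannot run past the end of a section. The names `SECTION_RE` and `drop_evening_only` are hypothetical.

```python
import re

# "report start :" ... "evening" ... terminator line, with no "morning"
# and no run of eight asterisks allowed anywhere in between.
SECTION_RE = re.compile(
    r"^report start :(?:(?!morning|\*{8})[\s\S])*?evening"
    r"(?:(?!morning|\*{8})[\s\S])*?^\d*\*{8}\r?\n",
    re.MULTILINE,  # lets ^ match at the start of every line
)


def drop_evening_only(text: str) -> str:
    """Remove every section that mentions "evening" but never "morning"."""
    return SECTION_RE.sub("", text)
```

As with the M option, the whole report is held in memory as one string. Like the original pattern, this assumes each section starts with "report start :" at the beginning of a line, so indented section headers would need a leading `[ \t]*`.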
Upvotes: 1
Reputation: 67216
@echo off
setlocal EnableDelayedExpansion
rem Buffer the lines of the current section in line[1..i]
set i=0
set "morning="
set "evening="
for /F "delims=" %%a in (test.txt) do (
    set /A i+=1
    set "line[!i!]=%%a"
    set "line=%%a"
    rem If removing the word changes the line, the word is present
    if "!line:morning=!" neq "%%a" set morning=present
    if "!line:evening=!" neq "%%a" set evening=present
    rem A line ending in asterisks closes the section
    if "!line:~-4!" equ "****" (
        set "remove="
        if defined evening if not defined morning set remove=true
        if not defined remove for /L %%i in (1,1,!i!) do echo !line[%%i]!
        set i=0
        set "morning="
        set "evening="
    )
)
Upvotes: 0
Reputation: 80023
@ECHO OFF
SETLOCAL
:: make a tempfile
:maketemp
SET "tempfile=%temp%\%random%"
IF EXIST "%tempfile%*" (GOTO maketemp) ELSE (ECHO.>"%tempfile%a")
:: Process file, count sections and record section numbers to remove
SET /a section=0
CALL :init
FOR /f "delims=" %%a IN (q22151608.txt) DO (
    ECHO %%a|FINDSTR "evening" >NUL
    IF NOT ERRORLEVEL 1 SET found1=Y
    ECHO %%a|FINDSTR "morning" >NUL
    IF NOT ERRORLEVEL 1 SET found2=Y
    ECHO %%a|FINDSTR /e "********" >NUL
    IF NOT ERRORLEVEL 1 CALL :endsection
)
:: Re-process file, count sections
SET /a section=0
CALL :init
(
    FOR /f "delims=" %%a IN (q22151608.txt) DO (
        IF NOT DEFINED found1 CALL :switch
        IF DEFINED found2 ECHO(%%a
        ECHO %%a|FINDSTR /e "********" >NUL
        IF NOT ERRORLEVEL 1 CALL :init
    )
)>newfile.txt
DEL "%tempfile%a"
GOTO :EOF
:switch
SET found1=Y
FIND "#%section%#" "%tempfile%a" >NUL
IF ERRORLEVEL 1 SET found2=Y
GOTO :eof
:endsection
IF DEFINED found1 IF NOT DEFINED found2 >>"%tempfile%a" ECHO(#%section%#
:init
SET "found1="
SET "found2="
SET /a section+=1
GOTO :eof
I used a file named q22151608.txt containing your data for my testing. Output is to file newfile.txt
Your output description does not fit with your problem definition. The "line unequal" line should not appear if I've interpreted your description correctly.
It is preferable to post real data suitably censored rather than artificial data. It's not clear where a section starts and ends. Even something as simple as changing the report number or timestamp would make the supplied data clearer.
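The two-pass structure of the script above (first record the numbers of the sections to remove, then re-read the file and skip those sections) can be sketched in Python roughly as follows. This is a simplified illustration, not a line-by-line port: a set stands in for the temporary file, the function names are hypothetical, and I am assuming that a section ends on a line ending with eight asterisks. The matching is case-sensitive, as FINDSTR is by default.

```python
from typing import List, Set


def is_section_end(line: str) -> bool:
    # Assumption: a section ends on a line ending with eight asterisks.
    return line.rstrip().endswith("********")


def sections_to_remove(lines: List[str]) -> Set[int]:
    """Pass 1: number the sections and record those with evening but no morning."""
    remove: Set[int] = set()
    section, evening, morning = 1, False, False
    for line in lines:
        evening = evening or "evening" in line
        morning = morning or "morning" in line
        if is_section_end(line):
            if evening and not morning:
                remove.add(section)
            section, evening, morning = section + 1, False, False
    return remove


def filter_lines(lines: List[str]) -> List[str]:
    """Pass 2: re-read the lines and keep those outside the recorded sections."""
    remove = sections_to_remove(lines)
    kept: List[str] = []
    section = 1
    for line in lines:
        if section not in remove:
            kept.append(line)
        if is_section_end(line):
            section += 1
    return kept
```

Only the section numbers are remembered between the passes, so no section ever has to be buffered as a whole.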
Upvotes: 1