Goks

Reputation: 35

How to identify newline characters when a file contains both CR and CRLF line endings

new line char and end of line both present

I need to identify the newline characters, if any, using PowerShell or a batch file and, if present, remove them.

Upvotes: 0

Views: 173

Answers (2)

mklement0

Reputation: 440297

In a comment you state:

each record starts with DTL

It sounds like the way to fix your file is to remove any newlines that aren't followed by verbatim DTL| (the code handles both CRLF and LF-only newlines):

# Create sample file.
@'
DTL|foo1
DTL|foo2
result of an unwanted
newline or two
DTL|foo3
'@ > test.txt

# Replace all newlines not directly followed by verbatim 'DTL|' 
# with a space (remove `, ' '` if you simply want to remove the newlines).
# Pipe to Set-Content -NoNewLine in order to save to a file as needed.
(Get-Content -Raw test.txt) -replace '\r?\n(?!DTL\||\z)', ' '

Output:

DTL|foo1
DTL|foo2 result of an unwanted newline or two
DTL|foo3 

For an explanation of the regex used with the -replace operator above and the ability to experiment with it, see this regex101.com page.
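
Since the question also asks how to identify which newline forms are present, here is a minimal sketch that counts them before you decide what to replace. The sample string and the variable names ($text, $crlf, $loneLf, $loneCr) are made up for illustration; for a real file you would read the content with Get-Content -Raw as above.

```powershell
# Hypothetical sample: CRLF-terminated records with one stray LF-only newline.
# For a real file: $text = Get-Content -Raw test.txt
$text = "DTL|foo1`r`nDTL|foo2`nresult of an unwanted`r`nDTL|foo3`r`n"

# CR immediately followed by LF.
$crlf = [regex]::Matches($text, '\r\n').Count

# LF that is NOT preceded by CR (negative lookbehind).
$loneLf = [regex]::Matches($text, '(?<!\r)\n').Count

# CR that is NOT followed by LF (negative lookahead).
$loneCr = [regex]::Matches($text, '\r(?!\n)').Count

"CRLF: $crlf, lone LF: $loneLf, lone CR: $loneCr"
# prints: CRLF: 3, lone LF: 1, lone CR: 0
```

If all three counts are non-zero the file mixes every form, and a pattern such as \r?\n will not catch the lone CRs, so you may need to widen it.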

Upvotes: 1

Aacini

Reputation: 67256

I am afraid I don't really understand what you want. You didn't post an input file, nor did you specify what output you want from such an input. Anyway, I hope this code can help:

@echo off
setlocal EnableDelayedExpansion

rem Create a test file
set LF=^
%don't remove%
%these lines%

(
echo Line One: CR+LF
set /P "=Line Two: LF!LF!"
echo Line Three: CR+LF
) > test.txt < NUL

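rem Read the file: findstr /O prefixes each line with its byte offset.
rem A line ended in CR+LF if its length equals next offset minus own offset minus 2.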
set "acum=0"
(for /F "tokens=1* delims=:" %%a in ('findstr /O "^" test.txt') do (
   if not defined line (
      set "line=%%b"
   ) else (
      set /A "len=%%a-acum-2, acum=%%a"
      for %%n in (!len!) do if "!line:~%%n!" equ "" (
         echo !line!
      ) else (
         set /P "=!line!"
      )
      set "line=%%b"
   )
)) < NUL
rem Last line: use the file size in place of the next offset
for %%a in (test.txt) do set /A "len=%%~Za-acum-2"
(for %%n in (!len!) do if "!line:~%%n!" equ "" (
   echo !line!
) else (
   set /P "=!line!"
)) < NUL

Output:

Line One: CR+LF
Line Two: LFLine Three: CR+LF

This example first creates a file with three lines, the second of which ends in LF instead of CR+LF. The program then identifies how each line ends and removes the lone LFs.

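The method relies on the findstr /O switch, which reports the offset of the first byte of each line, counted from the beginning of the file. If the gap between two consecutive offsets, minus 2 for the CR+LF pair, equals the length of the line, the line ended in CR+LF; otherwise it ended in a lone LF.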
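
To make the offset arithmetic easier to follow, here is a rough PowerShell sketch of the same check. It is not the batch code above, only an illustration: the regex stands in for findstr /O, the names $lines, $expected and $result are hypothetical, and it assumes single-byte text so that character offsets equal byte offsets.

```powershell
# Same three-line sample as the batch script: line two ends in a lone LF.
$text = "Line One: CR+LF`r`nLine Two: LF`nLine Three: CR+LF`r`n"

# For each line, record where it starts and its text without the terminator.
$lines = foreach ($m in [regex]::Matches($text, '([^\r\n]*)\r?\n')) {
    [pscustomobject]@{ Offset = $m.Index; Text = $m.Groups[1].Value }
}

$result = for ($i = 0; $i -lt $lines.Count; $i++) {
    # Offset of the next line, or the total length for the last line.
    $next = if ($i + 1 -lt $lines.Count) { $lines[$i + 1].Offset } else { $text.Length }

    # Length the line would have if it were terminated by CR+LF (2 bytes).
    $expected = $next - $lines[$i].Offset - 2

    $ending = if ($lines[$i].Text.Length -eq $expected) { 'CRLF' } else { 'LF' }
    [pscustomobject]@{ Offset = $lines[$i].Offset; Ending = $ending }
}

$result
```

With this sample the offsets are 0, 17 and 30, and the second line is the only one classified as LF, because its text is one character longer than the gap allows for a CR+LF terminator.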

Upvotes: 0
