Syl33

Reputation: 97

GnuWin32 sed for replacing CRLF with ";" from a Windows batch file

I have this input with CRLF line endings:

1019
1020
1028
1021

I want to remove the CRLF at the end of each line using sed (or awk) from GnuWin32 in a Windows 10 batch script (not PowerShell).

I want to get the following result inside a text file, without any semicolon or CRLF at the end:

1019;1020;1028;1021

It doesn't work with the following lines in the batch file (GnuWin32 sed seems to add a new CRLF at the end of each processed line):

REM This generates the input example:
(echo 1019& echo 1020& echo 1028& echo 1021) > test_in.txt

REM This is the first try at getting the desired one-line output with semicolons:
(echo 1019& echo 1020& echo 1028& echo 1021) | .\GnuWin32\bin\sed -e "s/ *$//g" | .\GnuWin32\bin\sed -e "s/\r\n/;/" > test_out.txt

REM This is the second try at getting the desired one-line output with semicolons:
REM (echo 1019& echo 1020& echo 1028& echo 1021) | .\GnuWin32\bin\sed -e "s/ *$//g" | .\GnuWin32\bin\sed -b -e "s/\x0d\x0a/;/g" > test_out.txt

REM This is the third try at getting the desired one-line output with semicolons:
REM (echo 1019& echo 1020& echo 1028& echo 1021) | .\GnuWin32\bin\sed -e "s/ *$//g" | .\GnuWin32\bin\awk "{gsub(\"\\\\r\\\\n\",\";\")};1" > test_out.txt

REM This is the fourth try at getting the desired one-line output with semicolons:
REM (echo 1019& echo 1020& echo 1028& echo 1021) | .\GnuWin32\bin\sed -e "s/ *$//g" | .\GnuWin32\bin\awk -v FS="\r\n" -v OFS=";" -v RS="\\$\\$\\$\\$" -v ORS="\r\n" "{$1=$1}1" > test_out.txt

Upvotes: 0

Views: 124

Answers (4)

Daweo

Reputation: 36520

If your sed executable supports the -z option

Treat the input as a set of lines, each terminated by a zero byte (the ASCII ‘NUL’ character) instead of a newline.

then you might leverage it to deal with \r\n. Be warned, though: if the input contains no zero byte, the whole file is processed at once, so you might run into issues with big files.

I suggest starting with

(echo 1019& echo 1020& echo 1028& echo 1021) | .\GnuWin32\bin\sed -z -e "s/\r\n/;/g"

I am unable to test that.
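For reference, here is a minimal sketch of the same idea on a Unix-like shell, assuming a GNU sed recent enough to have -z (as far as I know it arrived in 4.2.2, so an older GnuWin32 build may not accept it; sed --help will tell you). The second s command is my addition to drop the semicolon left after the last number, and the file name is just the one from the question.

```shell
# Build the CRLF sample from the question (printf emits literal \r\n pairs).
printf '1019\r\n1020\r\n1028\r\n1021\r\n' > test_in.txt

# -z: with no NUL byte in the input, the whole file is read as one "line",
# so the \r\n pairs are visible to the s command.
# The second expression removes the ';' left behind by the final CRLF.
result=$(sed -z -e 's/\r\n/;/g' -e 's/;$//' test_in.txt)
printf '%s\n' "$result"
```

From a batch file the single quotes would have to become double quotes, as in the command above.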

Upvotes: 0

phuclv

Reputation: 41814

One solution with GNU tr:

<file.txt tr -d '\r' | tr '\n' ';' | sed -E 's/;+$/\n/'

The last sed can be reduced to sed 's/;$/\n/' if there can't be multiple newlines at the end. Either way, this method may not work well for very large files. Another, probably better, solution:

paste -sd ';' file.txt
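One caveat with CRLF input: paste only joins on the line feed, so each carriage return would stay in front of its semicolon. A small sketch that strips them first, assuming GNU coreutils tr and paste are available (file.txt is the same placeholder name as above):

```shell
# Build the CRLF sample from the question.
printf '1019\r\n1020\r\n1028\r\n1021\r\n' > file.txt

# tr deletes every carriage return; paste -s then joins all lines
# with ';' and ends the output with a single newline.
result=$(tr -d '\r' < file.txt | paste -sd ';' -)
printf '%s\n' "$result"
```

The output still ends with one line feed; whether that matters depends on what reads the file next.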

But why not PowerShell? You just need (gc file.txt)-join';' or the full unaliased version, (Get-Content .\file.txt) -join ';'

Upvotes: 1

Stephan

Reputation: 56180

As you are on Windows, here is a pure batch solution:

@echo off
setlocal enabledelayedexpansion
del test_out.txt 2>nul
REM This generates the input example:
(echo 1019& echo 1020& echo 1028& echo 1021) > test_in.txt

set "delimiter="
(for /f %%a in (test_in.txt) do (
 <nul set /p "=!delimiter!%%a" & set "delimiter=;"
))>test_out.txt
REM when you need a CRLF at the end of the line:
echo/>>test_out.txt

This uses a trick to write without a line ending, <nul set /p "=string", and redirects the whole loop to the output file in one go. That way the disk is accessed only once instead of once per line, which makes it much faster on big input files (not noticeable with your mere ~100 lines, though).

Upvotes: 1

Ed Morton

Reputation: 203665

Using GNU awk in Unix would be (untested):

awk 'BEGIN{RS="\r\n"} {printf "%s%s", (NR>1 ? ";" : ""), $0}' file

How you call that on the command line from Windows, I don't know, but I expect it involves escaping the existing "s (and maybe also the $ and/or the \s?) and changing the 's to "s. Hopefully you know or can google it, since you're using that environment.
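One way to sidestep the quoting question is to keep the program in a file and run it with awk -f, so nothing in it passes through the command-line parser. A sketch on a Unix-like shell, assuming an awk that accepts a multi-character RS (GNU awk does); join.awk is a hypothetical file name, and I have not checked whether a Windows awk build translates line endings before RS sees them.

```shell
# Build the CRLF sample from the question.
printf '1019\r\n1020\r\n1028\r\n1021\r\n' > test_in.txt

# Hypothetical script file holding the same program as the one-liner above.
cat > join.awk <<'EOF'
BEGIN { RS = "\r\n" }                         # each CRLF ends a record
{ printf "%s%s", (NR > 1 ? ";" : ""), $0 }    # ';' before all but the first
EOF

result=$(awk -f join.awk test_in.txt)
printf '%s\n' "$result"
```

In a batch file the call would then be something like .\GnuWin32\bin\awk -f join.awk test_in.txt > test_out.txt, with no quotes or backslashes to escape.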

Upvotes: -1
