Reputation: 1833
I was trying to remove the erroneous newline characters generated on Windows.
$cat -e file.xml
foo^M$
bar$
$
hello world1$
hello world2$
Here there should be "foobar" with no newline in between, while all the other newlines should be retained. I know that within emacs I could replace "^M^J" with 'RET', but I have a huge file that I don't want to open, so I would like to convert it from the command line.
I tried dos2unix, but it only removed the "^M" part, still leaving a broken word/sentence. I also tried tr -d '\r' and sed 's:^M$::g' or sed 's:^M$\n:\n:g'; none of them worked. Does anyone have an idea how to do it correctly?
Upvotes: 1
Views: 1054
Reputation: 20980
Using awk:
$ cat -e so.txt
foo^M$
bar$
line2$
line3$
$ awk 1 RS=$'\r\n' ORS= so.txt
foobar
line2
line3
$ awk 1 RS=$'\r\n' ORS= so.txt | cat -e # Just for verification
foobar$
line2$
line3$
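This sets the record separator (RS) to \r\n and prints each record with an empty output record separator (ORS), so the two halves of the broken line are joined while the plain \n newlines inside the records pass through untouched. Note that a multi-character RS works in gawk and mawk; POSIX leaves it unspecified.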
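If your awk does not support an RS of more than one character, a line-by-line variant may be an option. The following is only a sketch, not part of the original answer: the file names so.txt and so.fixed.txt are placeholders, and it assumes every carriage return to be removed sits directly before a newline. It also streams the input, which may matter for a huge file.

```shell
# Build the sample file: "foo" ends in CRLF, the other lines in plain LF.
printf 'foo\r\nbar\nline2\nline3\n' > so.txt

# If a line ends in CR, strip the CR and print the line WITHOUT a newline,
# which joins it to the next line; otherwise print the line unchanged.
awk '{ if (sub(/\r$/, "")) printf "%s", $0; else print }' so.txt > so.fixed.txt

# Show line endings for verification.
cat -e so.fixed.txt
```

Here sub() returns the number of substitutions made, so the printf branch is taken only for lines that actually ended in \r.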
Upvotes: 1
Reputation: 104032
I have replicated your example file as:
$ cat -e so.txt
foo^M$
bar$
line2$
line3$
You can use Perl in 'slurp' mode (-0777 reads the whole file as a single string) to do:
$ perl -0777 -pe 's/\r\n//g' so.txt
foobar
line2
line3
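The problem with most line-oriented approaches is that the \r\n is consumed as the end of a line, so a substitution never gets to see the newline.
Perl keeps the line terminator in $_ when reading line by line, so you can do: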
$ perl -pe 's/\r\n//' /tmp/so.txt
foobar
line2
line3
as well...
Upvotes: 1
Reputation: 60017
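Perhaps the following will work (GNU sed; the loop first pulls the whole file into the pattern space, because sed otherwise strips each line's newline before matching, so a plain s/[\n\r]//g would only remove the \r):
sed -e ':a;N;$!ba;s/\r\n//g' old_file.txt > new_file.txt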
Upvotes: 0