Reputation: 21
I have files with every EOL imaginable. I want to normalize them all in one go instead of one by one, since there are a few thousand of them. I know how to fix them manually, so please don't explain that.
I think all possible ones are, from most common to least common: CRLF, LF, CR-CRLF, CRCR-CRLF, CR, LFLF, CRCR, CRLF-CRLF and CRCRCR-CRLF (yes, there is one file).
Each file uses one EOL consistently; no file mixes EOL styles. A few stray CR or LF characters might remain after fixing; those can be left alone.
I want all files to have just CRLF. Empty lines must remain intact.
First, I think I need a reliable way to detect which EOL each file uses. It could check that the sequence repeats at least 3 times, but some files have just one line.
Here are some scratch files I made; all of them should look like the CRLF one when done (the archive contains only TXT files): https://www71.zippyshare.com/v/BNpRAijy/file.html
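One way the detection idea could be sketched in Python (this is an illustration under assumptions, not a tested tool for the real files): collect every run of CR/LF characters in a file and pick the longest candidate EOL from the list above that every run is a whole multiple of. The names `CANDIDATES`, `detect_eol`, `normalize_eol` and `normalize_tree` are made up for this sketch. Two caveats: a file whose only line breaks come in pairs (say, an LF file where every line is followed by an empty line) cannot be told apart from an LFLF file and is treated as LFLF, and a file with stray CR or LF characters matches no candidate and is left untouched.

```python
import re
from pathlib import Path

# Candidate EOL units from the question, longest first, so that e.g.
# LFLF is preferred over LF when both would fit.
CANDIDATES = [
    b"\r\r\r\r\n",  # CRCRCR-CRLF
    b"\r\n\r\n",    # CRLF-CRLF
    b"\r\r\r\n",    # CRCR-CRLF
    b"\r\r\n",      # CR-CRLF
    b"\r\r",        # CRCR
    b"\n\n",        # LFLF
    b"\r\n",        # CRLF
    b"\r",          # CR
    b"\n",          # LF
]


def detect_eol(data: bytes):
    """Return the EOL unit that every CR/LF run is a multiple of, or None."""
    runs = re.findall(rb"[\r\n]+", data)
    if not runs:
        return None  # no line breaks at all
    for unit in CANDIDATES:
        # A run of k line breaks (k - 1 empty lines) must equal unit * k.
        if all(run == unit * (len(run) // len(unit)) for run in runs):
            return unit
    return None  # mixed or unknown EOLs: caller leaves the file alone


def normalize_eol(data: bytes) -> bytes:
    """Rewrite the detected EOL unit as CRLF, keeping empty lines."""
    unit = detect_eol(data)
    if unit is None or unit == b"\r\n":
        return data
    return data.replace(unit, b"\r\n")


def normalize_tree(root: str, pattern: str = "*.txt") -> None:
    """Normalize every matching file under root in place (back up first)."""
    for path in Path(root).rglob(pattern):
        data = path.read_bytes()
        fixed = normalize_eol(data)
        if fixed != data:
            path.write_bytes(fixed)
```

Calling `normalize_tree` on a copy of the folder would be the cautious way to try it; files are read and written as bytes so that no newline translation happens behind the scenes.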
I Googled for the whole day and didn't find any good solution.
Examples
1. just CRLF EOL, result I want from all:
line1CRLF
line2CRLF
CRLF
line3CRLF
line4CRLF
CRLF
CRLF
line5CRLF
CRLF
CRLF
CRLF
line6CRLF
CRLF
2. CRCRLF: Manually I would replace CRCRLF with CRLF (\r\r\n with \r\n), then repeat for the files with CRCRCRLF and once more for that lonely CRCRCRCRLF file. The problem is that not all files fall into this family; there are 5 more variants to consider, which I listed above. Plain LF and plain CR are less of a problem, since Windows Notepad now supports Unix and Mac EOLs, but it would still be nice to include them.
So the main problem remains LFLF, and then there are also those few CRCR and CRCR-CRLF files to consider. Ideally the solution would cover all possibilities.
line1CR
CRLF
line2CR
CRLF
CR
CRLF
line3CR
CRLF
line4CR
CRLF
CR
CRLF
CR
CRLF
line5CR
CRLF
CR
CRLF
CR
CRLF
CR
CRLF
line6CR
CRLF
CR
CRLF
Upvotes: 2
Views: 1277
Reputation: 91528
With Notepad++, you can do:
Find what: \R+
Replace with: \r\n
Where \R+ stands for one or more line breaks of any kind.
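To see what this replacement does to the question's CRCRLF example, here is a rough Python equivalent. Python's `re` module has no `\R`, so `[\r\n]+` is used as a stand-in on the assumption that the files contain only CR and LF line breaks; `collapse_linebreaks` is a hypothetical name. Because the `+` makes a whole run of line breaks match at once, each run becomes a single CRLF, so empty lines between text lines are dropped.

```python
import re


def collapse_linebreaks(text: str) -> str:
    # Stand-in for Notepad++'s \R+ -> \r\n, assuming only CR and LF occur.
    return re.sub(r"[\r\n]+", "\r\n", text)


# CRCRLF sample with one empty line between line2 and line3.
sample = "line1\r\r\nline2\r\r\n\r\r\nline3\r\r\n"
result = collapse_linebreaks(sample)
print(repr(result))
```

In this sample the empty line after line2 does not survive, which matters if empty lines must stay intact.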
Upvotes: 1