Reputation: 5529
I'm looking for a way to chomp newline characters regardless of the platform the files were created on.
The problem, as described in perlport#newlines, is that newlines are encoded differently on each platform:
\012 unix
\015\012 windows
\015 mac
However, chomp is platform-specific: it only removes the newline for the platform it's running on, or whatever is set in the $/ variable.
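To make the limitation concrete, here is a minimal sketch assuming a Windows-style line is processed where "\n" is \012 (the variable names are only illustrative). It shows chomp leaving the \015 behind with the default $/, and removing the full pair once $/ is set locally:

```perl
use strict;
use warnings;

# A Windows-style line, as it would look when read unchanged on a Unix system.
my $line = "text\015\012";

# With the default $/ ("\n"), chomp removes only the trailing \012.
chomp(my $default = $line);

# Setting $/ locally makes chomp remove the whole \015\012 pair instead.
my $crlf = $line;
{
    local $/ = "\015\012";
    chomp $crlf;
}

print length($default), "\n";    # 5: "text" plus the leftover \015
print length($crlf),    "\n";    # 4: "text"
```

Setting $/ only helps when the file's origin is known in advance, which is why a regex is attractive for mixed input.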
So far I've come up with the following regex, which seems to work:
# multiplatform chomp
s/\015?\012?$//;
Is that the correct solution, or am I missing some cases? Is there a better one?
Upvotes: 1
Views: 1152
Reputation: 165278
If you really want to catch all cases, your regex is fine for stripping newlines. But it's not OK for checking whether a newline is there: every part of it is optional, so it will happily match a line with no newline. For that you have to spell it all out.
m{(\015|\015\012|\012)\z};
Note the use of \z. This is because $ also matches just before a newline at the end of the string, which would steal that newline from the capture group.
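A small sketch of both points, using made-up sample strings: the stripping regex matches even without a newline, and $ versus \z changes what the group captures on a Windows-style line.

```perl
use strict;
use warnings;

my $no_newline = "text";
my $windows    = "text\015\012";

# Every part of the stripping regex is optional, so it matches
# even when there is no newline at all.
my $loose_hit = $no_newline =~ /\015?\012?$/ ? 1 : 0;    # 1

# The spelled-out alternation anchored with \z needs a real newline.
my $strict_none = $no_newline =~ m{(\015|\015\012|\012)\z} ? 1 : 0;    # 0
my $strict_win  = $windows    =~ m{(\015|\015\012|\012)\z} ? 1 : 0;    # 1

# With $, the anchor can match just before the final \012,
# so the group captures only the \015.
my ($with_dollar) = $windows =~ m{(\015|\015\012|\012)$};

# With \z, the group has to capture the whole \015\012 pair.
my ($with_z) = $windows =~ m{(\015|\015\012|\012)\z};

print length($with_dollar), " ", length($with_z), "\n";    # 1 2
```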
Realistically, you don't need to worry about the "Mac" newline. The "Mac" newline refers to the pre-OS X MacOS. It's extremely unlikely you'll encounter a file from that era, and I say this as someone who still has a working Mac SE. So all you really need to worry about is the Windows and Unix newlines. That's typically done like so:
s{\015?\012\z}{};
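As a sketch of how that substitution might be wrapped for reuse (portable_chomp is a hypothetical helper name, not a core function), the following returns a copy of the line with one trailing Unix or Windows newline removed:

```perl
use strict;
use warnings;

# portable_chomp is a hypothetical helper, named here only for illustration.
# It returns a copy of the line with one trailing \012 or \015\012 removed.
sub portable_chomp {
    my ($line) = @_;
    $line =~ s{\015?\012\z}{};
    return $line;
}

my @cleaned = map { portable_chomp($_) } ("unix\012", "windows\015\012", "none");

# A lone \015 (the old "Mac" newline) is deliberately left alone.
my $old_mac = portable_chomp("mac\015");

print join(",", @cleaned), "\n";    # unix,windows,none
```

Unlike chomp, this version does not modify its argument in place, which keeps it safe to use inside map.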
Upvotes: 0
Reputation: 6744
\v matches vertical whitespace, so you should be able to use:
s/\v+$//;
However, this assumes that you don't mind cutting off things like form feeds and vertical tabs.
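A quick sketch of that trade-off, assuming Perl 5.10 or later (where \v is available) and an invented sample line that ends in a form feed followed by two newlines:

```perl
use strict;
use warnings;
use 5.010;    # \v (vertical whitespace) needs Perl 5.10 or later

# Sample line ending in a form feed (\014) and two Unix newlines.
my $line = "text\014\012\012";

# \v+ removes the whole run of trailing vertical whitespace.
(my $greedy = $line) =~ s/\v+$//;

# The narrower pattern removes only the final newline.
(my $narrow = $line) =~ s{\015?\012\z}{};

print length($greedy), " ", length($narrow), "\n";    # 4 6
```

Because of the + quantifier, trailing blank lines are removed as well, not just the last line terminator.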
Upvotes: 1