Tomasz

Reputation: 5529

Multi-platform chomp working on unix, windows and mac text files

I'm looking for a way to chomp newline characters regardless of the platform the files were created on.

The problem, as described in perlport#newlines, is that newlines are encoded differently on each platform:

\012 unix

\015\012 windows

\015 mac

However, chomp is platform-specific: it only removes the newline of the platform it is running on, or whatever is set in the $/ variable.

So far I have come up with the following regex, which seems to work:

# multiplatform chomp
s/\015?\012?$//;

Is that a correct solution, or am I missing some cases? Is there a better one?

Upvotes: 1

Views: 1152

Answers (3)

Schwern

Reputation: 165278

If you really want to catch all cases, your regex is fine for stripping newlines. But it's not OK for checking whether a newline is there: every part of it is optional, so it will happily match a line with no newline. For that you have to spell it all out.

m{(\015|\015\012|\012)\z};

Note the use of \z. This is because $ also matches just before a newline at the end of the string, which would steal that newline from the capture group.
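
To illustrate the difference between $ and \z, here is a minimal sketch that runs the same pattern with both anchors against a Windows-style line. The sample string and variable names are made up for the demonstration.

```perl
use strict;
use warnings;

my $line = "text\015\012";   # sample Windows-style line (illustrative)

# With $, the match may end just before the final \012,
# so the capture group holds only the \015.
my ($with_dollar) = $line =~ m{(\015|\015\012|\012)$};

# With \z, the match must reach the true end of the string,
# so the capture group holds the full \015\012 pair.
my ($with_z) = $line =~ m{(\015|\015\012|\012)\z};

printf "with \$:  %d character(s) captured\n", length $with_dollar;   # 1
printf "with \\z: %d character(s) captured\n", length $with_z;        # 2
```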

Realistically, you don't need to worry about the "Mac" newline. The "Mac" newline refers to the pre-OS X MacOS. It's extremely unlikely you'll encounter a file from that era, and I say this as someone who still has a working Mac SE. So all you really need to worry about is the Windows and Unix newlines. That's typically done like so:

s{\015?\012\z}{};
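
As a rough sketch, the substitution above could be wrapped in a small helper and tried against each kind of line ending. The name chomp_crlf and the sample strings are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the line with one trailing
# Unix (\012) or Windows (\015\012) newline removed.
sub chomp_crlf {
    my ($line) = @_;
    $line =~ s{\015?\012\z}{};
    return $line;
}

my @samples = (
    [ 'unix',        "abc\012"     ],
    [ 'windows',     "abc\015\012" ],
    [ 'classic mac', "abc\015"     ],
    [ 'no newline',  "abc"         ],
);

for my $sample (@samples) {
    my ($label, $raw) = @$sample;
    my $clean = chomp_crlf($raw);
    printf "%-12s removed %d character(s)\n",
        $label, length($raw) - length($clean);
}
```

This removes 1 character for the Unix line, 2 for the Windows line, and 0 for the other two. A lone \015 is deliberately left in place, which matches the advice above to ignore classic Mac files.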

Upvotes: 0

Nate Glenn

Reputation: 6744

\v matches vertical white space, so you should be able to use

s/\v+$//;

However, this assumes that you don't mind cutting off things like form feeds and vertical tabs.
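
A small sketch of that caveat, using made-up sample strings. \v needs Perl 5.10 or later. Because of the + quantifier, the pattern also removes several trailing newlines at once, where chomp would remove only one.

```perl
use strict;
use warnings;
use 5.010;   # \v is available from Perl 5.10 onwards

# Trailing form feed (\014), vertical tab (\013), then CR LF.
my $with_ff = "text\014\013\015\012";
(my $stripped_ff = $with_ff) =~ s/\v+$//;

# Two trailing Unix newlines, e.g. a line followed by a blank line.
my $two_newlines = "text\012\012";
(my $stripped_nl = $two_newlines) =~ s/\v+$//;

# Both results are reduced to the bare 4-character "text".
printf "%d %d\n", length $stripped_ff, length $stripped_nl;
```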

Upvotes: 1

user181548

Reputation:

Why not just use

s/\s+$//;
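
One thing to keep in mind is that \s also matches spaces and tabs, so this strips any trailing blanks along with the newline. A minimal sketch with an invented sample line:

```perl
use strict;
use warnings;

# Sample line ending in a space, a tab, and a Windows newline.
my $line = "key = value \t\015\012";

(my $stripped = $line) =~ s/\s+$//;

# The trailing space and tab are removed together with the newline;
# whitespace inside the line is untouched.
print "[$stripped]\n";   # [key = value]
```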

Upvotes: 2
