Gary Washington

Reputation: 107

Remove or completely suppress null character \0

I have a script, MM.pl, which is the “workhorse”, and a simple “patchfile” that it reads from. In this case, the patch file targets an .ini file for search and replace. Simple enough. It took me 5 days to realize the ini is encoded with null (\0) characters between each letter. Since then, I have tried every option I could find: code snippets, modules loaded with use, and regular expressions. The only reason I found the nulls was that I used Data::Printer to dump several values. In Notepad++, the ini appears to be encoded as UCS-2 LE. It is important that MM.pl handles this itself instead of asking the user to “fix” the file.

Update: This may provide a clue: \xFF\xFE are the first 2 bytes in the ini file. They only show up after processing. The swap is not changing anything else as it is supposed to, but it "reveals" these 2 hidden characters.

Upvotes: 1

Views: 1024

Answers (3)

Gary Washington

Reputation: 107

To be honest, this is not really a solution but a cop-out. After 4 weeks of trying and retrying methods, and reading and reading and reading, I have put it in park and switched to Python to build the app. Several references in the perldocs mention that UTF-16 is "problematic" and also mention situations in which it is treated differently.

Upvotes: 0

Sodved

Reputation: 8588

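When you read the file, set the encoding:

use IO::File;
my $fh = IO::File->new( "something.ini", "<" ) or die "Cannot open something.ini: $!";
binmode( $fh, ":encoding(UTF-16LE)" );   # decode UTF-16LE bytes into Perl characters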

And when you output, you can write back in whichever encoding you like, e.g.

my $out = IO::File->new( "something-new.ini", ">" ) or die "Cannot open something-new.ini: $!";
binmode( $out, ":encoding(UTF-8)" );     # encode Perl characters as UTF-8 on write

The same applies if you're dumping to the terminal:

binmode( STDOUT, ":encoding(UTF-8)" );
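
The effect of decoding can also be seen without touching a file. The following is a minimal sketch using the core Encode module on an in-memory string; the "key=old" sample line and the variable names are made up for illustration and are not from the question. It shows why a pattern fails on the raw bytes and succeeds once the data is decoded.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: "key=old" as it would sit on disk in UTF-16LE, BOM first.
my $raw = "\xFF\xFE" . encode( 'UTF-16LE', "key=old\r\n" );

# On the raw bytes the pattern cannot match: every letter is followed by \0.
my $matched_raw = $raw =~ /key=old/ ? 1 : 0;

# "UTF-16" (no LE/BE suffix) reads the BOM, picks the byte order, and drops the BOM.
my $text = decode( 'UTF-16', $raw );
( my $patched = $text ) =~ s/key=old/key=new/;

# Re-encode for writing; the BOM is added by hand because UTF-16LE does not emit one.
my $out_bytes = "\xFF\xFE" . encode( 'UTF-16LE', $patched );

printf "raw match: %d, decoded: %s", $matched_raw, $patched;
```

Note that decoding with "UTF-16LE" instead of "UTF-16" would keep the byte order mark as a leading U+FEFF character in the text, which is one way those 2 "hidden" characters can surface.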

Upvotes: 1

Eevee

Reputation: 48536

As you noticed, those nulls aren't just junk to be stripped; they're part of the file's character encoding. So decode it:

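# little-endian, matching the \xFF\xFE byte order mark at the start of the file
open my $fh, '<:encoding(UCS-2LE)', 'file.ini' or die "Cannot open file.ini: $!";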

Write it back out the same way once you're done.
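
A full read, patch, and write-back round trip might look like the sketch below. It is only an illustration under stated assumptions: the temporary file, the "Mode=0" setting, and the variable names are hypothetical, and the file is created by the script itself so the example runs on its own. It uses the UTF-16 encoding for reading so the byte order mark selects the byte order, and adds :raw so that no newline translation is applied to the undecoded bytes on Windows.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempdir);

# Hypothetical ini file, created here only so the sketch is self-contained.
my $path = tempdir( CLEANUP => 1 ) . '/file.ini';
open my $seed, '>:raw', $path or die "Cannot create $path: $!";
print {$seed} "\xFF\xFE" . encode( 'UTF-16LE', "Mode=0\n" );
close $seed or die "Cannot close $path: $!";

# Read: the UTF-16 layer consumes the BOM and decodes to ordinary Perl characters.
open my $in, '<:raw:encoding(UTF-16)', $path or die "Cannot read $path: $!";
my $text = do { local $/; <$in> };   # slurp the whole file
close $in or die "Cannot close $path: $!";

# The search and replace now works on plain text, with no \0 between letters.
$text =~ s/Mode=0/Mode=1/;

# Write: UTF-16LE emits no BOM, so U+FEFF is printed explicitly as the first character.
open my $outfh, '>:raw:encoding(UTF-16LE)', $path or die "Cannot write $path: $!";
print {$outfh} "\x{FEFF}", $text;
close $outfh or die "Cannot close $path: $!";
```

Writing the BOM back keeps the file in the form the original application produced, which seems the safer choice when the consumer of the ini file is not under your control.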

Upvotes: 8
