Reputation: 177
I have a .DAT file with CR LF line endings, encoded as UTF-8 with a BOM. I'm trying to convert it to UTF-8 without a BOM, keeping the CR LF line endings, using Perl. I'm currently using the following code. Even though the output file is generated without the BOM, the header line is missing from it; only the rest of the data is written. My requirement is a final output file in UTF-8 without a BOM that includes the header along with the rest of the data.
use open qw( :encoding(UTF-8) :std ); # Make UTF-8 default encoding
sub encodeWithoutBOM
{
    my $src = $_[1];
    my $des = $_[2];
    my @array;
    open(SRC,'<',$src) or die $!;
    # open destination file for writing
    open(DES,'>',$des) or die $!;
    print("copying content from $src to $des\n");
    while(<SRC>){
        @array = <SRC>;
    }
    foreach (@array){
        print DES;
    }
    close(SRC);
    close(DES);
}
Upvotes: 1
Views: 532
Reputation: 52374
Another option is to use File::BOM from CPAN, which lets you transparently handle the byte order mark:
#!/usr/bin/env perl
use warnings;
use strict;
use autodie;
use feature qw/say/;
use File::BOM qw/open_bom/;
sub encode_without_bom {
    my ($src, $dst) = @_;
    open_bom(my $infile, $src, ":encoding(UTF-8)");
    open my $outfile, ">:utf8", $dst;
    say "Copying from $src to $dst";
    while (<$infile>) {
        print $outfile $_;
    }
}

encode_without_bom "input.txt", "output.txt";
Upvotes: 2
Reputation: 385799
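# Treat STDIN, STDOUT and newly opened files (including those read by <>) as UTF-8
use open ':std', ':encoding(UTF-8)';
while (<>) {
    # Only the first line can start with the BOM (U+FEFF)
    s/^\N{BOM}// if $. == 1;
    print;
}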
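As for why the header disappears in the question's code: the `while(<SRC>)` condition reads the first line (the header) into `$_`, and the `@array = <SRC>;` inside the loop then slurps only the remaining lines, so the header is never stored or printed. A minimal sketch of the same BOM-stripping idea, packaged as a function that takes a source and a destination path like the one in the question, might look like this. The name `strip_bom_copy` is hypothetical, and the `:raw` layer is an assumption made here to keep the CR LF line endings byte-for-byte on any platform.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst, dropping a leading BOM if present.
sub strip_bom_copy {
    my ($src, $dst) = @_;

    # :raw leaves CR LF untouched; :encoding(UTF-8) decodes on read
    # and encodes on write.
    open(my $in,  '<:raw:encoding(UTF-8)', $src) or die "Cannot open $src: $!";
    open(my $out, '>:raw:encoding(UTF-8)', $dst) or die "Cannot open $dst: $!";

    # Read one line per iteration so the first line (the header) is kept.
    while (my $line = <$in>) {
        # U+FEFF is the BOM; only the first line can carry it.
        $line =~ s/^\x{FEFF}// if $. == 1;
        print {$out} $line;
    }

    close($in);
    close($out) or die "Cannot close $dst: $!";
    return;
}
```

It would be called as `strip_bom_copy('input.dat', 'output.dat');`, where both file names are placeholders.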
Upvotes: 2