Reputation: 21
I want to find the newline between two lines of HTML and merge those lines into one. I tried doing this with a substitution in Perl; my attempt is below.
Input (File.txt):
<body>
<p>God is great</p>
<p>He lives everywhere</p>
.......
</body>
Desired output (file.html):
<body>
<p>God is great He lives everywhere</p>
.......
</body>
Code: I used a substitution to try to merge the two lines:
print "Enter the filename: ";
chomp($file=<STDIN>);
open my $in, '<', "$file.txt" or die "Can't read old file: $!";
open my $out, '>', "$file.html" or die "Can't write new file: $!";
while( <$in> )
{
s/(.+)<\/p>\n<p>/$1 /gs;
print $out $_;
}
close $in;
close $out;
But this is not working. How can I fix it?
Upvotes: 2
Views: 67
Reputation: 63922
The following script:
use 5.010;
use warnings;
my $html = do { local $/; <DATA> };
$html =~ s:</p>\n<p>: :igs;
say $html;
__DATA__
<body>
<p>par1</p><p>par2 </p>
<p> par3</p><p>par4</p>
<p> par5 </p>
<p>par6</p>
</body>
produces:
<body>
<p>par1</p><p>par2 par3</p><p>par4</p>
<p> par5 </p>
<p>par6</p>
</body>
If you change the regex to:
$html =~ s:</p>\s*<p>: :igs;
you will get:
<body>
<p>par1 par2 par3 par4 par5 par6</p>
</body>
and so on.
The main points:
i - ignore case, to match both <p> and <P>
g - replace every occurrence in the string
s - treat the string as a single line
The replacement is one space because, without it, the text of adjacent paragraphs runs together. For example, $html =~ s:</p>\s*<p>::igs; produces this:
<body>
<p>par1par2 par3par4 par5 par6</p>
</body>
Note, for example, par1 and par2, which now run together.
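A minimal sketch of the points above, using a made-up two-paragraph string (not the data from the script): it compares the one-space replacement, the empty replacement, and the same pattern without the i flag. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: upper-case tags on the first paragraph, lower-case on the second.
my $html = "<P>par1</P>\n<p>par2</p>";

# One-space replacement with /i: the upper-case tag matches and the words stay separated.
(my $spaced = $html) =~ s{</p>\s*<p>}{ }gi;

# Empty replacement: the paragraphs merge, but the words run together.
(my $joined = $html) =~ s{</p>\s*<p>}{}gi;

# Without /i, "</P>" does not match "</p>", so the string is left unchanged.
(my $case_sensitive = $html) =~ s{</p>\s*<p>}{ }g;

print "$spaced\n$joined\n$case_sensitive\n";
```

The s flag is left out here on purpose: it only changes what . matches, and this pattern contains no dot.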
To slurp your own file, change the <DATA> to your <$filehandle>.
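Put together with the file handling from the question, that might look like the sketch below. The file names are hard-coded here as an assumption (the question builds them from user input), and merge_paragraphs is a helper name made up for this example.

```perl
use strict;
use warnings;

# Merge adjacent paragraphs: "</p>", optional whitespace, "<p>" becomes one space.
sub merge_paragraphs {
    my ($html) = @_;
    $html =~ s{</p>\s*<p>}{ }gi;
    return $html;
}

# Assumed file names, for illustration only.
my ($in_name, $out_name) = ('File.txt', 'file.html');

if (-e $in_name) {
    open my $in, '<', $in_name or die "Can't read $in_name: $!";
    my $html = do { local $/; <$in> };   # with $/ undefined, one read returns the whole file
    close $in;

    open my $out, '>', $out_name or die "Can't write $out_name: $!";
    print {$out} merge_paragraphs($html);
    close $out;
}
```

Keeping the substitution in its own sub makes it easy to try on a short string before pointing it at a real file.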
Upvotes: 1
Reputation: 7912
You're reading the file one line at a time, so each string ends at its \n and the <p> from the next line is never part of it; the substitution has nothing to match. Instead of using a while loop, you can read the file in one go:
my $html = do { local $/; <$in> };
Then do the substitution:
$html =~ s#</p>\n<p># #g;
print $out $html;
Notice I'm using an alternative delimiter for the substitution to avoid having to escape the /.
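To see the difference, here is a small sketch that counts matches both ways. It reads from an in-memory string rather than a real file, and the sample text and variable names are assumptions for the demo.

```perl
use strict;
use warnings;

my $text = "<p>God is great</p>\n<p>He lives everywhere</p>\n";

# Line by line: each chunk ends at "\n", so "</p>\n<p>" never fits inside one chunk.
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$fh>) {
    my $n = () = $line =~ m{</p>\n<p>}g;   # count matches in this line
    $line_hits += $n;
}
close $fh;

# Slurped: the whole text is one string, so the pattern can span the line break.
my $slurp_hits = () = $text =~ m{</p>\n<p>}g;

print "line by line: $line_hits, slurped: $slurp_hits\n";
```

This prints "line by line: 0, slurped: 1".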
Upvotes: 4
Reputation: 69450
Add an s to your regex options and it should work:
s/(.+)<\/p>\n<p>/$1 /gs;
^^
Upvotes: 1