Reputation: 87
I have the following text layout:
Heading
Chapter 1:1 This is text
2 This is more text
3 This is more text
4 This is more text
5 This is more text
6 This is more text
7 This is more text
8 This is more text
9 This is more text
10 This is more text
11 This is more text
12 This is more text
13 This is more text
14 This is moret text
15 This is more text
Heading
Chapter 2:1 This is text
2 This is more text...
and I am trying to add, in parentheses right after each Heading, the first chapter reference and the last verse number in that chapter. Like so:
Heading (Chapter 1:1-15)
Chapter 1:1 This is text
2 This is more text
3 This is more text
4 This is more text
5 This is more text
6 This is more text
7 This is more text
8 This is more text
9 This is more text
10 This is more text
11 This is more text
12 This is more text
13 This is more text
14 This is moret text
15 This is more text
I've come up with this regular expression so far:
~s/(?s)(Heading)\r(^\d*\w+\s*\d+:\d+|\d+:\d+)(.*?)(\d+)(.*?\r)(?=Heading)/\1 (\2-\4)\r\2\3\4\5/g;
but this captures the first number after "Chapter 1:1" (i.e. "2", giving "Heading (Chapter 1:1-2)") instead of the last one ("15", as in "Heading (Chapter 1:1-15)"). Could someone please tell me what's wrong with the regex? Thank you!
Upvotes: 2
Views: 502
Reputation: 9322
Edit for updated question
Here's a regex, with explanatory comments, that will solve your problem: http://codepad.org/mSIYCw4R
~s/
    ((?:^|\n)Heading)    # Capture the Heading line into group 1.
                         # We can't use a lookbehind because of (?:^|\n).
    (?=                  # A lookahead: it matches without consuming, so only the Heading is replaced.
        \nChapter\s      # Find the Chapter line.
        (\d+:\d+)        # Capture the first chapter:verse reference into group 2.
        .*               # Match the rest of the Chapter line.
        (?:\n(\d+).+)+   # Match every following numbered line.
                         # Group 3 is overwritten on each repetition, so it ends up holding the last number.
    )
/$1 (Chapter $2-$3)/gx;
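As for why the original attempt stops at "2": `(.*?)` is lazy, so it expands only as far as the first number from which the rest of the pattern can still match, and that is the 2 at the start of the next line. In the regex above, group 3 sits inside a repeated group, so every numbered line overwrites it and it ends up holding the last one. Below is a minimal, self-contained sketch for trying this out. The sub name `annotate_headings` and the shortened sample text are my own placeholders, not from the question, and it assumes `\n` line endings rather than the `\r` used in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative only): returns a copy of
# the text with "(Chapter X:Y-Z)" appended to every Heading line.
sub annotate_headings {
    my ($text) = @_;
    $text =~ s/
        ((?:^|\n)Heading)      # group 1: the Heading line
        (?=                    # lookahead: inspect the chapter, consume nothing
            \nChapter\s
            (\d+:\d+)          # group 2: first reference, e.g. 1:1
            .*                 # rest of the first chapter line
            (?:\n(\d+).+)+     # group 3: overwritten on each numbered line
        )
    /$1 (Chapter $2-$3)/gx;
    return $text;
}

# Shortened sample (an assumption): two chapters, newline-terminated.
my $sample = join "\n",
    'Heading',
    'Chapter 1:1 This is text',
    '2 This is more text',
    '3 This is more text',
    'Heading',
    'Chapter 2:1 This is text',
    '2 This is more text',
    '';

# The first heading comes out as "Heading (Chapter 1:1-3)".
print annotate_headings($sample);
```

Note that a chapter consisting of a single line would be left untouched, because the final `+` requires at least one numbered line after the Chapter line.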
Upvotes: 2
Reputation: 3744
An implementation of @FMc's comment could be something like:
#!/usr/bin/perl
use warnings;
use strict;

my $buffer = '';
while (<DATA>) {
    if (/^Heading \d+/) {    # process previous buffer, and start new buffer
        process_buffer($buffer);
        $buffer = $_;
    }
    else {                   # add to buffer
        $buffer .= $_;
    }
}
process_buffer($buffer);     # don't forget last buffer's worth...

sub process_buffer {
    my ($buf) = @_;               # named $buf rather than $b, which sort uses
    return unless length $buf;    # don't bother with an unpopulated buffer
    my ($last) = $buf =~ /(\d+)\s.*$/;             # leading number of the last line
    my ($chap) = $buf =~ /^(Chapter \d+:\d+)/m;    # e.g. "Chapter 1:1"
    $buf =~ s/^(Heading \d+)/$1 ($chap-$last)/;
    print $buf;
}
__DATA__
Heading 1
Chapter 1:1 This is text
2 This is more text
3 This is more text
4 This is more text
5 This is more text
6 This is more text
7 This is more text
8 This is more text
9 This is more text
10 This is more text
11 This is more text
12 This is more text
13 This is more text
14 This is moret text
15 This is more text
Heading 2
Chapter 2:1 This is text
2 This is more text...
3 This is more text
Upvotes: 2