Reputation: 9304
What would be the proper regex to prefix each line?
Say I have the following input data:
SOME OTHER DATA
TABLE
ROW
ROW
ROW
END
SOME OTHER DATA
I'm only interested in what is between, and including, TABLE and END.
In PHP you can write a regex like /TABLE.*?END/s, which matches from the first occurrence of TABLE to the first occurrence of END. But is there a way I can prefix each line with %, so that the result becomes:
SOME OTHER DATA
%TABLE
%ROW
%ROW
%ROW
%END
SOME OTHER DATA
Any help is appreciated.
Upvotes: 0
Views: 420
Reputation: 89584
You can do it with a single replacement:
$txt = preg_replace('~^(?:TABLE\R|\G(?!\A)(?:END$|.+\R|.+\z))~m', '%$0', $txt);
Note that this pattern assumes there is always a closing END "tag". If that isn't the case, the replacement continues until an empty line (because of the + quantifier) or until the end of the string.
You can also choose to check that the TABLE tag is closed by an END tag:
$pattern = '~^(?:TABLE\R(?=(?:.+\R)*?END$)|\G(?!\A)(?:END$|.+\R|.+\z))~m';
First pattern details:
^ # matches the start of a line
(?: # open a non-capturing group
TABLE \R # TABLE and a newline (CR, LF or CRLF)
| # OR
\G (?!\A) # contiguous to the previous match but not
# at the start of the string
(?: #
END $ # END at the end of a line
| #
.+ \R # a line (not empty) and a newline
| #
.+ \z # the last line of the string
) # close the non-capturing group
) #
Additional lookahead details:
(?= # open the lookahead
(?:.+\R)*? # matches zero or more lines lazily
END$ # until the line END
)
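As a quick check of the two patterns above, here is a minimal sketch that runs both against the sample input from the question and against an input whose table is never closed. The variable names and the unclosed sample are made up for illustration; the patterns themselves are copied unchanged.

```php
<?php
// The two patterns from this answer, copied unchanged.
$loose  = '~^(?:TABLE\R|\G(?!\A)(?:END$|.+\R|.+\z))~m';
$strict = '~^(?:TABLE\R(?=(?:.+\R)*?END$)|\G(?!\A)(?:END$|.+\R|.+\z))~m';

// Sample input from the question.
$closed = "SOME OTHER DATA\nTABLE\nROW\nROW\nROW\nEND\nSOME OTHER DATA";
// Hypothetical input in which the table has no END line.
$unclosed = "SOME OTHER DATA\nTABLE\nROW\nROW";

$closedLoose    = preg_replace($loose, '%$0', $closed);
$closedStrict   = preg_replace($strict, '%$0', $closed);
$unclosedLoose  = preg_replace($loose, '%$0', $unclosed);
$unclosedStrict = preg_replace($strict, '%$0', $unclosed);

echo $closedLoose, "\n";
```

On the closed input both patterns give the same result. On the unclosed input the first pattern still prefixes every line up to the end of the string, while the lookahead version leaves the text untouched.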
Another way:
$arr = preg_split('/\R/', $txt);
$state = false;
foreach ($arr as &$line) {
if ($state || $line === 'TABLE') {
$state = ($line !== 'END');
$line = '%' . $line;
}
}
$txt = implode("\n", $arr);
This code behaves the same way as the first pattern. Note that the resulting string uses UNIX-style newlines (\n), whatever the input used.
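If the original newlines need to be kept, one possible variation is to capture the delimiters while splitting and glue everything back together unchanged. This is only a sketch: the function name prefixTableLines and its $prefix parameter are hypothetical, and it relies on the PREG_SPLIT_DELIM_CAPTURE flag of preg_split.

```php
<?php
// Hypothetical helper: same state machine as above, but the newline
// sequences are captured by the split and restored verbatim.
function prefixTableLines(string $txt, string $prefix = '%'): string
{
    // Even indexes hold lines, odd indexes hold the captured newlines.
    $parts = preg_split('/(\R)/', $txt, -1, PREG_SPLIT_DELIM_CAPTURE);
    $inside = false;
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // a newline sequence, leave it as it is
        }
        if ($inside || $part === 'TABLE') {
            $inside = ($part !== 'END');
            $parts[$i] = $prefix . $part;
        }
    }
    return implode('', $parts);
}

echo prefixTableLines("SOME OTHER DATA\r\nTABLE\r\nROW\r\nEND\r\nSOME OTHER DATA");
```

Because the pieces are joined with an empty string, CRLF input stays CRLF and LF input stays LF.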
Upvotes: 3
Reputation: 2621
Here you go. I created a regex and commented it for you:
/(?:
#start by finding the initial position of the table start, in order to store the match position for \G
TABLE\n\K|
#after we've found the table head, continue matching using this position. make sure we arent at the beginning of the string
\G(?<!^)
)
#capture the data we're interested in
(?:
#make sure there is no 'END' in the string
(?!END)
#match everything until the line ending
.
)*
#consume the newline at the end of the line
\n/x
Replace the result with %$0
See it in action here: http://regex101.com/r/rA5bV1
--
However, if you do not understand the regex I have created, I recommend using an alternative method: create a regex that captures the contents of the table, and then just prepend % to every line. Use the following expression to capture the contents: /TABLE\n((?:(?!END).)*)END/s (the s modifier is needed so that . can match across line breaks). I did not comment this one; you should be able to figure it out by reading the comments of the other expression.
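A minimal sketch of that capture-then-prefix approach, using preg_replace_callback. The function name prefixTables is hypothetical, and the ^ and $ anchors plus the m and s modifiers are assumptions added here so that TABLE and END must sit on their own lines and the dot can cross line breaks.

```php
<?php
// Hypothetical helper: find each TABLE...END block, then put a %
// in front of every line of that block (TABLE and END included).
function prefixTables(string $txt): string
{
    return preg_replace_callback(
        // m: ^ and $ match at line boundaries; s: . also matches newlines.
        '/^TABLE\n(?:(?!END).)*END$/ms',
        function (array $m): string {
            // An empty match at each line start is replaced with %.
            return preg_replace('/^/m', '%', $m[0]);
        },
        $txt
    );
}

echo prefixTables("SOME OTHER DATA\nTABLE\nROW\nROW\nROW\nEND\nSOME OTHER DATA");
```

Splitting the job in two keeps each pattern simple: the outer one only locates the block, the inner one only marks line starts.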
Upvotes: 1
Reputation: 1596
You can do it with two regexes:
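$txt = file_get_contents('input.txt');
// $m[1]: everything before TABLE, $m[2]: TABLE through END, $m[3]: the rest
preg_match("#(.*(?=TABLE\n))(.*\nEND)(.*)#ms", $txt, $m);
// prefix every line of the table block, TABLE and END included
$new = $m[1] . preg_replace("#^#ms", "%", $m[2]) . $m[3];
print $new;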
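The s modifier makes . match newlines too, so the regex treats the whole text as if it were one line. The m modifier makes ^ match at the start of every line, which is what lets the second regex put a % in front of each line.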
If you want to do it with only one regex, you will have to use special matching constructs like one of these:
Hope that helps.
Upvotes: 0