Reputation: 1821
I am using Qt. I have a text string in which I look for a function call of the form xyz.set_name(). I want to capture the last occurrence of this call, but skip it if the line that contains it starts with a #. So far I have a regex that matches the function call, but I don't know how to exclude the lines that start with #, how to capture only the last occurrence, or why all the matches are put into one capture group.
[().\w\d]+.set_name\(\)\s*
This is what I want it to do:
abc.set_name() // match
# abc.set_name() // don't match
xyz.set_name() // match and capture this one
Update for more clarification:
My text reads like this when printed with qDebug:
Hello\nx=y*2\nabc.set_name() \n#xyz.set_name()
It is one long string, with \n as the newline character.
Update: here is a longer test string. I have tried all the suggested regexes on it, but none of them worked, and I don't know what is missing. https://regex101.com/r/vXpXIA/1
Update 2: Disregard my first update. The \n is a qDebug() display artifact and doesn't need to be considered when writing the regex.
Upvotes: 1
Views: 293
Reputation: 110725
If you merely want to match the last line that matches the pattern
^[a-z]+\.set_name\(\)
you can use the following regular expression:
(?smi)^[a-z]+\.set_name\(\)(?!.*^[a-z]+\.set_name\(\))
For simplicity I've used the character class [a-z]. That can be changed to suit requirements. In the question it is [().\w\d], which can be simplified to [().\w] because \w already matches digits.
Note that since the substring of interest is itself the match, there is no point in capturing it as well. The fact that one of the lines prior to the last one begins with '#' is not relevant; all that matters is whether a line matches the specified pattern.
The PCRE regex engine performs the following operations.
(?smi)                    : set single-line, multi-line and case-insensitive modes
^                         : match the beginning of a line
[a-z]+\.set_name\(\)      : match 1+ chars in the char class, followed by '.set_name()'
(?!                       : begin negative lookahead
  .*^[a-z]+\.set_name\(\) : match 0+ chars (including newlines), the beginning of a line, 1+ letters, '.set_name()'
)                         : end negative lookahead
Recall that single-line mode causes . to match newlines, and multi-line mode causes ^ and $ to match the beginning and end of each line (rather than the beginning and end of the string).
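As a rough illustration of the same negative-lookahead idea, here is a minimal sketch using std::regex instead of PCRE or Qt's QRegularExpression, so that it compiles without Qt. std::regex has no inline (?smi) flags, so the sketch substitutes (?:^|\n) for the multi-line ^ and [\s\S]* for the single-line-mode dot, and uses [A-Za-z] in place of case-insensitive [a-z]. The function name last_call_via_lookahead is made up for this example.

```cpp
#include <regex>
#include <string>

// Returns the last "<name>.set_name()" call that begins a line,
// or an empty string if no line begins with such a call.
std::string last_call_via_lookahead(const std::string& text)
{
    // (?:^|\n)  : start of the string or a newline (stand-in for multi-line ^)
    // (...)     : the call itself, captured because \n may be part of the match
    // (?!...)   : reject the match if a later line also begins with a call;
    //             [\s\S]* is a stand-in for '.' in single-line mode
    static const std::regex re(
        R"re((?:^|\n)([A-Za-z]+\.set_name\(\))(?![\s\S]*\n[A-Za-z]+\.set_name\(\)))re");

    std::smatch m;
    if (std::regex_search(text, m, re))
        return m[1].str();
    return std::string();
}
```

On the three-line sample from the question this should return xyz.set_name(), since the commented line never matches the pattern and no matching line follows the last one. With QRegularExpression, which is PCRE-based, the pattern from the answer can presumably be used unchanged.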
Upvotes: 1
Reputation: 627087
You may use
(?s).*\n(?!\h*#)\h*([\w().]+\.set_name\(\))
See the regex demo, your match is in Group 1. Details:
(?s) - DOTALL mode on, . now matches any char, including newlines
.* - any zero or more chars, as many as possible
\n(?!\h*#) - a newline that is not immediately followed by 0 or more horizontal whitespace chars and then a # char
\h* - 0+ horizontal whitespace chars
([\w().]+\.set_name\(\)) - Capturing group 1:
    [\w().]+ - 1 or more word chars, ), ( or .
    \.set_name\(\) - a .set_name() string.
Upvotes: 0
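For the greedy-prefix approach in the answer above, a minimal sketch with std::regex might look like the following. It is an adaptation, not the answer's exact pattern: std::regex has neither (?s) nor \h, so [\s\S]* replaces the DOTALL dot and [ \t] replaces \h. The function name last_call_via_greedy_prefix is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns the last call that follows a newline on a line not starting
// with '#', or an empty string if there is none.
std::string last_call_via_greedy_prefix(const std::string& text)
{
    // [\s\S]*      : consume as much as possible, so the engine backtracks
    //                from the end and finds the LAST suitable newline first
    // \n(?![ \t]*#): a newline whose line does not start with optional blanks and '#'
    // [ \t]*       : optional indentation
    // (...)        : group 1, the call itself
    static const std::regex re(
        R"re([\s\S]*\n(?![ \t]*#)[ \t]*([\w().]+\.set_name\(\)))re");

    std::smatch m;
    if (std::regex_search(text, m, re))
        return m[1].str();
    return std::string();
}
```

Because the pattern requires a newline before the call, a call on the very first line of the text is never found this way; prepending a newline to the input is one possible workaround.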
Reputation: 1038
You need the regex lookahead operators (if your regex engine supports them). This will work.
(?(?=^[^#])(^\s*[a-zA-Z]+\.set_name\(\))|z^)
Explanation:
(?(?=patt)then|else) - the regex if-else construct: if the lookahead pattern patt matches, then is matched; otherwise else is matched.
patt = ^[^#] - at the start of the line, a character other than #.
then part (patt is true) - ^\s*[a-zA-Z]+\.set_name\(\) matches any amount of whitespace followed by <something>.set_name(), where something is a variable name.
else part (patt is false) - z^ requires a z immediately before the start of a line, which isn't possible, so nothing matches.
Edit: just realised you can have digits in variable names (but a name cannot start with one). In that case, an improved regex (not tested) is:
(?(?=^[^#])(^\s*[a-zA-Z]+[a-zA-Z\d]*\.set_name\(\))|z^)
Edit: Since your string also contains newline characters, the input doesn't match the problem description in your question. Nevertheless, that is simple enough to deal with by tokenising the string: just split it into lines at each newline.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main()
{
    // Sample input; the only set_name() call is on a commented-out line.
    std::istringstream isr;
    isr.str("I am John\n today is \n#abc.set_name()\n");

    // Split the text into lines.
    std::string tok;
    std::vector<std::string> vs;
    while (std::getline(isr, tok))
    {
        std::cout << tok << std::endl;
        vs.push_back(tok);
    }

    // Walk the lines from last to first.
    for (auto r_it = vs.rbegin(); r_it != vs.rend(); ++r_it)
    {
        std::cout << *r_it << std::endl;
        // if match then break from loop
    }
}
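To fill in the "if match then break" step, here is one possible sketch that combines the line splitting with a per-line std::regex test. It reuses the character class [\w().] from the question. The function name last_uncommented_set_name is invented for this example, and std::regex stands in for whatever regex class you use in Qt.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Returns the last "<expr>.set_name()" call found at the start of a line
// (after optional whitespace), or an empty string if there is none.
std::string last_uncommented_set_name(const std::string& text)
{
    // A line starting with '#' cannot match, because '#' is neither
    // whitespace nor in the class [\w().].
    static const std::regex re(R"re(^\s*([\w().]+\.set_name\(\)))re");

    // Split the text into lines.
    std::istringstream in(text);
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);

    // Scan from the last line backwards and stop at the first hit.
    for (auto it = lines.rbegin(); it != lines.rend(); ++it) {
        std::smatch m;
        if (std::regex_search(*it, m, re))
            return m[1].str();
    }
    return std::string();
}
```

Since each line is tested on its own, no lookahead, DOTALL or multi-line mode is needed, which keeps the pattern short and avoids heavy backtracking on long inputs.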
Upvotes: 0