V. Tej

Reputation: 81

Regex to grep a particular word while allowing for spaces before the word

I am looking for a regex that tolerates variable spacing. My code does the following: if the class extends base_class, it pushes just the current class name onto the array; otherwise it finds the extended class name and pushes both the extended class name and the current class name onto the array.

my $key = "class " . $current_class_name . " extends";
my $variable1 = "extends base_class";
if (/$key/) {
    if (/($variable1)/) {                        # Checking if it extends from "base_class"
        push @test_list, $current_class_name;    # Pushing the test name if it extends from "base_class"
    }
    else {                                       # If it doesn't extend from "base_class"
        /.extends[\s]+([A-Za-z_0-9]+)/;
        push @test_list, $1;                     # Pushing the extended test name into array
        push @test_list, $current_class_name;    # Pushing the current test name into array
    }
}

I have two questions.

1) When matching the string $key (if (/$key/)), how do I allow for variable spacing? For example, the line might contain several spaces between class and $current_class_name, and likewise between $current_class_name and extends. The first line of my code assumes exactly one space between those strings, but I want to handle any number of spaces (from 1 up to 10). Please help me handle this.

2) Similarly, when I take the word that follows extends in these lines of code:

/.extends[\s]+([A-Za-z_0-9]+)/ ;
push @test_list, $1;             

How do I capture that word and push it if the extended class name comes after many spaces following extends?

I hope my explanations are clear. Please comment if any part of my question is unclear. I will edit it accordingly.

Thanks

Upvotes: 1

Views: 76

Answers (2)

Nathan Loyer

Reputation: 344

A few recommendations for you:

  • + matches 1 or more iterations of the previous character/group.

  • {<number>} matches exactly that number of iterations of the previous character/group. So {10} matches exactly 10 iterations.

  • {<number1>,<number2>} matches between number1 and number2 iterations of the previous character/group. So {1,10} matches between 1 and 10 iterations, and {2,} matches 2 or more iterations. For "between 0 and 10", write {0,10}: the {,10} shorthand is only treated as a quantifier from Perl 5.34 onward, and older versions match it as literal text.

  • \s matches any whitespace character: spaces and tabs, but also newlines and carriage returns.

  • I suggest trying out string interpolation, as it is one of my favorite things about Perl: write "class $current_class_name extends" instead of "class " . $current_class_name . " extends". Interpolation works in double-quoted strings, but not in single-quoted ones.

  • This falls under style, but I generally don't create a variable if it is only going to be used in one place.

  • Always test that your regex matches before you use $1, or else it will be the result of the previous successful regex match.
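
A minimal sketch of the last point and of the {1,10} quantifier, assuming made-up sample lines. The names parent_of and unchecked_captures are hypothetical helpers for illustration, not part of the original code:

```perl
use strict;
use warnings;

# Return the parent class name from a line, or undef when there is none.
# Testing the match before reading $1 is what keeps a stale capture out.
sub parent_of {
    my ($line) = @_;
    return $line =~ /extends\s{1,10}([A-Za-z_0-9]+)/ ? $1 : undef;
}

# Unchecked variant, shown only to illustrate the pitfall.
sub unchecked_captures {
    my @seen;
    my ($first, $second) = ("class a extends   parent_a", "class b");
    $first  =~ /extends\s+(\w+)/;   # succeeds: $1 is "parent_a"
    push @seen, $1;
    $second =~ /extends\s+(\w+)/;   # fails: $1 is left untouched
    push @seen, $1;                 # pushes "parent_a" a second time
    return @seen;
}

print join(",", unchecked_captures()), "\n";   # parent_a,parent_a

my $missing = parent_of("class b") // "(none)";
print "$missing\n";                            # (none)

my $found = parent_of("class c extends     parent_c");
print "$found\n";                              # parent_c
```

The unchecked version records parent_a for a line that has no parent at all, which is the kind of silent error the guard avoids.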

Example:

if (/class\s+$current_class_name\s+extends/) {
    if (/extends\s+base_class\b/) {
        # extends base_class directly: record only the current class
        push @test_list, $current_class_name;
    }
    elsif (/extends\s+([A-Za-z_0-9]+)/) {
        # extends some other class: record the parent, then the current class
        push @test_list, $1;
        push @test_list, $current_class_name;
    }
    else {
        # not sure what you want to do in this case, looks like it
        # would be a syntax error assuming this is Java
    }
}

You can change

/class\s+$current_class_name\s+extends/

to

/class\s{1,10}$current_class_name\s{1,10}extends/

if you want to keep to the 1-10 space limit. \s matches tabs too, so if you really only want to accept spaces you can change it to

/class[ ]{1,10}$current_class_name[ ]{1,10}extends/
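
For a runnable version, here is one possible sketch that folds the checks into a single match. It is a variant, not the code above: the helper name collect_classes and the sample lines are assumptions made for illustration, and \Q...\E is added so that any regex metacharacters in the class name are matched literally.

```perl
use strict;
use warnings;

# collect_classes is a hypothetical helper, not part of the original code.
# It scans the given lines for "class <name> extends <parent>" and returns
# the names to record, tolerating any run of whitespace between the words.
sub collect_classes {
    my ($current_class_name, @lines) = @_;
    my @test_list;
    for (@lines) {
        # Only use $1 after the match is known to have succeeded.
        next unless /class\s+\Q$current_class_name\E\s+extends\s+([A-Za-z_0-9]+)/;
        my $parent = $1;
        # Record the parent first, unless it is the common base class.
        push @test_list, $parent unless $parent eq 'base_class';
        push @test_list, $current_class_name;
    }
    return @test_list;
}

print join(",", collect_classes("my_test",
    "class   my_test    extends     parent_test;")), "\n";   # parent_test,my_test
print join(",", collect_classes("my_test",
    "class my_test extends   base_class;")), "\n";           # my_test
```

Swapping \s+ for \s{1,10} or [ ]{1,10} in this pattern gives the stricter behaviour described above.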

Upvotes: 3

c3st7n

Reputation: 1961

1) To match whitespace, use \s (it matches spaces, tabs, and other whitespace characters) and then a quantifier to control how many to match.

The example below lets $key match any amount of whitespace around the class name, as long as there is at least one whitespace character on each side. The pieces are single-quoted because a double-quoted "\s" loses its backslash and becomes a plain s:
my $key = 'class\s+' . $current_class_name . '\s+extends';

2) I think your code is correct, since [\s]+ already matches any run of one or more whitespace characters, but maybe I am misunderstanding the question. Do you only want to push it if there is more than 1 space? If so, the below would work:

push @test_list, $1 if /.extends\s\s+([A-Za-z_0-9]+)/;
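
As a sketch of an alternative for point 1, the pattern can be kept in a plain string or precompiled with qr//. The value of $current_class_name and the sample line are made up for illustration, and \Q...\E is an addition that quotes any regex metacharacters in the name:

```perl
use strict;
use warnings;

my $current_class_name = "my_test";   # hypothetical value for illustration

# Single quotes keep the backslash, so the regex engine sees \s+.
my $key_string = 'class\s+' . $current_class_name . '\s+extends';

# qr// compiles the pattern once; \Q...\E quotes the interpolated name.
my $key_regex = qr/class\s+\Q$current_class_name\E\s+extends/;

my $line = "class    my_test   extends base_class;";
print "string pattern matched\n" if $line =~ /$key_string/;
print "qr pattern matched\n"     if $line =~ $key_regex;
```

Both forms accept one or more whitespace characters on either side of the class name.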

Upvotes: 2
