pHorseSpec

Reputation: 1274

Difference between Parentheses for Capturing and Back References in Perl

I'm just starting to learn Perl, and I wanted to know the difference between parentheses for capturing and back references, and in which situations each one would be more useful.

When I say parentheses for capturing, I'm referring to something like the following:

if ($email =~ /([^@]+)@(.+)/) {
    print "Username is $1\n";
    print "Hostname is $2\n";
}

When I say back referencing, I'm referring to something like the following:

# (.)\1
# (.) = capture group; \1 = reference group
# (.)(.)\2\1; This pattern has 2 capture groups
# (.)(.)\g{2}\g{1}; This pattern is safer. Can't be confused w/ digits

If my syntax for back referencing is incorrect, please let me know, because I'm not 100% sure how it works.

Upvotes: 2

Views: 419

Answers (3)

ikegami

Reputation: 386541

There is no difference. There aren't even two things to compare. There's just (...).

It can be used to backreference. \1 is a regular expression atom that matches what the first capture captured. This can only be used in regular expression patterns.

It can be used to capture. $1 is a Perl variable that contains what the first capture captured. This can only be used in Perl code.
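To make the distinction concrete, here is a minimal sketch that uses the same capture group both ways. The variable names ($text, $doubled) and the sample string are made up for illustration.

```perl
use strict;
use warnings;

my $text = "hello";    # sample input, chosen for illustration

# \1 lives inside the pattern: it matches whatever group 1 just captured,
# so (.)\1 matches any character immediately followed by itself.
# $1 lives in Perl code: it holds what group 1 captured once the match succeeds.
my $doubled;
if ($text =~ /(.)\1/) {
    $doubled = $1;
    print "Doubled character: $doubled\n";
}
```

Here the pattern matches the "ll" in "hello", so $1 holds "l".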

Upvotes: 1

Borodin

Reputation: 126742

A part of a regex pattern enclosed in parentheses will be captured if the pattern as a whole matches. It is irrelevant how that captured substring is used. Captures are numbered starting from 1, from left to right, in the order that their opening parentheses appear in the regex pattern.

  • It may be used as a back-reference later on inside the same pattern by using the sequence \1 etc. or (preferably) \g1 etc.

  • It may be used outside the pattern as a simple string value $1 etc. either in the replacement part of a substitution or in subsequent Perl code
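As a rough illustration of the numbering rule and of using $1 etc. in a replacement, here is a small sketch. The sample strings and the names $date, @parts and $swapped are invented for the example.

```perl
use strict;
use warnings;

my $date = "2024-01-15";    # sample input, chosen for illustration

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the whole date, group 2 the year, group 3 the month,
# and group 4 the day.
my @parts;
if ($date =~ /((\d{4})-(\d{2})-(\d{2}))/) {
    @parts = ($1, $2, $3, $4);
    print "Whole: $1, year: $2, month: $3, day: $4\n";
}

# The capture variables are also available in the replacement part
# of a substitution. Here the two words are swapped.
(my $swapped = "John Smith") =~ s/(\w+) (\w+)/$2, $1/;
print "$swapped\n";
```

The outer group gets number 1 because its opening parenthesis comes first, even though it closes last.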

You can use the captured string in both ways at once, for instance

use feature 'say';    # say needs Perl 5.10 or later
say $1 if $str =~ /(.)\g1/;

Note that if the pattern fails to match, the value of $1 will be unchanged from the most recent successful match, so any use of $1 etc. should be conditional on the successful match of the pattern from which it is to be drawn.
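The stale-value pitfall can be shown with a short sketch. The strings and variable names below are made up; the point is only that a failed match does not reset $1.

```perl
use strict;
use warnings;

# First match succeeds, so $1 is set to "b".
my $first_ok = ("abc" =~ /(b)/);

# Second match fails, so $1 is left over from the first match.
my $second_ok = ("xyz" =~ /(q)/);
my $stale = $1;    # still "b", which is probably not what was intended

# Safer: only read $1 when the match it belongs to succeeded.
my $safe = ("xyz" =~ /(q)/) ? $1 : undef;

print "second match ", ($second_ok ? "succeeded" : "failed"),
      ", but \$1 is still '$stale'\n";
```

Testing the match result first, as in the last assignment, avoids picking up a value from an earlier pattern.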

Upvotes: 4

cjm

Reputation: 62109

There is no difference. Parentheses capture (unless it's one of the (?...) constructs). You can use the captured text as a backreference or with capture variables ($1) (or both).
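For instance, (?:...) groups without capturing. One caveat worth knowing: the named-group form (?<name>...) is also written with (? but does capture (Perl 5.10 and later). A small sketch, with made-up strings and variable names:

```perl
use strict;
use warnings;

my $str = "foobar";    # sample input, chosen for illustration

# (?:foo) groups but does not capture, so (bar) is group 1.
my $captured;
if ($str =~ /(?:foo)(bar)/) {
    $captured = $1;
}

# A named group captures too: it is available by name through %+
# and also by number.
my ($by_name, $by_number);
if ("year 2024" =~ /(?<year>\d{4})/) {
    $by_name   = $+{year};
    $by_number = $1;
}

print "$captured $by_name $by_number\n";
```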

Upvotes: 2
