Murphy Kwok

Reputation: 23

Regular Expression extracting 2 pieces of information

I currently have many HTML pages from which I need to extract two pieces of information. The expression I am using lets me extract only one of them. How can I extract both pieces of data at the same time?

(?s)\A.*(var vpart=".*?";var pn).*\Z

Replace with: $1

This is the expression I am using. I also need to extract the text inside the <title> tags. Can someone help me amend the expression above?

Upvotes: 2

Views: 43

Answers (1)

Ryszard Czech

Reputation: 18611

Yes, use more groups:

(?s)\A.*(var vpart=".*?";var pn).*(var endpart=".*?";var mn).*\Z

See proof.

Replace with: $1\n$2

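For each additional group, append another \n$N to the replacement (for example, $1\n$2\n$3 for three groups). The captured pieces must appear in the page in the same order as the groups in the pattern.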
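
For anyone applying the same idea in a script rather than a search-and-replace dialog, here is a minimal Python sketch of the two-group pattern. The function name extract_two and the sample page text are made up for illustration. Note that Python reads groups with match.group(n) (or \1 and \2 in a replacement string) rather than $1 and $2.

```python
import re

# Same pattern as above: two capture groups inside one whole-string match.
PATTERN = re.compile(
    r'(?s)\A.*(var vpart=".*?";var pn).*(var endpart=".*?";var mn).*\Z'
)

def extract_two(page: str) -> str:
    """Return both captured pieces separated by a newline, or '' if no match."""
    match = PATTERN.search(page)
    if match is None:
        return ""
    return match.group(1) + "\n" + match.group(2)

# Hypothetical page content, only for demonstration.
sample = '<script>var vpart="abc";var pn=1;\nvar endpart="xyz";var mn=2;</script>\n'
print(extract_two(sample))
```

If either piece is missing, or the two appear in the opposite order, the whole pattern fails and the function returns an empty string.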

Explanation

--------------------------------------------------------------------------------
  (?s)                     set flags for this block (with . matching
                           \n) (case-sensitive) (with ^ and $
                           matching normally) (matching whitespace
                           and # normally)
--------------------------------------------------------------------------------
  \A                       the beginning of the string
--------------------------------------------------------------------------------
  .*                       any character (0 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    var vpart="              'var vpart="'
--------------------------------------------------------------------------------
    .*?                      any character (0 or more times (matching
                             the least amount possible))
--------------------------------------------------------------------------------
    ";var pn                 '";var pn'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  .*                       any character (0 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    var endpart="            'var endpart="'
--------------------------------------------------------------------------------
    .*?                      any character (0 or more times (matching
                             the least amount possible))
--------------------------------------------------------------------------------
    ";var mn                 '";var mn'
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  .*                       any character (0 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  \Z                       before an optional \n, and the end of the
                           string
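
The question asked about the text inside the title tags, while the pattern above uses var endpart as a stand-in for the second piece. Because a single combined pattern requires both pieces to appear in a fixed order, one possible alternative is to run two independent searches. This is only a sketch and not the method given in this answer: the function name and the sample page are assumptions, and a regular expression is a rough tool for HTML that may miss unusual markup.

```python
import re

def extract_vpart_and_title(page: str):
    """Return (vpart_snippet, title_text); an item is None if it is not found."""
    # Lazy match from 'var vpart="' to the first following '";var pn'.
    vpart = re.search(r'var vpart=".*?";var pn', page, flags=re.DOTALL)
    # Capture only the text between the opening and closing title tags.
    title = re.search(
        r'<title[^>]*>(.*?)</title>', page, flags=re.DOTALL | re.IGNORECASE
    )
    return (
        vpart.group(0) if vpart else None,
        title.group(1).strip() if title else None,
    )

# Hypothetical page content, only for demonstration.
sample = (
    '<html><head><title> Demo page </title></head>\n'
    '<script>var vpart="abc";var pn=1;</script></html>'
)
print(extract_vpart_and_title(sample))
```

Searching separately means the order of the two pieces in the page does not matter, and one missing piece does not prevent the other from being found.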

Upvotes: 1
