Reputation: 27
I am new to Python and I need to work on an existing Python script. Can someone explain the meaning of the following statement?
pgre = re.compile("([^T]+)T([^\.]+)\.[^\s]+\s(\d+\.\d+):\s\[.+\]\s+(\d+)K->(\d+)K\((\d+)K\),\s(\d+\.\d+)\ssecs\]")
Upvotes: 0
Views: 86
Reputation: 42617
You need to consult the references for the exact meanings of each part of that regular expression, but its basic purpose is to parse the GC logging. Each parenthesized part of the expression, (), is a group that matches a useful part of the GC line.
For example, the start of the regex, ([^T]+)T, matches everything up to the first "T", and the grouped part returns the text before the "T", i.e. the date "2013-08-28". The content of the group, [^T]+, means "at least one character that is not a T".
Patterns in square brackets [] are character classes - consult the references in the comments above for details. Note that your input text contains literal square brackets, so the pattern handles those with the \[ escape sequence - see below.
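To see the groups in action, here is a minimal sketch that applies the pattern to a made-up GC log line. The sample line and the variable names given to the seven groups are assumptions for illustration; the real log format and the meaning of each number should be checked against your own logs.

```python
import re

# Same pattern as in the question, written as a raw string (r"...") so the
# backslashes reach the regex engine unchanged.
pgre = re.compile(
    r"([^T]+)T([^\.]+)\.[^\s]+\s(\d+\.\d+):\s\[.+\]\s+"
    r"(\d+)K->(\d+)K\((\d+)K\),\s(\d+\.\d+)\ssecs\]"
)

# Hypothetical GC log line, invented for this example.
sample = (
    "2013-08-28T14:23:45.123+0000: 1.234: "
    "[GC [PSYoungGen: 1234K->123K(4567K)] 2345K->345K(6789K), 0.0123456 secs]"
)

match = pgre.match(sample)
if match:
    # The names below are guesses at what each group represents.
    date, clock_time, uptime, before_kb, after_kb, total_kb, pause_secs = match.groups()
    print(date, clock_time, uptime, before_kb, after_kb, total_kb, pause_secs)
else:
    print("line did not match")
```

Every group comes back as a string, so the numeric ones need int() or float() before you do arithmetic with them.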
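I think you can simplify ([^T]+)T to just (.+)T, incidentally, although .+ is greedy, so the non-greedy (.+?)T is the safer equivalent if a later "T" can appear on the line.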
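A quick way to compare the negated class with the greedy and non-greedy dot forms, taken on their own, is to try them on a string that contains a second "T". The text below is made up for the comparison and is not from the original log.

```python
import re

# Made-up text with a second "T" after the timestamp separator.
text = "2013-08-28T14:23:45 Tenured"

# Negated character class: stops at the first "T".
negated = re.match(r"([^T]+)T", text).group(1)

# Greedy dot: runs to the end, then backtracks to the LAST "T".
greedy = re.match(r"(.+)T", text).group(1)

# Non-greedy dot: expands only as far as the first "T".
lazy = re.match(r"(.+?)T", text).group(1)

print(repr(negated))
print(repr(greedy))
print(repr(lazy))
```

Inside the full pattern the rest of the expression constrains where the "T" can be, so the difference may never show up on real GC lines, but the negated class makes the intent explicit.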
Other useful sub-patterns:
\s matches whitespace
\d matches numeric digits
\. \( and \[ match literal periods, parentheses, and square brackets, respectively, rather than interpreting them as special regex characters
Upvotes: 2