Reputation: 2402
I have constructed this regex to pull the version number of a package from an automatically generated lockfile:
\[\[package\]\]\s(?:[a-z-]+ = \"?.*\"?\s)*name = \"NAME\"\s(?:[a-z-]+ = \"?.*\"?\s)*version = \"([ab0-9.]+)\"
The subject file looks like this (shortened, there are many such blocks):
[[package]]
category = "main"
description = "Some. , - description"
name = "django"
optional = false
python-versions = ">=3.5"
version = "2.2.17"
[package.dependencies]
django = ">=1.8.0"
redis = ">=3"
rq = ">=0.13,<1.0"
[package.extras]
Sentry = ["raven (>=6.1.0)"]
testing = ["mock (>=2.0.0)"]
This seems to work well. The problem is that sometimes, the ordering of the two important keys might be different, e.g.:
[[package]]
category = "main"
description = "Some. , - description"
version = "2.2.17"
optional = false
name = "django"
python-versions = ">=3.5"
This causes the regex to fail.
I would like to find a block (starting with [[package]] and ending with a newline) that contains the string ^name = \"NAME\", and within that block find the value of the version key, regardless of the order in which the two keys appear.
I have done some reading on lookaheads/lookbehinds, but I could not work out how to apply them here.
Upvotes: 1
Views: 67
Reputation: 785156
You can use a lookahead assertion to check the name of the package and capture the version in the main regex:
\[\[package]]\s(?=(?:[a-z-]+ = "?[^"]*"?\s)*?name = "django"\s)(?:[a-z-]+ = "?[^"]*"?\s)*?version = "([ab0-9.]+)"
RegEx Details:
\[\[package]]\s : Match [[package]] followed by a whitespace character
(?=(?:[a-z-]+ = "?[^"]*"?\s)*?name = "django"\s) : Positive lookahead to assert that we have a property name = "django" somewhere within this package
(?:[a-z-]+ = "?[^"]*"?\s)*? : Match 0 or more property lines
version = "([ab0-9.]+)" : Match the version property and capture the version number in capture group #1
Upvotes: 2