Reputation: 1356
I have an INI-like file containing a list of <key> = <value> items.
What complicates things is that some values span multiple lines and can contain the =
character (for example, a TLS private key).
Example:
groupid = foo
location = westus
randomkey = fbae3700c34cb06c
resourcename = example4-resourcegroup
tls_private_key = -----BEGIN RSA PRIVATE KEY-----
//stuff
-----END RSA PRIVATE KEY-----
foo = 123
faa = 223
The pattern I have so far is /^(.*?)\ \=\ (.*[^=]*)$/m
It works for all keys except tls_private_key: because that value contains =,
the pattern captures only part of it.
Any suggestions?
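The partial capture can be reproduced with a short Python sketch. Python itself and the Base64-like line standing in for //stuff are assumptions made purely for illustration; the pattern is the one quoted above, with the m flag written as re.MULTILINE.

```python
import re

# Sample input from the question. The key body is a made-up placeholder
# (not real key material) that ends in "=" padding.
SAMPLE = (
    "groupid = foo\n"
    "location = westus\n"
    "randomkey = fbae3700c34cb06c\n"
    "resourcename = example4-resourcegroup\n"
    "tls_private_key = -----BEGIN RSA PRIVATE KEY-----\n"
    "MIIEow/placeholder+body==\n"
    "-----END RSA PRIVATE KEY-----\n"
    "foo = 123\n"
    "faa = 223\n"
)

# The pattern from the question, unchanged.
PATTERN = re.compile(r"^(.*?)\ \=\ (.*[^=]*)$", re.MULTILINE)

pairs = {m.group(1): m.group(2) for m in PATTERN.finditer(SAMPLE)}

# [^=]* cannot step over the "=" padding, so the engine backtracks to the
# end of the first line of the value and stops there.
print(repr(pairs["tls_private_key"]))
```

In this sketch only the BEGIN line ends up in the value, which matches the behaviour described in the question.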
Upvotes: 4
Views: 123
Reputation: 18621
Another variation:
(?sm)^([^=\n]*)\s=\s(.*?)(?=\n[^=\n]*\s=\s|\z)
See proof
Explanation
(?sm)       set flags for this block (^ and $ match the start and end of each line; . matches \n; case-sensitive; whitespace and # match normally)
^           the beginning of a "line"
(           group and capture to \1:
[^=\n]*     any character except '=' and '\n' (newline), 0 or more times, matching as many as possible
)           end of \1
\s          whitespace (\n, \r, \t, \f, and " ")
=           '='
\s          whitespace (\n, \r, \t, \f, and " ")
(           group and capture to \2:
.*?         any character, 0 or more times, matching as few as possible
)           end of \2
(?=         look ahead to see if there is:
\n          '\n' (newline)
[^=\n]*     any character except '=' and '\n' (newline), 0 or more times, matching as many as possible
\s          whitespace (\n, \r, \t, \f, and " ")
=           '='
\s          whitespace (\n, \r, \t, \f, and " ")
|           OR
\z          the end of the string
)           end of look-ahead
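As a rough check, the pattern can be tried in Python. This is only a sketch under a few assumptions: Python is not necessarily the asker's language, \z is written as \Z (the end-of-string anchor in Python's re module), the function name parse_pairs is hypothetical, and the key body is a made-up placeholder.

```python
import re

# Sample input from the question, with a placeholder key body that
# contains "=" padding (not real key material).
SAMPLE = (
    "groupid = foo\n"
    "location = westus\n"
    "randomkey = fbae3700c34cb06c\n"
    "resourcename = example4-resourcegroup\n"
    "tls_private_key = -----BEGIN RSA PRIVATE KEY-----\n"
    "MIIEow/placeholder+body==\n"
    "-----END RSA PRIVATE KEY-----\n"
    "foo = 123\n"
    "faa = 223\n"
)

# Same pattern as above, with \z replaced by Python's \Z.
PATTERN = re.compile(r"(?sm)^([^=\n]*)\s=\s(.*?)(?=\n[^=\n]*\s=\s|\Z)")


def parse_pairs(text):
    """Return a dict of key -> value (hypothetical helper name)."""
    # strip() drops the trailing newline that the last value picks up
    # when the input ends with a line break.
    return {m.group(1): m.group(2).strip() for m in PATTERN.finditer(text)}


pairs = parse_pairs(SAMPLE)
print(pairs["tls_private_key"])
```

The lookahead only ends a value at a line of the form "key = value" with whitespace on both sides of =, so the padded Base64 line stays inside the tls_private_key value.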
Upvotes: 0
Reputation: 163457
You might match all the values over multiple lines, asserting that the next line does not contain a space, an equals sign, and a space:
^(.*?) = (.*(?:\R(?!.*? = ).*)*)
If the key cannot contain spaces:
^([^\s=]+)\h+=\h+(.*(?:\R(?![^\s=]+\h+=\h+).*)*)$
Explanation
^                              Start of string (or of a line, with the m flag used in the question)
([^\s=]+)                      Capture group 1, match 1+ chars other than = or a whitespace char
\h+=\h+                        Match an = between one or more horizontal whitespace chars
(                              Capture group 2
.*                             Match the whole line
(?:\R(?![^\s=]+\h+=\h+).*)*    Repeat all following lines that do not start with a key followed by space = space
)                              Close capture group 2
$                              End of string (or of a line, with the m flag)
Upvotes: 5
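A possible Python rendering of the second pattern, as a sketch only. Python's re module has no \h or \R, so this version substitutes [ \t] and \r?\n for them; that substitution, the function name parse_pairs, and the placeholder key body are all assumptions that are not part of the answer.

```python
import re

# Sample input from the question, with a placeholder key body that
# contains "=" padding (not real key material).
SAMPLE = (
    "groupid = foo\n"
    "location = westus\n"
    "randomkey = fbae3700c34cb06c\n"
    "resourcename = example4-resourcegroup\n"
    "tls_private_key = -----BEGIN RSA PRIVATE KEY-----\n"
    "MIIEow/placeholder+body==\n"
    "-----END RSA PRIVATE KEY-----\n"
    "foo = 123\n"
    "faa = 223\n"
)

# \h -> [ \t] and \R -> \r?\n are stand-ins for the PCRE escapes.
PATTERN = re.compile(
    r"^([^\s=]+)[ \t]+=[ \t]+"                   # key, then " = "
    r"(.*(?:\r?\n(?![^\s=]+[ \t]+=[ \t]+).*)*)$",  # first line + continuations
    re.MULTILINE,
)


def parse_pairs(text):
    """Return a dict of key -> value (hypothetical helper name)."""
    # strip() removes the trailing line break that the last value absorbs
    # when the input ends with a newline.
    return {m.group(1): m.group(2).strip() for m in PATTERN.finditer(text)}


pairs = parse_pairs(SAMPLE)
print(pairs["tls_private_key"])
```

Because the negative lookahead is checked at the start of each following line, continuation lines are consumed until one looks like "key = value".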
Reputation: 785481
You may use this regex with a lookahead:
^\h*(?<key>[\w-]+)\h*=\h*(?<value>[\s\S]*?)(?=\R\h*[\w-]+\h*=|\z)
RegEx Details:
^ : Start of a line
\h* : 0 or more horizontal whitespace characters
(?<key>[\w-]+) : Group key that matches 1+ word characters or hyphens
\h* : 0 or more horizontal whitespace characters
= : Match a =
\h* : 0 or more horizontal whitespace characters
(?<value>[\s\S]*?) : Group value that matches 0 or more of any characters including newlines
(?=\R\h*[\w-]+\h*=|\z) : Lookahead to assert that at the next position we have a line break followed by a key and =, or the end of input
Upvotes: 4
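The named-group pattern can be sketched in Python as follows. Several things here are assumptions rather than part of the answer: Python spells named groups (?P<name>...), and since its re module lacks \h and \R the sketch uses [ \t] and \r?\n, with \Z in place of \z. The function name parse_pairs and the placeholder key body are hypothetical.

```python
import re

# Sample input from the question, with a placeholder key body that
# contains "=" padding (not real key material).
SAMPLE = (
    "groupid = foo\n"
    "location = westus\n"
    "randomkey = fbae3700c34cb06c\n"
    "resourcename = example4-resourcegroup\n"
    "tls_private_key = -----BEGIN RSA PRIVATE KEY-----\n"
    "MIIEow/placeholder+body==\n"
    "-----END RSA PRIVATE KEY-----\n"
    "foo = 123\n"
    "faa = 223\n"
)

PATTERN = re.compile(
    r"^[ \t]*(?P<key>[\w-]+)[ \t]*=[ \t]*"   # key and "=" with optional spaces
    r"(?P<value>[\s\S]*?)"                   # lazily take everything ...
    r"(?=\r?\n[ \t]*[\w-]+[ \t]*=|\Z)",      # ... up to the next key or the end
    re.MULTILINE,
)


def parse_pairs(text):
    """Return a dict of key -> value (hypothetical helper name)."""
    # strip() removes the trailing newline picked up by the last value.
    return {m["key"]: m["value"].strip() for m in PATTERN.finditer(text)}


pairs = parse_pairs(SAMPLE)
print(pairs["tls_private_key"])
```

One caveat with this variant: a continuation line made up only of word characters or hyphens directly followed by = (for instance a short, padded final Base64 line such as abcd==) looks like a new key to the lookahead, so the value would be cut off there.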