Reputation: 1521
I have the following string
s = "ΔG'° = (-19.9 +/- 0.4) kilojoule / mole"
I'd like to generate a dictionary like the following
d = {"mean": -19.9, "sd": 0.4, "units": "kilojoule / mole"}
If the string were just -19.9 +/- 0.4, I could do s.split("+/-"). But in the given format, I would have to split several times, once for each delimiter.
Is there an easy way of doing this?
Upvotes: 2
Views: 48
Reputation: 626738
You can use
r'=[^\d-]*(?P<mean>-?\d*\.?\d+)\s*\+/-\s*(?P<sd>\d*\.?\d+)\W+(?P<units>.+)'
See the regex demo. Details:
=  - a = sign
[^\d-]*  - zero or more chars other than a digit and -
(?P<mean>-?\d*\.?\d+)  - Group "mean": an optional -, zero or more digits, an optional . and then one or more digits
\s*\+/-\s*  - a +/- substring enclosed with zero or more whitespaces
(?P<sd>\d*\.?\d+)  - Group "sd": zero or more digits, an optional . and then one or more digits
\W+  - one or more non-word chars
(?P<units>.+)  - Group "units": the rest of the string.

See the Python demo:
import re
rx = r'=[^\d-]*(?P<mean>-?\d*\.?\d+)\s*\+/-\s*(?P<sd>\d*\.?\d+)\W+(?P<units>.+)'
text = r"ΔG'° = (-19.9 +/- 0.4) kilojoule / mole"
m = re.search(rx, text)
if m:
    print(m.groupdict())
# => {'mean': '-19.9', 'sd': '0.4', 'units': 'kilojoule / mole'}
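
Note that groupdict() returns every value as a string, while the dictionary in the question has numeric values for mean and sd. If you need floats, one possible sketch is to convert the two numeric groups after matching. The helper name parse_measurement is made up for illustration, and returning None on a failed match is just one way to handle that case:

```python
import re

# Same pattern as above, compiled once for reuse.
RX = re.compile(
    r'=[^\d-]*(?P<mean>-?\d*\.?\d+)\s*\+/-\s*(?P<sd>\d*\.?\d+)\W+(?P<units>.+)'
)

def parse_measurement(text):
    """Hypothetical helper: return the parsed dict, or None if there is no match."""
    m = RX.search(text)
    if m is None:
        return None
    d = m.groupdict()
    # The regex captures strings; convert the numeric groups to floats.
    d["mean"] = float(d["mean"])
    d["sd"] = float(d["sd"])
    # Drop any trailing whitespace captured by the greedy .+ in "units".
    d["units"] = d["units"].strip()
    return d

print(parse_measurement("ΔG'° = (-19.9 +/- 0.4) kilojoule / mole"))
# => {'mean': -19.9, 'sd': 0.4, 'units': 'kilojoule / mole'}
```

As written, the pattern does not match exponents (such as 1e-3) or a leading + sign on the mean, so extend it if your data contains those.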
Upvotes: 2