Reputation: 21
−3.7% [95% CI, −10.2% to 2.7%]; P = .26)
Above is an example of a string I'm working with. I want to extract every number in it along with its qualifiers, i.e. the minus sign if it's negative, the % sign and the decimal point.
This string can change but the pattern for this string is consistent:
Primary Measure Value % [Confidence Interval% CI, CI Lower End% to CI Higher End%]; P= P Value)
Currently my code pulls the numbers using fixed index offsets relative to the substrings "CI" and "P=". This isn't reliable: another string may have a different number of digits and may or may not contain negative numbers, so a hard-coded index won't always pull the correct values.
Example of different string:
10.7% [95% CI, 1.2% to 12.7%]; P = .1)
I want to assign the numeric values to the variables below, keeping any minus sign, % sign and decimal point, regardless of how many digits each number has.
Example of string and output desired:
string_1 = "10.7% [95% CI, 1.2% to 12.3%]; P = .1)"
Desired Output
Primary Measure Value is 10.7%
CI Lower End is 1.2%
CI Higher End is 12.3%
P Value is .1
string_2 = "−3.7% [95% CI, −10.2% to 2.7%]; P = .26)"
Desired Output
Primary Measure Value is -3.7%
CI Lower End is -10.2%
CI Higher End is 2.7%
P Value is .26
Is there a dynamic way to get these values, given that all strings follow the same pattern?
Upvotes: 1
Views: 129
Reputation: 1376
Edit: here's a better solution that does it with a single regex call:
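import re
s = "−3.7% [95% CI, −10.2% to 2.7%]; P= .26)"
# Match every run of digits, dots, % signs and the Unicode minus sign (U+2212).
# Note: an ASCII hyphen "-" is not in the character class and would not be matched.
values = re.findall(r'[−.%\d]+', s)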
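This uses Python's re module to extract the values. The result is a list of strings: ['−3.7%', '95%', '−10.2%', '2.7%', '.26']. Note that the confidence level (95%) is captured as well, at index 1.
Then you can do something like print('Primary Measure Value is', values[0]) and print('CI Lower End is', values[2]), and likewise values[3] for CI Higher End and values[4] for P Value.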
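If you'd rather not depend on list positions (the findall result also contains the confidence level, so the indices are easy to get wrong), a stricter alternative is to spell out the whole pattern with named groups. This is only a sketch: the names NUM, PATTERN and parse_stats are made up for illustration, and it assumes the P value is always written with "=" (a string like "P < .001" would not match). It accepts either the Unicode minus sign (U+2212) or an ASCII hyphen and normalises to the hyphen, as in the desired output.

```python
import re

# One number: optional minus (Unicode U+2212 or ASCII hyphen),
# optional leading digits, optional decimal point, at least one digit.
NUM = r"[\u2212-]?\d*\.?\d+"

# Mirrors: Primary% [Level% CI, Lower% to Upper%]; P = PValue
PATTERN = re.compile(
    rf"(?P<primary>{NUM}%)\s*"
    rf"\[\s*(?P<level>{NUM}%)\s*CI,\s*"
    rf"(?P<lower>{NUM}%)\s*to\s*(?P<upper>{NUM}%)\s*\]"
    rf";\s*P\s*=\s*(?P<p>{NUM})"
)

def parse_stats(text):
    """Return primary, lower, upper and p as strings, or None if no match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Drop the confidence level and replace the Unicode minus with "-".
    return {
        name: value.replace("\u2212", "-")
        for name, value in match.groupdict().items()
        if name != "level"
    }

result = parse_stats("−3.7% [95% CI, −10.2% to 2.7%]; P = .26)")
print("Primary Measure Value is", result["primary"])
print("CI Lower End is", result["lower"])
print("CI Higher End is", result["upper"])
print("P Value is", result["p"])
```

Because the pattern is anchored on the literal "CI,", "to" and "P =" markers, a string that deviates from the format returns None instead of silently yielding shifted values.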
Upvotes: 1